Project / Research Data Collections Project
Title / Interview template
Version / 1.0
Date effective / 5 November 2010
Last updated / 5 November 2010
Scope note / Contains the questions asked in the research data interview along with usage guidelines. The first page contains a table of the fields which may be pre-populated by information from research administrative systems. In addition it has a helpful ‘setting the scene’ introduction.
Authorship / Research Data Collections Project Team
Contact / URL: http://www.researchdata.monash.edu/collections-project/
Email:
Licensing / This document is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
Project acknowledgement / This project is supported by the Australian National Data Service (ANDS) through the National Collaborative Research Infrastructure Strategy Program and by Monash University Library.


Research Data Collections Project Interview Template

Reference number (internal)
Interview location
Time/Date:
Name of interview subjects / ·  Title + Given names + PCI surname
Campus / ·  Main campus
School / ·  Department
Faculty / ·  Faculty
Project title / ·  Project title
Category (of research) / ·  Category (of research)
Co-creators / ·  Investigators
RM project No / ·  RM project No
Grantor code / ·  Grantor code
Fund source / ·  Fund source
FoR code (6 digits) / ·  FoR code assignments
Project funding last received / ·  Year received
Final report due / ·  Final report due
Ethics
NLA identifier
Creator on ARROW
Project Abstract


Setting the scene

The goal of our project is to showcase Monash University research datasets and data collections through a new website called Research Data Australia.

We can look at this online: http://services.ands.org.au/home/orca/rda/

Monash University Library is participating in this project because evidence is emerging that research data that can be discovered, cited and re-used raises the profile of researchers and can create new collaboration opportunities. We are also hoping this process will start preparing researchers for changes to the funding guidelines from ARC and NHMRC and other funding agencies, which increasingly encourage grant awardees to assess the future value of the data they are generating and consider depositing research data into trusted repositories and data stores.

You have a number of projects which have received public funding and we were hoping your research may have generated one or more data sets or data collections that could be showcased. We are hoping to gather some information from you. If you are willing, one possibility is to use this information to create a catalogue record within the Monash University ARROW Repository. Once we have done that, the collection is automatically registered with Research Data Australia.

Before we get started, we want to reassure you that data collections that are registered with Research Data Australia do NOT have to be publicly accessible via the Internet.

We understand that many researchers have concerns about providing access to their data. We also recognise that legal, ethical, and commercial restrictions apply to many data collections, and that access could require a password, direct contact with the researcher, and in some cases even applying to the ethics committee or the governing group for the project.

Later in the interview we will talk to you about some different levels of access and will consider legal, ethical and commercial restrictions. If you ARE willing and able to share your data, some of the information that we gather today can be used to assess the suitability of your data for deposit into trusted Monash University repositories or data stores.

The interview should take about 45 minutes. Do you have any questions before we get started?


Questions are provided with usage guidelines.

1.  Overview: what data was involved, who collected or created it, and how it was obtained and used

1.1 Can you tell us briefly about the goals of a research project which would have a data set or data collection?
Provides basis for abstract about the data collection / Description
1.2.1 What types of data did the project involve?
[If multiple types and/or formats, establish whether there is more than one collection or sub-collection before proceeding. If there are multiple collections or sub-collections, you may need to go through some of the questions more than once.]
Examples may include survey, questionnaire, interview transcript, bibliography, index, annotations, statistical data and analyses, sound recordings, video recordings, measurements, images, fieldwork notes. / Type
1.2.2 What formats were used for the data in the project?
Examples may include doc, xls, xlsx, jpeg, SPSS, html, txt, csv etc. If the file format is proprietary, indicate the program and whether the software is readily available. / File Formats
1.3 Is the data referred to by a formal name? Title or alternative title e.g. subtitle
[e.g. Interferome: The Database of IFN Regulated Genes; Australia and New Zealand Jewish Population Survey; Restoration Theatre Song Archive]
Record a specific title rather than a generic title. Spell out acronyms (also need full title) and record short versions of long titles. / Title
Yes:
·  What is the full title?
·  Are there also other ways that the data is referred to? / No:
·  What would be a suitable working title for the data?
Prompt for variant of project name if necessary / Title
Alternative Title
Acronyms
1.4 Who collected or created the data during the project?
Validate names, roles, faculties/departments from prepopulated list of researchers above including any co-researcher details. If an NLA identifier exists, confirm with researcher. e.g. http://nla.gov.au/nla...... / Creators / Contributors
Roles / Affiliations
Related Object – Party
1.5 Was all the data that you used newly created or collected, or did you source data from elsewhere?
Many projects will have a mixture. / Copyright / Ownership
For new data
·  How was the data collected or created?
·  Was any special equipment or software involved? / For existing data
·  Who provided the data?
·  How was the data provided to you?
·  Did you derive new data from the source data?
·  What were the terms and conditions associated with the use of the data and any derived data?
May need to address this later. / Hardware and Software Requirements
Source Data
Copyright / Ownership
Potential Re-use
1.6 How was the data processed or analysed for this project?
Was the quantitative data entered into Excel/SPSS and analysed? Were interviewees de-identified? / Hardware and Software Requirements
File Formats
File Dependencies
1.7 Was the process of obtaining and using the data reviewed by a Human Ethics committee or other body?
Simply note for now. This is addressed in more detail later in the interview under Access Conditions. / Ethics – Privacy / Confidentiality / Consent
1.8 Is the data, or the process you used to obtain or analyse it, commercially sensitive in any way?
Simply note for now. This is addressed in more detail later in the interview under Access Conditions. / Commercial-in-Confidence

2.  Timeline. Only apply this to data that has been collected or created, not third party data that has only been analysed.

2.1 When did the data start being collected? / Creation Date
Start Date
2.2 Are additions or amendments to the collection still being made? / State of Completion
If active:
·  When was it last added to or amended?
·  How frequently is it being updated?
·  When will it be complete? / If static:
·  When was the data last updated? / State of Completion
Date Last Updated
End Date
Updating Frequency
2.3 How long does the data need to be retained for? Standard retention periods are:
  5 years - Standard retention period
  7 years - Psychological testing or intervention with adults
  15 years - Medical research involving clinical trials
  25 years after date of birth of participants - Psychological testing or intervention with children
If researcher is not sure, may need to prompt using standard retention periods above. / Retention Plan
2.4 Will the data be destroyed at the end of the retention period, or is it more likely to be retained because it has longer term value?
Prompt, if necessary, with common scenarios, such as controversial; wide public interest; uses an innovative technique for the first time; shifts the paradigm in this field of inquiry; costly or impossible to reproduce. / Disposal Plan
If data will be destroyed:
·  Who will be responsible for making or reviewing the destruction decision? / If the data has longer term value:
·  Who will be responsible for making decisions about the data in the long term?
·  How might the data be used in the future, by you / the research team / others? / Disposal Plan
Contact Person / Asset Manager
Affiliation

3.  Scholarly content: what the data covers

3.1 Does the data relate to a particular time period?
Dates/times or textual equivalent. / Temporal Coverage
3.2 Does the data relate to a particular place?
Coordinates or textual equivalent. / Geospatial Coverage
3.3 Do you know the Field of Research Codes that relate to the data?
Validate from the list of FoR codes above. / Field of Research
3.4 Have you catalogued, coded, ‘tagged’ or otherwise described the content of the collection in any way?
If Yes:
Can you tell us more about how you have done this?
Was an existing list of discipline related keywords used?
Has it been described according to a standard? / If No:
Can you think of words or phrases that are different from the Field of Research Codes that might help someone who was searching for your data?
For example, keywords that are specific to your discipline?
Can you suggest keywords? Is there a thesaurus available? Common examples are: APAIS, MeSH, ATED. / Keywords / Subjects
Metadata Available
Metadata Specification
3.5 Is there any other information or documentation you can provide about the content of the data collection?
Examples may include surveys, notebooks, manuals, spreadsheets etc. How do you record analysis? Is the accompanying documentation print or electronic? / Documentation

4.  Physical characteristics: extent of the collection and how things are organised

4.1 How many items are there?
For example 100 interview sound recordings with transcripts or 1500 digital images of tissue samples. / Number of Objects
4.2 What would a typical item be in terms of size? What would the largest item be in terms of size?
Record approximation of individual file sizes. / Object Size
4.3 What is the overall size of the collection?
Where possible record the amount of disk space. / Collection Size
4.4 What approach did you take to organising or structuring the collection?
Are the files in a folder structure that must be retained? Do they all have unique file names? Is it organised according to a standard? Are the files in a database or other stand-alone application? Are these files digital or in paper form? Cross reference with 1.2.2 above. / Collection Structure
Metadata Available
Metadata Specifications
4.5 How have you approached naming or numbering the items within the collection?
·  e.g. each item has an automatic number in the database
·  e.g. each file has a name
If each file has a name, has this name been constructed using any type of local system or type of standard? / Collection Structure
Metadata Available
Metadata Specifications
4.6 How important is the way the collection is organised or structured, in relation to interpreting the data?
Do we need to leave it as you have organised it? Is any user likely to use individual files without needing other files?
For example a web site with data embedded. / File Dependencies
4.7 Is there any other documentation available about the way the collection is organised and structured?
If there is, can a copy be provided? / Documentation

5.  Where the data is located

5.1 Where is the master copy of the data located now?
Is the data held locally or on a secure university server? / Physical Address
Electronic Address
Secure Storage
Backup / Recovery
Identifiers / Reference Numbers
5.2 Is the master copy secure in its current location? / Secure Storage
5.3 Are back-up copies stored in another location? / Backup / Recovery
5.4 Who has access to the data in its current location? / Access Conditions
5.5 Has a copy of the data itself ever been published, in a journal or a data archive or repository?
This is about the data being published in its own right. It is not about published articles that refer to the data. (See 6.2 below) / Access Conditions
If Yes
·  Do you have a reference number or citation for the publication?
·  Any other details about the publication available?
If data is already accessible in a durable published form, we might just link to it rather than ingest it. This might make some later questions redundant. / If No
Have you considered publishing the data in a repository or data archive? / Identifiers / Reference Numbers

6.  How this data relates to other materials

6.1 What other data collections and projects, if any, are related to this data collection?
Record DOIs (or other forms of permanent identifier). If not available, record a textual equivalent.
Have you used the same data for more than one project? If yes, please record project details. / Related Information – Data Collections
6.2 What publications, if any, are associated with this data collection?
This is about publications that refer to the data or arose from analysis of this data.
An example would be a journal article that includes findings that refer to the raw data. / Related Information – Publications
6.3 Can you suggest other people, projects, organisations or information associated with the data that we have not already covered?
Information gathered will form the basis of Parties and Activities records. Record names of any other projects associated with the data. / Related Information – Parties and Activities

7.  Providing access to the data

For this project, data does not have to be publicly available for a collections record to be registered on Research Data Australia. It is possible to link to data that is access-controlled.