Postgraduate Data Management Plan

1.  Overview

1.1 Postgraduate Researcher:
1.2 Project title:
1.3 Project start and end dates:
1.4 Project context:
[The text in grey gives examples of possible answers — use or replace it as needed]
The project aims to investigate…(2-3 sentences)

2.  Defining your data

2.1 Where do your data come from?
I record interviews with my participants using a digital audio recorder, then transcribe them into text.
I test my catalyst under a number of conditions, then submit samples of the products to analysis facilities.
I generate data using model code that I’ve written, then process the data in various ways to produce visualisations.
I take high-resolution digital photographs of artefacts recovered in the field, and sometimes send samples off for analysis.
I combine existing data from a number of sources [give examples…] and re-analyse them to derive new conclusions.
2.2 What formats are your data in?
Audio recordings are stored as MP3; transcripts are stored in Word documents.
Experimental observations are recorded in a paper notebook, while recordings from instruments are stored in the proprietary format of the instrument.
My code is written in Python, with the input data in .fasta files and the results output as .csv files.
2.3 How often do you get new data?
All of my data will come from two 3-month field trips at the end of my first and second years.
I expect to run two or three experiments each week through my second year and much of my third year – about 100 in total.
I will conduct a series of 30 interviews and 100 surveys every six months for two years.
2.4 How much data do you generate?
Each experiment produces about 50MB of data, so over the course of my PhD I expect this to add up to about 700GB.
I expect my consent forms to fill one ring binder and completed questionnaires to fill two filing cabinet drawers.
Each simulation generates 4TB of temporary data but I will only retain the output file, which is 100GB.
Based on other members of my research group, I expect to fill 5 lab notebooks during my PhD.
2.5 Who owns the data you generate?
According to my studentship agreement, the University owns all data I create, but I retain the copyright on publications based upon my data.

3.  Looking after your data

3.1 Where do you store your data?
My primary copy is on the university’s managed data storage (the X: drive), to which both my supervisor and I have access, and I copy files to my laptop to work on while I’m away from the office.
My participants’ responses are in a locked drawer within my supervisor’s locked office.
A raw copy of my data is retained by the facility where I ran my experiments. I have a second copy on an encrypted hard drive.
Most of my data are stored in my supervisor’s area of the X: drive, but data from my statistical modelling will be stored by my CDT at the University of Bristol.
3.2 How are your data backed up?
Data stored on the university research storage system are backed up by Computing Services. I make sure I copy the latest versions of my working files there each day.
I regularly scan my paper notebook and move the digital copies to my group’s area on the X: drive.
I back up my data to my supervisor’s storage, which is available to the whole of my research group.
I access my backup monthly and open some files to check that they are still usable.
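A check like the one above can be partly automated. The following is a minimal sketch in Python, assuming a working folder and a backup folder that is meant to mirror it. The function names and folder layout are hypothetical and are not part of any university service. It compares SHA-256 checksums and lists files that are missing from the backup or differ from the working copy.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_backup(working_dir: Path, backup_dir: Path) -> dict[str, list[str]]:
    """Compare every file under working_dir with its copy under backup_dir.

    Returns relative paths of files that are missing from the backup
    or whose contents differ from the working copy.
    """
    report = {"missing": [], "different": []}
    for source in sorted(working_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(working_dir)
        copy = backup_dir / relative
        if not copy.is_file():
            report["missing"].append(relative.as_posix())
        elif sha256_of(source) != sha256_of(copy):
            report["different"].append(relative.as_posix())
    return report
```

A matching checksum shows only that the copy is identical to the working file, not that the file still opens in its software, so opening a few files by hand remains worthwhile.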
3.3 How do you structure and name your folders and files?
I use the structure <experimentdate>/<reagent>-<replicate-number>.
I use a folder for each project phase, and within each of those a folder for each interview.
Each filename starts with the date on which the data were collected in YYYYMMDD format.
I use folder names to organise the data, and then the equipment/model automatically numbers all files created within that folder.
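Naming conventions like these are easier to keep consistent when a short script generates the names. The following is a minimal sketch in Python, assuming a dated folder followed by a reagent and replicate number. The function name, the lower-casing, the underscore substitution, the two-digit replicate number and the reagent name are illustrative assumptions, not requirements of this plan.

```python
from datetime import date
from pathlib import PurePosixPath


def experiment_path(collected: date, reagent: str, replicate: int,
                    suffix: str = ".csv") -> PurePosixPath:
    """Build a dated folder and a reagent-replicate file name."""
    # A YYYYMMDD folder name sorts chronologically in any file browser.
    folder = collected.strftime("%Y%m%d")
    # Assumption: lower-case names with underscores instead of spaces.
    safe_reagent = reagent.strip().lower().replace(" ", "_")
    # Assumption: zero-padded replicate numbers so that 02 sorts before 10.
    return PurePosixPath(folder) / f"{safe_reagent}-{replicate:02d}{suffix}"


print(experiment_path(date(2024, 3, 7), "Pd acetate", 3))
# prints 20240307/pd_acetate-03.csv
```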
3.4 How do you manage different versions of your files?
As I survey new cohorts, data are appended to the dataset and saved as a new file.
There is only ever one version of each data file — new experiments create new data, which are stored in a new set of files.
Each time I run a new version of my model, intermediate files are written over, but the final results are saved as a new file.
I use a Subversion repository to manage the code that I write.
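Where results are saved as a new file each time, a small helper can choose the next free version number so that earlier results are never overwritten. The following is a minimal sketch in Python. The "_v001" suffix pattern and the function name are assumptions made for illustration.

```python
import re
from pathlib import Path


def next_version_path(folder: Path, stem: str, suffix: str = ".csv") -> Path:
    """Return the next unused path of the form <stem>_vNNN<suffix> in folder."""
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+){re.escape(suffix)}$")
    # Collect the version numbers of files that already follow the pattern.
    versions = [
        int(match.group(1))
        for item in folder.iterdir()
        if (match := pattern.match(item.name))
    ]
    # Start at 1 when no earlier version exists.
    return folder / f"{stem}_v{max(versions, default=0) + 1:03d}{suffix}"
```

For example, in a folder that already contains results_v001.csv and results_v002.csv, next_version_path(folder, "results") returns the path to results_v003.csv.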
3.5 What additional information is required to understand the data?
I keep additional notes about interviews in a Word document with the audio recordings and transcripts.
Abbreviations used for column headings are kept in a separate ‘readme’ text document.
The content of digital photographs is recorded in the file name.
The equipment I use embeds information about the settings in the metadata for the files.
I use comments to document my code as I write it.

4.  Archiving your data

4.1 What data should be kept or destroyed after the end of your project?
I will keep all of my data, both raw and processed.
Only the simulation code and input parameters will be kept; the intermediate files will be destroyed.
I will keep anonymised transcripts of all interviews, but will destroy the original audio recordings.
I purchased a licence to use a database for the duration of my studies which requires that I destroy the files at the end of my PhD.
4.2 For how long should data be kept after the end of your project?
My data will be kept only until the end of my PhD.
My funder requires that my data are kept for 10 years after the end of the project.
4.3 Where will the data you keep be archived?
As required by my ESRC funding, I will submit my final data to the UK Data Service.
I will deposit my crystal structures in the eCrystals archive.
My data will be published as supplementary information to support a publication.
I will publish my data in the Figshare data archive.
My commercial sponsor will retain a copy of all of my project data.
4.4 When will data be moved into the archive?
I will archive the data when I submit my thesis.
I will archive a copy of data supporting my findings when a paper based upon them is accepted for publication.
My commercial sponsor will handle this.
4.5 Who is responsible for moving data to the archive and maintaining them?
I am responsible for depositing my data in an archive and the archive service will maintain them.
I will pass my data to my supervisor at the end of my PhD and they will be responsible for archiving them.

5.  Sharing your data

5.1 Who else has a right to see or use these data during the project?
Only my supervisor should have access to my data during the project.
Others in my research group and my supervisor’s industrial partners will need to see some of my data.
5.2 What data should or shouldn’t be shared openly and why?
All my data are covered by a confidentiality agreement with my industrial sponsor and cannot be shared.
Some of my data identify individual patients and must be anonymised before sharing.
Not all of my participants gave informed consent for their anonymised data to be shared, so I will exclude their results from the final dataset that I publish.
All of my data may be shared openly at the end of my project when my research findings are published.
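Excluding participants who did not consent to sharing can be done as an explicit, repeatable step before a dataset is published. The following is a minimal sketch in Python, assuming each record is a dictionary with a consent flag. The field name "consent_to_share" and the participant identifiers are hypothetical. A record with no recorded consent is treated as not shareable.

```python
def shareable_records(records, consent_field="consent_to_share"):
    """Keep only records whose consent field is explicitly True."""
    return [record for record in records if record.get(consent_field) is True]


# Hypothetical example records, not real participant data.
participants = [
    {"id": "P01", "consent_to_share": True},
    {"id": "P02", "consent_to_share": False},
    {"id": "P03"},  # consent not recorded, so excluded
]
print([record["id"] for record in shareable_records(participants)])
# prints ['P01']
```

This step only filters by consent. It does not anonymise the remaining records, which still needs to be done separately.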
5.3 Who should have access to the final dataset and under what conditions?
Data will be embargoed for 12 months to enable patent applications to be filed.
Bona fide researchers will be granted access to the data upon request.
My code will be made openly available under the GNU General Public License (GPL).
My supervisor will arrange to share our bacterial strains with other researchers subject to a materials transfer agreement.
5.4 How will you share your final dataset?
Users will be able to download my data from the Figshare repository where they are archived.
The data can be downloaded from the UK Data Service, once users have registered.
Individual requests to access the data will be handled by my supervisor and agreed by my supervisor’s industrial collaborator. The data will only be shared if users sign a non-disclosure agreement.
It will not be possible to share any of my data for the reasons described above.

6.  Implementing your plan

6.1 Who is responsible for making sure this plan is followed?
I will take responsibility for carrying out the actions required by this plan and report them to my supervisor as appropriate.
6.2 How often will this plan be reviewed and updated?
My supervisor and I will review this plan every 6 months and will agree updates if necessary.
6.3 What actions have you identified from the rest of this plan?
Ask for access to my supervisor’s research storage space.
Set up a backup system and periodically test that I can restore from my backup.
Learn how to anonymise my data so that they can be shared.
Ensure that I request informed consent from my participants for sharing their data.
Scan the important results in my notebook at the end of each week.
6.4 What policies are relevant to your project?
This project is covered by the University of Bath Research Data Policy and the EPSRC Policy Framework on Research Data.
The project is sponsored by an industrial partner and is covered by a collaboration agreement and my studentship agreement.
6.5 What further information do you need to carry out these actions?
Where can you find this information?
Who might you be able to ask?

© University of Bath