Preserving research data helps to keep it accessible and usable into the future, despite changes in technology and possible hardware failures. Preservation planning should be a key element of your research project.
Well-managed data ensures your research findings can be replicated, and your conclusions backed up with evidence. Long term, preserving research data saves time and money by preventing duplication of research, and improves the quality of future research by providing new opportunities for existing data.
Preserve not just the datasets themselves but also any related files that give the datasets context, for example email discussions, methods of analysis and research parameters.
Data preservation is critically important because the cost of acquiring, processing and analysing data in the first place can be very high. Institutional and funding body policies may also require that data be preserved for certain periods:
- the Deakin University Research Conduct Policy (Clause 14-28) outlines the required minimum periods for retention
- the Australian Code for the Responsible Conduct of Research Section 2 specifies the minimum period for retention of research data, with actual periods determined by the specific type of research.
It is strongly recommended that, wherever possible, research data be stored in the University's data store or network storage, so that it is backed up regularly and readily available to team members when required. Storing data there also supports long-term access by providing persistent identifiers.
Sometimes personal hard drives or external storage devices such as DVDs or USB drives are more convenient than network storage. If you do choose to rely on non-network devices, always store the master copy of your data on the network, as it is all too easy for portable devices to be lost or for the data on them to become corrupted.
You should prepare for data preservation from the start of your project. The earlier you start planning, the easier it will be to ensure your data remains durable and accessible into the future.
Here are some things to consider:
File formats
- file formats may become obsolete over time
- choose file formats that are widely adopted, have a history of backward compatibility and have an open specification
- a guide to current robust file formats may be found at the UK Data Archive.
Software
- like file formats, software may also become obsolete
- choose software that is widely used and well supported and that, ideally, uses widely adopted file formats.
File store media
- store your data on a network drive, to ensure the data is properly backed up and can be migrated to other media if needed.
- when multiple members of a research team have access to the same data files, it can be difficult to track changes from one person to the next, which makes version control very important
- if the software you are using does not support version control (e.g. Microsoft Word), you may need to set up explicit rules to ensure version tracking of files
- these rules could include keeping a single master copy of each file and including the date and time in file names.
- if you are not using network storage, ensure you regularly move your data onto fresh media, to guard against media degradation
- ensure you have a robust backup strategy that includes off-site storage. Many tools and services will perform automatic backups at scheduled times, for instance the backup tools built into Windows
- regularly restore and check files to ensure that your backup strategy is working as expected.
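One way to check restored files is to compare checksums of the originals against the restored copies. The following is a minimal Python sketch under stated assumptions, not a prescribed procedure: the function names `sha256_of` and `compare_trees` are hypothetical, and it assumes both the original data and a test restore are available as ordinary folders.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original: Path, restored: Path) -> list[str]:
    """List files that are missing from, or differ in, the restored copy.

    Hypothetical helper: paths are reported relative to the original folder.
    """
    problems = []
    for source in sorted(p for p in original.rglob("*") if p.is_file()):
        relative = source.relative_to(original)
        candidate = restored / relative
        if not candidate.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source) != sha256_of(candidate):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list from `compare_trees` suggests the restore matches the originals. Any entry is a prompt to investigate the backup before the data is needed in earnest.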
Ownership and access
- allocate responsibility for data preservation to a member of your research team
- determine who will require access to your preserved data files, and who will have ongoing responsibility for and ownership of the data, so that data is not lost if staff move on.
- determine how long your data should be kept, in compliance with the Deakin University Research Conduct Policy (Clause 14-28) and Australian Code for the Responsible Conduct of Research Section 2.
File organisation and file naming conventions
- organise your files in a tiered folder structure, with folder and file names clearly descriptive of the contents
- this makes particular files and specific data easier to find.
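A tiered folder structure and date-stamped file names can be generated consistently rather than typed by hand. The sketch below is illustrative only: the helper names `versioned_name` and `data_path`, the project and stage folder names, and the name pattern itself are assumptions, not a Deakin convention.

```python
from datetime import datetime
from pathlib import Path


def versioned_name(stem: str, suffix: str, when: datetime, version: int) -> str:
    """Build a descriptive file name with a date/time stamp and version number.

    Hypothetical pattern: <stem>_<YYYY-MM-DD>T<HHMM>_v<NN><suffix>
    """
    stamp = when.strftime("%Y-%m-%dT%H%M")
    return f"{stem}_{stamp}_v{version:02d}{suffix}"


def data_path(project: str, stage: str, filename: str) -> Path:
    """Place a file in a tiered structure: project folder, then stage folder."""
    return Path(project) / stage / filename


# Example with made-up project and file names.
name = versioned_name("interviews", ".csv", datetime(2024, 3, 5, 14, 30), 2)
print(data_path("reef_survey", "raw", name).as_posix())
# prints "reef_survey/raw/interviews_2024-03-05T1430_v02.csv"
```

Putting the year first in the stamp means that sorting files by name also sorts them by date, which makes the latest version easy to spot.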
At any stage of your project, you can deposit your data in a data repository such as Deakin University's data store, Deakin Research Online (DRO) or a subject-specific data centre or archive. The funder or publisher of your research may require you to do so. Some examples of subject-specific archives include: