Ensure that your data will be accessible long after your project is over. It is important to think about a long-term plan from the outset of your project so that you can set aside enough time and resources to accomplish it.
What should I keep?
It is important to be selective about what data you plan to retain, as every file you keep adds to long-term storage and maintenance costs. It’s a good idea to:
- keep anything irreproducible, such as observations specific to a particular time and place,
- retain results that are tied to a specific publication or presentation,
- discard intermediate tests or failed experiments at the end of a project.
How long should I keep it?
Check with your funding agency to find out if there is a specific policy that spells out a data retention period. For publicly funded research in the US, this is often a minimum of three years. It is better to aim for even longer, if possible, in case you or someone else needs the data later on. Five to ten years is a good rule of thumb.
Keep files readable
Making sure your data remain accessible for the long term is a big challenge, especially since technology changes so quickly. Choosing the right file formats can help avoid obsolescence. Use formats that are:
- Non-proprietary, open, documented standards (e.g., .tif, .txt, .csv, .pdf)
- Used commonly in your research community
- Encoded with standard characters (e.g., ASCII, UTF-8)
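As a rough illustration of the checklist above, the following Python sketch flags files that may be hard to read later. It is only a starting point under stated assumptions: the extension lists are hypothetical and simply mirror the example formats named above, and the function names (is_utf8, check_file) are invented for this example. Because ASCII is a subset of UTF-8, plain ASCII files pass the encoding check.

```python
from pathlib import Path

# Assumption: a hypothetical allow-list built from the example formats above.
# Adjust it to the formats commonly used in your research community.
OPEN_EXTENSIONS = {".tif", ".txt", ".csv", ".pdf"}
# Assumption: only these extensions are treated as plain text to be decoded.
TEXT_EXTENSIONS = {".txt", ".csv"}


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (ASCII included)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def check_file(path: Path) -> list[str]:
    """Return a list of human-readable warnings for one file."""
    problems = []
    suffix = path.suffix.lower()
    if suffix not in OPEN_EXTENSIONS:
        label = suffix or "(none)"
        problems.append(f"{path.name}: extension {label} is not on the open-format list")
    elif suffix in TEXT_EXTENSIONS and not is_utf8(path.read_bytes()):
        problems.append(f"{path.name}: text is not valid UTF-8")
    return problems
```

To survey a whole project folder, you might loop over Path("my_project").rglob("*"), call check_file on each regular file, and print any warnings. An extension only hints at a format, so treat the results as prompts for a manual review rather than a guarantee.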
Tools and resources
- Re3data.org is a global registry of data repositories organized by academic discipline. A rating system and faceted browsing can help you find the best place to deposit your data.
- Texas ScholarWorks (TSW) is UT’s web-accessible DSpace repository, managed by UT Libraries. A free and secure place for archiving and sharing faculty research output, it provides persistent URLs, searchable metadata, full-text indexing and long-term preservation.
- The Texas Data Repository (TDR) is now open! Hosted by the Texas Digital Library and based on Harvard University’s Dataverse platform, TDR will serve as a long-term solution for the preservation and dissemination of UT’s research data.