
Research Data Management (RDM): Version Control

A one-stop shop for everything related to research data and how to manage it throughout its entire lifecycle


Version Control

In the course of your research, you will often create multiple versions of your data files. A new version of a dataset is created whenever an existing dataset is reprocessed, corrected, extended with new data, or merged with another dataset. You will need a way to keep track of them all.

Data versioning, also known as version control, helps track changes to dynamic data, that is, data that changes over time. It is a key step in good research data management.

You should always keep original versions of data files, or keep documentation that allows the reconstruction of original files. Version control should apply to different copies or versions of files, files held in different formats or locations, and information that is cross-referenced between files.


Data versioning is important for a few reasons:

✦ Researchers are required to identify and cite the exact dataset used as a research input in order to support research reproducibility and trustworthiness.

✦ Changes to a dataset can change the conclusions derived from it. To be clear about where your conclusions came from, you need to know which version of the data they are based on.

✦ The changes you make might ultimately not be useful. If this happens, you may need to go back to an earlier version of the dataset.


The most basic forms of versioning are manual systems. These usually contain two important elements:

  1. The user adds a sequential number to the file name to indicate which version of the file it is
  2. A change table in each document where versions, dates, authors and details of changes to the file are recorded
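
The first element, a sequential number in the file name, can be sketched in a few lines of Python. This is only an illustration under an assumed naming convention (a stem followed by _v and a zero-padded number, such as survey_v01.csv); the function name next_version_name is hypothetical and not part of any tool mentioned in this guide.

```python
import re
from pathlib import Path

# Assumed convention (not prescribed by this guide): <stem>_vNN<suffix>,
# for example survey_v01.csv.
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<num>\d+)$")


def next_version_name(filename: str) -> str:
    """Return the file name with its version number increased by one.

    A name without a version suffix is treated as unversioned and gets _v01.
    Only the file name is returned, without any directory part.
    """
    path = Path(filename)
    match = VERSION_PATTERN.match(path.stem)
    if match is None:
        return f"{path.stem}_v01{path.suffix}"
    width = len(match.group("num"))  # keep the zero padding (v01 -> v02)
    number = int(match.group("num")) + 1
    return f"{match.group('stem')}_v{number:0{width}d}{path.suffix}"
```

For example, next_version_name("survey_v01.csv") gives "survey_v02.csv". Zero padding keeps versions in the right order when files are sorted by name.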

Version Control Software

While a manual system can work for many research projects, it can become difficult to use once your needs become complex or multiple people begin working on the same dataset.

In these cases, you should consider using version control software. Git is the best known and most widely used software of this kind.

Git is a free and open-source distributed version control system designed to handle everything from small to large projects.

Version Control Table

In research projects, and especially in collaborative work, it is often useful to record what changes were made in each version of a file. A version control table records who did what and when. This can be embedded in the file itself (in headings, notes or metadata), or it can take the form of an attached spreadsheet or readme file.

The UK Data Service versioning page provides an excellent guide to version control strategies and to creating a file history, version control table or notes included within a file, where versions, dates, authors and details of changes to the file are recorded.
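
A version control table kept as a separate file can be maintained with a short script. The sketch below assumes the table is a CSV file with the columns version, date, author and changes; those column names, the function name record_version and the file name in the usage note are illustrative choices, not a format prescribed by this guide or by the UK Data Service.

```python
import csv
from datetime import date
from pathlib import Path

# Assumed column names, following the fields the text lists.
FIELDS = ["version", "date", "author", "changes"]


def record_version(table_path, version, author, changes, on=None):
    """Append one row to a CSV version control table, creating it if needed."""
    table = Path(table_path)
    is_new = not table.exists()
    with table.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()  # header row only for a brand-new table
        writer.writerow({
            "version": version,
            "date": (on or date.today()).isoformat(),
            "author": author,
            "changes": changes,
        })
```

A call such as record_version("version_table.csv", "v02", "A. Researcher", "Removed duplicate rows") adds one dated row. Because rows are only ever appended, the table doubles as a file history.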

Version Control Best Practice

Some best practices for working with versions include:

✦ Always save the raw data file and make sure that no changes can be made to it (e.g. save it as read-only, save it to a different, secure location, or set access rights).
✦ Save an untouched copy of the raw data, and leave it that way. Always work on something other than the "safe" untouched copy, so that you can go back to it and make a new copy if you need to start from scratch.
✦ Keep earlier versions of your files, so that you and others can see the work that has been done. Keep at least the milestone versions, the ones where significant changes were made.
✦ Use a systematic naming strategy to identify different file versions, and include a version number in the file name (e.g. v01, v02, v03).
✦ Be consistent in naming the different versions.
✦ Do not use ambiguous descriptions of the version, such as '_new', '_lastversion', '_revised', '_final' or '_final2'.
✦ If available, use the version control facilities within the software you use.
✦ If appropriate, use version control software (such as Git).
✦ Record the date in the file name, e.g. 20190902_documentation_for_my_data.
✦ Design and use a version control table.
✦ Document which changes were made in which version of the file.
✦ Decide how many versions of a file you want to save, which versions you want to save, how long you want to keep them and how you want to structure them.
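
Two of the practices above lend themselves to a small script: protecting the raw data and avoiding ambiguous version labels. The following is a minimal sketch, assuming a file system that honours write-permission bits; the function names protect_and_copy and has_ambiguous_label are hypothetical, and the list of ambiguous labels is taken from the examples above rather than from any standard.

```python
import re
import shutil
import stat
from pathlib import Path

# Labels the guide warns against, optionally followed by digits (_final2).
AMBIGUOUS = re.compile(r"_(new|lastversion|revised|final)\d*$")


def protect_and_copy(raw_file, work_dir):
    """Mark the raw file read-only and return the path of an editable copy."""
    raw = Path(raw_file)
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)

    # Remove write permission for owner, group and others.
    no_write = ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH)
    raw.chmod(raw.stat().st_mode & no_write)

    copy = work / raw.name
    # copyfile copies contents only, not permissions, so the copy stays writable.
    shutil.copyfile(raw, copy)
    return copy


def has_ambiguous_label(filename):
    """Return True if the file name ends in a label such as _final or _new."""
    stem = Path(filename).stem.lower()
    return AMBIGUOUS.search(stem) is not None
```

All further processing would then happen on the returned working copy, while a check like has_ambiguous_label("thesis_final2.docx") can flag names that should be replaced by a numbered version.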

Knowledge Clip: Version Control

From Knowledge clip: Keeping research data organized [Video], by UGent Open Science, 2021, Ghent University. (https://www.youtube.com/watch?v=7Ogbkx74Ym8). CC BY NC ND.

Charles Darwin University acknowledges the traditional custodians across the lands on which we live and work, and we pay our respects to Elders both past and present.
CRICOS Provider No: 00300K (NT/VIC) 03286A (NSW) • RTO Provider No: 0373 • ABN 54 093 513 649