
Research Data Management (RDM): File Formats

A one-stop shop for all things related to research data and how to manage your data throughout its entire lifecycle


File Formats

The file format you choose for your data is a primary factor in someone else's ability to access it in the future.

Think carefully about the best file format to manage, share, and preserve your data. Technology continually changes, and all contemporary hardware and software should be expected to become obsolete.

Consider how your data will be read if the software used to produce it becomes unavailable. Although any file format you choose today may become unreadable in the future, some formats are more likely to be readable than others.

Choosing good formats improves the accessibility of your research and makes it easier for you and for future researchers to use or reuse the data on a wide range of computer systems, regardless of which software packages are available.

When performing research, it’s often necessary to use specialised and proprietary file formats. This may be for many reasons: your method of data analysis, the hardware used, the software available to you, or the need to meet discipline-specific standards. Regardless of these constraints, it’s still important to make a conscious and informed decision when choosing file formats.


At a minimum you should consider:


At later stages of your research, such as when publishing traditional research outputs or making your data publicly available, you should consider transferring your data to a file format that can be utilised by people who may not have access to the exact suite of software you have.

Researchers may sometimes encounter situations where they absolutely must use a proprietary/discouraged file format. In this case, they should make every possible effort to provide a backup version of the file in a different format. They should also provide documentation explaining how to use the problematic format.

Formats likely to be accessible in the future are:

✦ Non-proprietary

✦ Open, with documented standards

✦ In common usage by the research community

✦ Using standard character encodings (e.g., ASCII, UTF-8)

✦ Uncompressed (space permitting)
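
As a rough illustration of the character-encoding point, the Python sketch below checks whether a file decodes cleanly as UTF-8 before it is shared or deposited. The function name `is_valid_utf8` is hypothetical and chosen for illustration only. It reads the whole file into memory, so it suits small to medium files.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file at `path` decodes as UTF-8 without errors.

    Hypothetical helper for illustration. Plain ASCII files also pass,
    because ASCII is a subset of UTF-8.
    """
    data = Path(path).read_bytes()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file that fails this check is probably saved in a legacy encoding and may need to be re-exported as UTF-8 from the software that produced it.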

Examples of preferred format choices:
Image: JPEG, JPEG 2000, PNG, TIFF
Moving images: MOV, MPEG, AVI, MXF
Text: plain text (TXT), HTML, XML, PDF/A
Audio: AIFF, WAVE, MP3, MXF
Containers: TAR, GZIP, ZIP
Databases: XML, CSV
Statistics: ASCII, DTA, POR, SAS, SAV, R
Geospatial: SHP, DBF, GeoTIFF, NetCDF
Tabular data: CSV
Web archive: WARC

A list of recommended file formats

✦ The ETH Zurich library has a list of recommended file formats for data preservation.

✦ The UK Data Service Recommended File Formats table can help you use a file format best suited to long term accessibility.

Tabular data

Tabular data warrants special mention because it is so common across disciplines, mostly as Excel spreadsheets.

Favour open, low-tech formats

If you do your analysis in Excel, you should use the "Save As..." command to export your work to .csv format when you are done.

CSV (comma-separated values) files may not look as good as a native Excel file, but they have multiple advantages when preserving tabular data:

✦ They are simple: they can be opened and read even with a simple text editor.

✦ They are open: the development of software that can use them is not hindered by intellectual property.

✦ Being open also means they are not attached to a single software system and are compatible with many different options.
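
To illustrate how little software a CSV file needs, here is a minimal Python sketch that writes a small table to a UTF-8 CSV file and reads it back using only the standard library. The function names `write_table` and `read_table` are invented for illustration and are not part of any library.

```python
import csv


def write_table(path, header, rows):
    """Write a header row and data rows to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_table(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Note that every value comes back as text when the file is read: CSV stores no data types, formulas, or formatting, which is part of what keeps it simple.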


Your spreadsheets will be easier to understand and export if you follow best practices when you set them up, such as:

✦ Don't put more than one table on a worksheet

✦ Include a header row with an understandable title for each column

✦ Create charts on new sheets; don't embed them in the worksheet with the data
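
The header-row guideline can be checked automatically once a sheet has been exported to CSV. The sketch below is one possible check, assuming a UTF-8 CSV file whose first row is meant to be the header. `check_header` is a hypothetical helper, not part of any library.

```python
import csv


def check_header(path):
    """Return a list of problems found in the first row of a CSV file.

    Hypothetical helper: flags an empty file, blank column titles and
    duplicate column titles. An empty list means no problems were found.
    """
    with open(path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f), None)

    if header is None:
        return ["file is empty"]

    problems = []
    titles = [title.strip() for title in header]
    for index, title in enumerate(titles, start=1):
        if not title:
            problems.append(f"column {index} has no title")
    seen = set()
    for title in titles:
        if title and title in seen:
            problems.append(f"duplicate column title: {title}")
        seen.add(title)
    return problems
```

This check cannot tell whether the titles are understandable, or whether the first row is a header at all rather than data. That still needs a human reader.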


When this is not possible, favour the most popular file format

While Word or Excel files are not open or low-tech formats, their ubiquity means that they should remain readable in the foreseeable future. You might lose some of the formatting or formulas, but some sort of compatibility should remain.

Some formats that originated as proprietary were later standardised specifically for preservation purposes: PDF/A, an ISO-standardised subset of PDF, will stand the test of time better than the average PDF.

Whenever you use a specific format and software, you should always document the version of the software you used to create, use, and save the data.
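
One lightweight way to record this is a small sidecar file saved next to the data. The sketch below writes the Python version and operating system to a JSON file. The function name `write_environment_record`, the field names, and the `extra` parameter are assumptions made for illustration. For software that Python cannot detect, you would enter the name and version by hand.

```python
import json
import platform
from datetime import date


def write_environment_record(path, extra=None):
    """Write a JSON sidecar file describing the software environment.

    `extra` is an optional dict of software names and versions supplied
    by hand (illustrative parameter, not a standard).
    """
    record = {
        "recorded_on": date.today().isoformat(),
        "python_version": platform.python_version(),
        "operating_system": platform.platform(),
        "other_software": extra or {},
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, indent=2)
    return record
```

Because the record is plain JSON text, it stays readable in any text editor, in keeping with the advice to favour open, low-tech formats.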

Knowledge Clip: File Formats

From Knowledge clip: file formats [Video], by UGent Open Science, 2020, Ghent University. (https://www.youtube.com/watch?v=kxxlQnc8u1I). CC BY.

Charles Darwin University acknowledges the traditional custodians across the lands on which we live and work, and we pay our respects to Elders both past and present.
CRICOS Provider No: 00300K (NT/VIC) 03286A (NSW) • RTO Provider No: 0373 • ABN 54 093 513 649