University of Minnesota

Kathryn A. Martin Library

Research Data Management (RDM)

  •  How will the data be archived for preservation and long-term access?
  •  How long should it be retained?
  •  What file formats? Are they long-lived?
  •  Are there data archives that are appropriate for my data?
  •  Who will maintain my data for the long-term?

Data formats for preservation and UDC

These formats are based on platforms that are low-cost or free and have proven stable. A repository will be able to support these formats and guarantee long-term preservation. In some cases, your data might require emulation or proprietary software to view and examine. For such cases, a data repository has different levels of support governed by a preservation and access policy. For example, the University of Minnesota Digital Conservancy offers the following levels of support:

Full Support (Level 1)

  • Will take all reasonable actions to maintain usability.
  • Actions may include migration, emulation, or normalization.
  • Will ensure access and data fixity.

Limited Support (Level 2)

  • Will take limited steps to maintain usability.
  • May actively transform a file from one format to another to mitigate format obsolescence.
  • Will ensure access and data fixity.

Minimal Support (Level 3) 

  • Will provide for access to the item in its submission file format only.
  • Will work to ensure data fixity.
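
Data fixity, mentioned at every support level above, is commonly checked by recording a checksum when a file is deposited and recomputing it later. The following is a minimal sketch of that idea using SHA-256; it is not a description of how the Digital Conservancy implements fixity, and the function names `sha256_of` and `fixity_ok` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """True if the file's current digest matches the digest recorded at deposit."""
    return sha256_of(path) == recorded_digest.lower()
```

Keeping the recorded digest in a separate manifest file alongside the data lets anyone who downloads the file repeat the check.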

File Format Recommendations

Documents

 

Level One

Level Two

 Level Three

Word Processing

  • PDF/A (.pdf)
  • EPUB (.epub)
  • OpenOffice (.sxw; .odt)
  • PDF (.pdf)
  • Rich Text Format (.rtf)
  • Microsoft Word (.docx)
  • Microsoft Word (.doc)
  • Google Docs (.gdoc)

Text

  • Plain Text (.txt)

 

 

Structured Text

  • XML (.xml)
  • HTML (.html)
  • Cascading Style Sheets (.css)
  • DTD (.dtd)
  • LaTeX (.tex)
  • TeX (.tex)
  • Markdown (.md)

 

Presentation

  • PDF (.pdf)
  • PowerPoint (.pptx)
  • OpenOffice (.sxi/.odp)

 

 

Structured Data

 

Level One

Level Two

Level Three

Tabular Data

  • Comma Separated Values (.csv)
  • Delimited Text (.txt)
  • Microsoft Excel (.xlsx)
  • OpenOffice (.sxc; .ods)
  • Microsoft Excel (.xls)
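
As an illustration of moving tabular data into one of the plain-text formats listed above, the sketch below rewrites a tab-delimited text file as UTF-8 CSV using only the Python standard library. The function name `delimited_to_csv` is hypothetical, and the sketch assumes the source file is UTF-8 encoded and tab-delimited.

```python
import csv
from pathlib import Path


def delimited_to_csv(source: Path, target: Path, delimiter: str = "\t") -> int:
    """Rewrite a delimited text file as UTF-8 CSV; return the number of rows written."""
    rows = 0
    # newline="" lets the csv module handle line endings itself.
    with source.open(newline="", encoding="utf-8") as src, \
         target.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter=delimiter):
            # The writer quotes any field that contains a comma.
            writer.writerow(row)
            rows += 1
    return rows
```

Comparing the returned row count against the source is a quick sanity check that nothing was dropped during conversion.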

Databases

  • SQL DDL (.sql)
  • SQLite version 3 (.sqlite; various)
  • DBF (.dbf)
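
A SQLite database can be exported to a plain-text `.sql` file with Python's built-in `sqlite3` module. The sketch below is one way to do this; the function name `dump_sqlite_to_sql` is hypothetical. Note that `Connection.iterdump()` emits the table definitions (DDL) together with `INSERT` statements for the rows, so the output is more than DDL alone.

```python
import sqlite3
from pathlib import Path


def dump_sqlite_to_sql(db_path: Path, sql_path: Path) -> None:
    """Write the schema and contents of a SQLite database as SQL text."""
    conn = sqlite3.connect(db_path)
    try:
        with sql_path.open("w", encoding="utf-8") as out:
            # iterdump() yields one SQL statement per item.
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()
```

The resulting text file can be inspected in any editor and reloaded later with `executescript()`.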

 

Statistical Data

  • Comma Separated Values (.csv)
  • Delimited Text (.txt)
  • Delimited text with command file for statistical software
  • R (.rdata)
  • SPSS (.por, .sav)
  • SAS (.sas7bcat)
  • Stata (.dta)
  • MATLAB (.mat)

Geospatial

  • Geographic Markup Language (.gml)
  • GeoTIFF (.tiff)
  • GeoPackage Encoding Standard (OGC) Family (.gpkg)
  • ESRI Shapefiles (.shp; .shx; .dbf; various)
  • GeoJSON (.geojson)
  • Keyhole Markup Language (.kml, .kmz)
  • LiDAR (.las, .laz)
  • AutoCAD Drawing Interchange Format (.dxf)
  • ESRI/ArcGIS Geodatabase (.gdb)
  • ESRI Interchange File Format (.e00)
  • CAD data (.dwg)

Other

  • NetCDF (various)
  • HDF (various)
  • JSON (.json)
  • CDF (various)

 

Audio-Visual Materials

 

Level One

Level Two

 Level Three

Image

  • TIFF (.tiff; .tif)
  • PNG (.png)
  • Scalable Vector Graphics (.svg)
  • JPG (.jpeg; .jpg; .jfif; .pjpeg; .pjp)
  • Bitmap or BMP (.bmp)
  • GIF (.gif)
  • Google WebP (.webp)
  • JPEG 2000 (.jp2)
  • Encapsulated Postscript (.eps; .epsf; .ps)
  • GIF (.gif)
  • Macromedia Flash (.swf)
  • Photoshop (.psd; .psb; .acv; .atf)
  • RAW (various)

Audio

  • WAVE (.wav)
  • Broadcast WAVE (.bwf)
  • AIFF (.aif; .aiff)
  • MPEG Audio Layer III (.mp3)
  • Advanced Audio Coding (.mp4; .m4a; .aac)
  • Windows Media Audio (.wma)

Video

  • FFV1
  • Matroska Multimedia Container (.mkv)
  • Audio Video Interleave (.avi)
  • Digital Moving Picture Exchange (.dpx)
  • QuickTime Movie (.mov)
  • Apple ProRes (.mov)
  • MPEG-2 (.mpg; .mpeg)
  • MPEG-4 (.mp4)
  • Windows Media Video (.wmv)
  • High Efficiency Video Coding (.hevc)

 

Archive File Formats

 

Level One

Level Two

Level Three

Email

  • MBox (.mbox)
  • Internet Message Format (.eml)
  • Personal Storage Table (.pst)
  • OLM (.olm) 
  • Microsoft Outlook Item (.msg)
  • PDF (.pdf)

Archive

  • ZIP (.zip)
  • Tape Archive (.tar)
  • CPIO (.cpio)
  • gzip (.gz)
  • 7z (.7z)
  • bzip2 (.bz2)
  • RAR (.rar)

 

3D

 

Level One

Level Two

Level Three

Embedded Texture

  • Extensible 3D (.x3d)
  • glTF (.gltf; .glb)
  • Universal 3D (.u3d)
  • Filmbox (.fbx)
  • Universal Scene Description (USD) (.usd; .usda; .usdc; .usdz)

No-Embedded Texture

 

  • Stereo Lithography (.stl)
  • Reflectance Transformation Imaging (.rti)
  • Polygon File Format (.ply)
  • Wavefront (.obj)
  • COLLADA Digital Asset Exchange (.dae)
  • Blender 3D (.blend)
  • 3D Studio (.3ds)

 

Software/Code*

Level One

Level Two

Level Three

 

  • Computer Program Source Code (various)
  • Compiled or Executable Files (various)