Organize

Folder Structures

Organize your data by building an efficient, easy-to-follow, hierarchical folder structure. Consider dividing data into the following categories:

  • Project/Unique Identifier
  • Time/Date
  • Location
  • File type

Consider the following example from DataDryad:

[Figure: DataDryad Folder Structure]
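
As a rough illustration, the categories above could be nested in the order listed. The following Python sketch assumes that order (project, then date, then location, then file type). The function name and the example values are hypothetical and should be adapted to your project:

```python
from pathlib import Path


def build_data_dir(root, project, date, location, file_type):
    """Return root/project/date/location/file_type, creating it if needed.

    The nesting order is an assumption taken from the order of the
    categories in the list above; choose the order that suits your project.
    """
    target = Path(root) / project / date / location / file_type
    # parents=True creates every missing level of the hierarchy;
    # exist_ok=True makes the call safe to repeat.
    target.mkdir(parents=True, exist_ok=True)
    return target
```

For example, build_data_dir("research_data", "ProjChameleon", "2021-12-05", "SiteA", "tiff") would create and return the folder research_data/ProjChameleon/2021-12-05/SiteA/tiff.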


File Naming Conventions

File naming conventions are standards that help you name files descriptively and consistently. Good file naming practices keep your work organized and ensure that collaborators can easily identify and interpret content. File naming conventions can be used to track file versions, identify when changes were made, find data, and determine how files relate.

Following file naming standards makes it easier for data to be processed using automated workflows such as scripts and other tools.
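
To show why consistent names help automation, here is a minimal Python sketch that splits a file name into its parts. It assumes a hypothetical convention of YYYY-MM-DD_ProjectName_Description.filetype; the pattern and the function name are illustrative only and would need to match the convention you actually adopt:

```python
import re
from datetime import date

# Hypothetical convention: YYYY-MM-DD_ProjectName_Description.filetype
NAME_PATTERN = re.compile(
    r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"
    r"_(?P<project>[A-Za-z0-9]+)"
    r"_(?P<description>[A-Za-z0-9]+)"
    r"\.(?P<ext>[A-Za-z0-9]+)"
)


def parse_name(file_name):
    """Return the parts of a conforming file name, or None if it does not conform."""
    match = NAME_PATTERN.fullmatch(file_name)
    if match is None:
        return None
    try:
        # date() rejects impossible dates such as month 13.
        when = date(int(match["year"]), int(match["month"]), int(match["day"]))
    except ValueError:
        return None
    return {
        "date": when,
        "project": match["project"],
        "description": match["description"],
        "ext": match["ext"].lower(),
    }
```

A script can then sort, filter, or group files by date or project without opening them, and any file that returns None can be flagged for renaming.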

Documenting your established naming convention in your metadata helps other researchers reproduce the results of your research more quickly and efficiently.

Establish a Naming Convention

Define your file naming convention based on the aspects that are important to the project, including naming conventions common in your discipline. Set the standard at the beginning of your project and follow it consistently. Use the following as a general guide:

  • Keep file names shorter than 30 characters
  • Use descriptive names and include characteristics such as:
    • Unique identifiers (grant numbers, project numbers, etc.)
    • Project, study, or experiment name
    • Experiment conditions
    • Location
    • Researcher name/initials
    • Date or date range
    • Version number
    • File type
  • Avoid special characters (/ \ : * < > [ ] $ & ~ ! # ? { } ' ^ %)
  • Use underscores ( _ ), hyphens ( - ), or camelCase instead of spaces or periods
  • Include dates in the file name, such as YYYYMMDD or YYYY-MM-DD
    • Consider structuring the date at the beginning to keep files in chronological order
  • Use leading zeros for sequential numbering for version tracking
    • Avoid terms such as “final”, “complete”, or “revised”
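
Several of the guidelines above can be checked automatically. The Python sketch below is one possible approach, not a complete validator. It assumes that the 30-character limit applies to the whole file name including the extension, and that the text after the last period is the extension. The function and constant names are hypothetical:

```python
# Characters from the "avoid" list above (straight and curly quotes included).
SPECIAL_CHARS = set("/\\:*<>[]$&~!#?{}^%'‘’")
DISCOURAGED_TERMS = ("final", "complete", "revised")


def check_name(file_name, max_length=30):
    """Return a list of guideline problems found in file_name (empty if none)."""
    problems = []
    # Assumption: everything after the last period is the extension.
    stem, dot, _ext = file_name.rpartition(".")
    if not dot:
        stem = file_name

    if len(file_name) >= max_length:
        problems.append(f"not shorter than {max_length} characters")
    found = sorted(set(file_name) & SPECIAL_CHARS)
    if found:
        problems.append("special characters: " + " ".join(found))
    if " " in file_name:
        problems.append("contains spaces")
    if "." in stem:
        problems.append("contains periods besides the extension separator")
    lowered = file_name.lower()
    for term in DISCOURAGED_TERMS:
        if term in lowered:
            problems.append(f"contains the term '{term}'")
    return problems
```

A name that follows the guidelines returns an empty list, while a name such as December_003_data! is reported for its exclamation mark. The sketch does not check the date format or the descriptive content, which still need human judgment.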

Examples

Poor

  • December_003_data!
  • FINAL_manuscriptdraft_FINAL_05
  • 5432-Jane.Doe-10/5
  • Meeting notes March 3.doc


Better

  • 2021-12-05_ProjChameleon_MeetingNotes.pdf
    • [YYYY-MM-DD_ProjectName_Description.filetype]
  • 20210703_MicroscopeXLR_1200_07.tiff
    • [YYYYMMDD_InstrumentName_Time_ImageID.filetype]
  • AudioExperiment_Analysis_005.wav
    • [ExperimentName_Analysis_Version.filetype]
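
Zero-padded version numbers like the 005 in the last example can be generated rather than typed by hand. The following Python sketch assumes the ExperimentName_Analysis_Version.filetype pattern shown above. The function name and the default width of three digits are hypothetical:

```python
def versioned_name(experiment, label, version, ext, width=3):
    """Build a name such as AudioExperiment_Analysis_005.wav.

    The version is padded with leading zeros to `width` digits so that
    names sort in version order when listed alphabetically.
    """
    return f"{experiment}_{label}_{version:0{width}d}.{ext}"
```

With padding, version 2 (002) sorts before version 10 (010) in an alphabetical file listing. Without padding, 10 would sort before 2.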


File Formats

It is best practice to choose file formats that can stand the test of time. To increase the probability that your data remains usable in the future, choose file formats that have the following characteristics:

  • Open/sustainable (with documented standards)
  • Non-proprietary (vendor-independent)
  • Commonly used by the research community
  • Uncompressed or losslessly compressed (no “lossy” compression)
  • Standard representation (such as ASCII or Unicode)
  • No embedded files, programs, or scripts

Recommended Formats 

  • Image: JPEG, JPEG 2000, PNG, TIFF
  • Video: MPEG-2, MPEG-4, MOV
  • Audio: AIFF, WAVE
  • Text: plain text (TXT), HTML, XML, PDF/A
  • Containers: TAR, GZIP, ZIP
  • Databases: XML, CSV

In cases where it may be necessary to preserve the proprietary file format, document which software and version were used, and provide a copy of the files in a converted open format.
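
One way to find files that may need a converted open copy is to compare their extensions against the recommended formats. The Python sketch below is a rough aid under stated assumptions: the mapping from formats to extensions is our own guess at common extensions, not part of the list above, and the function name is hypothetical. An extension alone cannot confirm a format; for example, .pdf does not guarantee PDF/A.

```python
from pathlib import Path

# Assumed common extensions for the recommended formats listed above.
RECOMMENDED_EXTENSIONS = {
    ".jpg", ".jpeg", ".jp2", ".png", ".tif", ".tiff",  # images
    ".mpg", ".mpeg", ".mp4", ".mov",                   # video
    ".aif", ".aiff", ".wav",                           # audio
    ".txt", ".html", ".xml", ".pdf",                   # text
    ".tar", ".gz", ".zip",                             # containers
    ".csv",                                            # tabular data
}


def needs_open_copy(file_name):
    """Return True if the file's final extension is not in the recommended set."""
    return Path(file_name).suffix.lower() not in RECOMMENDED_EXTENSIONS
```

Files flagged this way, such as a spreadsheet saved as results.xlsx, are candidates for documenting the software and version used and for exporting a copy in an open format.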

Additional Guidance


Organize Projects

Learn how to organize projects with the Open Science Framework (OSF).

 
