Folder Structures
Organize your data by building an efficient, easy-to-follow, hierarchical folder structure. Consider dividing data into the following categories:
- Project/Unique Identifier
- Time/Date
- Location
- File type
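The categories above can be nested into a folder path. The following is a minimal sketch, assuming one folder level per category in the order listed; the function name `build_data_path` and the sample values (`ProjA`, `SiteNorth`) are hypothetical, and the order should follow whatever suits your project.

```python
import tempfile
from pathlib import Path


def build_data_path(root, project, date, location, file_type):
    """Return root/project/date/location/file_type, creating folders as needed."""
    path = Path(root) / project / date / location / file_type
    # parents=True creates every missing level; exist_ok=True makes reruns safe.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    folder = build_data_path(tmp, "ProjA", "2021-12-05", "SiteNorth", "csv")
    print(folder.relative_to(tmp).as_posix())
```

Creating folders through one function keeps the hierarchy identical across collaborators and scripts.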
Consider the following example from DataDryad:
File Naming Conventions
File naming conventions are standards that help you name files descriptively and consistently. Good file naming practices will help you keep your work organized and ensure content can be easily identified and interpreted by collaborators. File naming conventions can be used to track file versions, identify when changes were made, find data, and determine how files relate.
Following file naming standards makes it easier for data to be processed using automated workflows such as scripts and other tools.
Including your established naming convention with your metadata helps other researchers reproduce the results of your research more quickly and efficiently.
Establish a Naming Convention
Define your file naming convention based on aspects that are important to the project, including common naming conventions for your discipline. Set the standard at the beginning of your project and follow it consistently. Use the following as a general guide:
- Keep file names shorter than 30 characters
- Use descriptive names and include characteristics such as:
- Unique identifiers (grant numbers, project numbers, etc.)
- Project, study, or experiment name
- Experiment conditions
- Location
- Researcher name/initials
- Date or date range
- Version number
- File type
- Avoid special characters (/ \ : * < > [ ] $ & ~ ! # ? { } ‘ ^ %)
- Use underscores ( _ ), hyphens ( - ), or camelCase instead of spaces or periods
- Include dates in the file name, such as YYYYMMDD or YYYY-MM-DD
- Consider structuring the date at the beginning to keep files in chronological order
- Use leading zeros for sequential numbering for version tracking
- Avoid terms such as “final”, “complete”, or “revised”
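Several of these guidelines can be checked by a script. The sketch below is one possible reading of them, not a standard tool: the helper names `make_file_name` and `check_file_name`, the `v005`-style version suffix, and the choice to count the extension toward the 30-character limit are all assumptions made for illustration.

```python
import re
from datetime import date

MAX_LENGTH = 30  # guideline: keep names shorter than 30 characters
# Letters, digits, underscores, and hyphens, then a single extension.
ALLOWED = re.compile(r"^[A-Za-z0-9_-]+\.[A-Za-z0-9]+$")
DISCOURAGED = re.compile(r"final|complete|revised", re.IGNORECASE)


def make_file_name(day, project, description, version, extension):
    """Build a date-first name with a zero-padded version number."""
    stem = f"{day.isoformat()}_{project}_{description}_v{version:03d}"
    return f"{stem}.{extension}"


def check_file_name(name):
    """Return a list of guideline problems; an empty list means none found."""
    problems = []
    if len(name) >= MAX_LENGTH:
        problems.append("30 characters or longer")
    if not ALLOWED.match(name):
        problems.append("spaces, extra periods, special characters, or no extension")
    if DISCOURAGED.search(name):
        problems.append("uses a term such as 'final', 'complete', or 'revised'")
    return problems


name = make_file_name(date(2021, 12, 5), "ProjA", "Log", 5, "txt")
print(name, check_file_name(name))
print(check_file_name("FINAL_manuscriptdraft_FINAL_05"))
```

Generating names from one function, rather than typing them by hand, is what keeps the convention consistent over the life of a project.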
Examples
Poor
- December_003_data!
- FINAL_manuscriptdraft_FINAL_05
- 5432-Jane.Doe-10/5
- Meeting notes March 3.doc
Better
- 2021-12-05_ProjChameleon_MeetingNotes.pdf
- [YYYY-MM-DD_ProjectName_Description.filetype]
- 20210703_MicroscopeXLR_1200_07.tiff
- [YYYYMMDD_InstrumentName_Time_ImageID.filetype]
- AudioExperiment_Analysis_005.wav
- [ExperimentName_Analysis_Version.filetype]
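Consistent names like these can be read back by automated workflows. The following is a minimal sketch, assuming a three-part pattern of the form YYYY-MM-DD_ProjectName_Description.filetype; the function `parse_file_name` and the sample file names are hypothetical.

```python
import re
from datetime import datetime

# One named group per part of the assumed naming pattern.
PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})"
    r"_(?P<project>[A-Za-z0-9-]+)"
    r"_(?P<description>[A-Za-z0-9-]+)"
    r"\.(?P<filetype>[A-Za-z0-9]+)$"
)


def parse_file_name(name):
    """Split a conforming file name into its parts, or return None."""
    match = PATTERN.match(name)
    if match is None:
        return None
    parts = match.groupdict()
    try:
        # Reject strings that look like dates but are not, such as 2021-13-45.
        parts["date"] = datetime.strptime(parts["date"], "%Y-%m-%d").date()
    except ValueError:
        return None
    return parts


names = [
    "2021-12-05_ProjA_MeetingNotes.pdf",
    "2021-03-03_ProjA_MeetingNotes.pdf",
]
# Date-first names sort chronologically with a plain alphabetical sort.
for name in sorted(names):
    print(parse_file_name(name))
```

A name that does not follow the pattern returns `None`, which gives a script a simple way to flag files that need renaming.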
File Formats
It is best practice to choose file formats that can stand the test of time. To increase the probability that your data remains usable in the future, choose file formats that have the following characteristics:
- Open/sustainable (with documented standards)
- Non-proprietary (vendor-independent)
- Commonly used by the research community
- Uncompressed or losslessly compressed (no “lossy” compression)
- Standard representation (such as ASCII or Unicode)
- No embedded files, programs, or scripts
Recommended Formats
- Image: JPEG, JPEG 2000, PNG, TIFF
- Video: MPEG-2, MPEG-4, MOV
- Audio: AIFF, WAVE
- Text: plain text (TXT), HTML, XML, PDF/A
- Containers: TAR, GZIP, ZIP
- Databases: XML, CSV
In cases where it may be necessary to preserve the proprietary file format, document the software and version used, and provide a copy of the files converted to an open format.
Additional Guidance
Organize Projects
Learn how to organize projects with the Open Science Framework (OSF).