
GEO Metadata Validation Rules

To improve GEO's processing rate and maintain a high standard of metadata collection, GEO has implemented an automated pre-check of the metadata spreadsheet for completeness, formatting, and content. Once the FTP transfer of raw and processed data files is complete, upload the completed metadata file on the Submit Metadata page.

Upon upload, the metadata file is scanned within seconds and checked for formatting and content. For example, if a mandatory section (STUDY, SAMPLES, PROTOCOLS, or, for paired-end studies, PAIRED-END EXPERIMENTS) is missing, you will receive the error message "Uploaded file is missing mandatory section" and a table will appear with the name of the missing section. If you receive an error message, please correct the indicated fields of your metadata file and upload the file again. Uploading a complete metadata file returns the message "Your metadata file has been successfully uploaded". A successful upload places your submission into GEO's processing queue, and you will receive an email notification with your submission summary.
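As an illustration of the section check described above, the following minimal Python sketch reports which mandatory sections are absent. It is not GEO's implementation: the function name `missing_sections`, the `paired_end` flag, and the choice to compare titles exactly after trimming whitespace are assumptions made for this example. Only the section titles themselves come from the rules.

```python
# Section titles as named in the validation rules.
REQUIRED_SECTIONS = ("STUDY", "SAMPLES", "PROTOCOLS")
PAIRED_END_SECTION = "PAIRED-END EXPERIMENTS"


def missing_sections(found_sections, paired_end=False):
    """Return the mandatory section titles that are absent from found_sections.

    Hypothetical helper: titles are compared exactly after stripping
    surrounding whitespace, which is an assumption about matching.
    """
    found = {name.strip() for name in found_sections}
    required = list(REQUIRED_SECTIONS)
    if paired_end:
        # Only paired-end sequencing studies need this extra section.
        required.append(PAIRED_END_SECTION)
    return [name for name in required if name not in found]


# A file without a PROTOCOLS section.
print(missing_sections(["STUDY", "SAMPLES"]))
# A paired-end study without its PAIRED-END EXPERIMENTS section.
print(missing_sections(["STUDY", "SAMPLES", "PROTOCOLS"], paired_end=True))
```

Running a check like this before uploading can catch a missing section locally, but the result of GEO's own pre-check after upload remains the authoritative one.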

| Error name | Error message that you will receive | Explanation and how to fix |
| --- | --- | --- |
| excel_parse_failure | Uploaded file cannot be read. The file must be in Excel version 2007 or higher with .xlsx extension. | The file is not an Excel version 2007 or higher file with .xlsx extension. GEO cannot process metadata files submitted with extension .txt, .csv, or .tsv. Do not compress the metadata Excel spreadsheet. A compressed metadata Excel spreadsheet cannot be read. |
| discontinued_template | It appears that you have used a discontinued version of the metadata spreadsheet. Please use the above link to download the newest version and resubmit. | Old versions of the metadata spreadsheet are not supported. Please download, complete, and submit the newest version of the metadata spreadsheet. |
| missing_worksheet | Uploaded file is missing required worksheet named "Metadata". Please make sure you are using our newest metadata template. | The Excel tab (also called a worksheet) containing the metadata information must be named "Metadata" or "2. Metadata Template". Any other tab name will produce the "missing_worksheet" error. For example, do not rename the tab "RNAseq" or "ChIPseq". Do not include multiple tabs with metadata for separate studies in the same file. GEO needs one metadata file per study. |
| missing_section | Uploaded file is missing mandatory section: | The metadata tab must have sections titled STUDY, SAMPLES and PROTOCOLS. If it is a paired-end sequencing study, the metadata file must also contain a PAIRED-END EXPERIMENTS section. |
| empty_samples_section | SAMPLES section does not list any samples. Please make sure that library names do not start with "#" symbol since such lines are treated as comments and ignored. | Samples must be listed in the SAMPLES section. |
| missing_mandatory_info | Uploaded file is missing mandatory information in the STUDY or PROTOCOLS sections: | Required fields in STUDY and PROTOCOLS sections are: title, summary (abstract), experimental design, extract protocol, library construction protocol, library strategy, data processing description, assembly or genome build, and processed data files format and content. Library strategy refers to the experiment type such as RNA-seq, ATAC-seq, or Hi-C. A table will be provided that lists the fields in STUDY and/or PROTOCOLS sections that are empty. |
| missing_sample_header | SAMPLES section is missing required headers for the table: | Deleting columns from the metadata template in the SAMPLES section is not allowed and will produce the "missing_sample_header" error. A table will be provided which lists the missing headers in the SAMPLES section. You can add columns to the SAMPLES section for additional characteristics appropriate for your samples. For example, you could use the header "overall survival" and provide survival data for each sample. |
| empty_library_name | At least one of the samples has empty library name. | In the SAMPLES section at least one of the samples has empty library name. Sometimes this error is caused by non-empty cells in the SAMPLES section that are not associated with the included samples. |
| missing_sample_info | SAMPLES section is missing required information: | Every sample in the SAMPLES section must include information for library name, title, organism, molecule, single or paired-end, and instrument model. A table will be provided which lists the missing field for each library name. |
| duplicate_library_names | Identical library names were found. Library names must be unique. This check is case insensitive, meaning that "Control1" and "control1" will be considered identical. Identical names are: | Every library name in the SAMPLES section must be unique. A table will be provided which lists the non-unique library name and the number of times it was found (occurrences) in the SAMPLES section. |
| duplicate_sample_titles | Identical sample titles were found. Sample titles must be unique. This check is case insensitive, meaning that "Control1" and "control1" will be considered identical. Identical titles are: | Every title in the SAMPLES section must be unique. A table will be provided which lists the non-unique title and the number of times (occurrences) it was found in the SAMPLES section. |
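The sample-level rules (comment lines, required sample information, and case-insensitive uniqueness) can be approximated locally before uploading. The sketch below is a hypothetical pre-check, not GEO's validator: it assumes each row of the SAMPLES section has already been read into a Python dict keyed by column header, and the helper names, the exact header spellings used as keys, and the example values are all illustrative assumptions.

```python
from collections import Counter

# Required per-sample fields as listed for missing_sample_info. Using these
# strings as dict keys assumes they match the template's column headers.
REQUIRED_SAMPLE_FIELDS = (
    "library name", "title", "organism", "molecule",
    "single or paired-end", "instrument model",
)


def drop_comment_rows(rows):
    """Drop rows whose library name starts with '#'; such lines are treated as comments."""
    return [r for r in rows if not str(r.get("library name") or "").startswith("#")]


def missing_sample_info(rows):
    """Return (library name, [empty required fields]) for each incomplete row."""
    problems = []
    for row in rows:
        empty = [f for f in REQUIRED_SAMPLE_FIELDS if not str(row.get(f) or "").strip()]
        if empty:
            problems.append((row.get("library name") or "", empty))
    return problems


def duplicates(rows, field):
    """Count values of `field` that occur more than once, ignoring case."""
    counts = Counter(str(r.get(field) or "").strip().lower() for r in rows)
    return {value: n for value, n in counts.items() if value and n > 1}


# Illustrative rows only; the values are made up for this example.
base = {"organism": "Homo sapiens", "molecule": "total RNA",
        "single or paired-end": "single", "instrument model": "Illumina NovaSeq 6000"}
rows = [
    {**base, "library name": "Control1", "title": "Control rep 1"},
    {**base, "library name": "control1", "title": "Control rep 2", "instrument model": ""},
    {**base, "library name": "#draft", "title": "ignored comment line"},
]

samples = drop_comment_rows(rows)
if not samples:
    print("empty_samples_section")
print(duplicates(samples, "library name"))   # "Control1" and "control1" collide
print(duplicates(samples, "title"))          # titles are unique here
print(missing_sample_info(samples))          # second row lacks an instrument model
```

Lower-casing before counting mirrors the case-insensitive comparison stated in the duplicate rules. The reported key is therefore the lower-cased name, not the original spelling.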
Last modified: February 22, 2024