Formatting and Submission Guidelines for Data Papers
Data papers are a unique type of article published in Ecology, used to present large or expansive data sets, accompanied by metadata which describe the content, context, quality, and structure of the data. Metadata may contain limited statistical analysis of the data; more detailed analysis of data sets could, however, form the core of a companion article. There is no length limit for Data Papers.
Data papers are subject to full peer-review. The review process will evaluate ecological significance and overall quality first, but Data Papers will also undergo further technical review to ensure a high standard of usability, especially with respect to associated metadata.
The data being presented in a Data Paper must be archived with the Metadata by providing it as “Supporting Information for review and publication” at the time of submission. Wiley online provides long-term accessibility and maintenance of Data Papers. Due to the financial liability of long-term hosting and maintenance, there is a one-time fee of $250 at publication for Data Papers. Additional charges apply if the file sizes are deemed excessively large. If your data files are too large to be submitted via ScholarOne, please include a note in the “Data Archiving” field of the online submission form and contact the Peer Review staff at esajournals@esa.org to explain the situation.
Data should be logically and consistently formatted. Our primary goal is to ensure that your files will be accessible and legible to every user, on every platform, for the foreseeable future – as such, we avoid posting files in a proprietary format.
- The most commonly submitted proprietary format is Excel (.xls). Excel spreadsheets should be converted to a plain text format, such as comma-separated values (.csv) or tab-delimited ASCII text (.txt).
- Software should be submitted both as source code and compiled (executable) code. Submitting compiled code without accompanying source code is not acceptable.
- Generally, synthetic data (e.g., figures) can accompany, but not substitute for, raw data in Data Papers.
- Synthetic results normally should be placed within the accompanying metadata text.
- Multiple files should be compressed and submitted together as self-extracting .ZIP or .RAR archives.
- If your operating system doesn’t have built-in functionality for compressing files into an archive, programs such as 7-Zip or WinRAR can do this.
- These folders should be named using the following convention: “DataS1”.
- If several sets of data or code are present, they should each be bundled in their own compressed folder, but named sequentially (Examples: DataS1.zip, DataS2.zip, DataS3.zip, etc.).
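The bundling and naming convention above can be sketched in Python using only the standard library. This is a minimal illustration, not an official ESA tool: the function name `bundle_data_files` and its parameters are hypothetical, and it writes an ordinary .zip archive rather than a self-extracting one.

```python
import zipfile
from pathlib import Path


def bundle_data_files(files, out_dir, index=1):
    """Compress plain-text data files into an archive named DataS<index>.zip.

    Hypothetical helper: `files` is an iterable of paths to .csv or .txt files,
    `out_dir` is the folder that receives the archive, and `index` is the
    sequence number (1 for DataS1.zip, 2 for DataS2.zip, and so on).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive_path = out_dir / f"DataS{index}.zip"
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in map(Path, files):
            # Store each file under its own descriptive name, without local folders.
            archive.write(file_path, arcname=file_path.name)
    return archive_path
```

Calling `bundle_data_files(["plots.csv", "species.csv"], "submission", index=1)` would write `submission/DataS1.zip`; a second, separate set of files would use `index=2`.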
Metadata fully describe the content, context, quality, and structure of the data.
- The metadata should be submitted in a single .DOC or .DOCX file.
- Text should be double-spaced in size-12 Times New Roman font.
- Metadata content should adhere strictly to the metadata content standards derived from Michener et al. (1997; Ecological Applications 7:330–342; see the outline below, in the next section).
- Questions about relevance of specific fields should be directed to the Data Editor, Dr. William Michener.
- Metadata text should generally adhere to the instructions for ESA print journals. The basic formatting rules can be found below.
- All text and all table and figure captions in the Metadata should be double-spaced (three lines per inch) in size-12 Times New Roman font.
- Each line of the Metadata must be numbered to facilitate the review process.
- Lines can either be numbered continuously for the entire document or starting over on each page.
- Use leading zeroes with all numbers < 1, including probability values (e.g., P < 0.001).
- Use the International System of Units (SI) for measurements.
- Consult Standard Practice for Use of the International System of Units (ASTM Standard E-380-93) for guidance on unit conversions, style, and usage.
Assemble the metadata file in the following order. Formatting guidelines for each section follow the list.
- Title, Authors (data compilers), and Authors’ affiliations
- Abstract and Key Words (beginning on a new page)
- Metadata (Class I – Class V)
- Literature Cited
Title
- Titles should be concise and informative. The maximum length is 120 characters (including spaces).
- ScholarOne enforces this limit in the “Title” field of the online submission form.
- If your title does not fit into this field, it must be shortened, and the manuscript should be changed to reflect the shortened title.
- Do not include taxonomic names or numerical series designations in titles.
- Use sentence case, capitalizing only the first word and proper nouns.
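Authors who want to check the length limit before submitting could use a sketch like the following. The function name is hypothetical, and it assumes the limit counts every character, spaces included, after leading and trailing whitespace is removed; ScholarOne's own count may differ in edge cases.

```python
MAX_TITLE_CHARS = 120  # limit stated in the guidelines, spaces included


def title_length_ok(title):
    """Return True if the title fits within the 120-character limit."""
    # Assumption: surrounding whitespace is not counted toward the limit.
    return len(title.strip()) <= MAX_TITLE_CHARS
```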
Authors and data compilers
- For each author, state the relevant address, usually the institutional affiliation of the author(s) during the period when all or most of the data were collected.
- The authors’ present address(es), if different from this, should appear in parentheses.
- Provide a current, corresponding e-mail address to which questions regarding the data set can be directed.
- There can be only one corresponding author on a manuscript.
- Author names must be provided in the same order in the online submission form and in the manuscript.
- Because most Data Papers have many authors, checking author names takes the Peer Review staff a significant amount of time. To expedite the review process, please confirm the following:
- Author names are identical in the manuscript and in the online submission form.
- Author names appear in the same order in the manuscript and in the online submission form.
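For papers with long author lists, the two checks above can be automated with a short comparison. This is only a sketch: the function name is hypothetical, and it assumes the author names have already been copied into two Python lists, one from the manuscript and one from the submission form.

```python
def first_author_mismatch(manuscript_authors, form_authors):
    """Return the 1-based position of the first difference, or None if the lists match."""
    pairs = zip(manuscript_authors, form_authors)
    for position, (in_manuscript, in_form) in enumerate(pairs, start=1):
        # Names must be identical and in the same order; only surrounding
        # whitespace is ignored.
        if in_manuscript.strip() != in_form.strip():
            return position
    if len(manuscript_authors) != len(form_authors):
        # One list has extra authors after the shared prefix.
        return min(len(manuscript_authors), len(form_authors)) + 1
    return None
```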
Abstract and key words
- The abstract should be brief (<350 words) and summarize the database, including the purpose, methods, and results of completed analyses.
- Avoid speculation in the abstract.
- If included, speculation about possible interpretations or applications of your data should play a minor role.
- Do not include any literature citations in the Abstract.
- Common names may be used when convenient after stating the scientific names.
- Please supply up to 12 key words for indexing purposes.
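The two numeric limits above (fewer than 350 words, at most 12 key words) can be checked with a sketch like this one. The function name is hypothetical, and counting words by splitting on whitespace is an assumption; the journal's own count may treat hyphenated terms or symbols differently.

```python
def check_abstract(abstract, key_words):
    """Return a list of problems found; an empty list means both limits are met."""
    problems = []
    # Assumption: a "word" is any run of non-whitespace characters.
    word_count = len(abstract.split())
    if word_count >= 350:
        problems.append(f"abstract has {word_count} words; it should have fewer than 350")
    if len(key_words) > 12:
        problems.append(f"{len(key_words)} key words supplied; the maximum is 12")
    return problems
```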
The organization of the metadata should correspond to the Metadata Standard (see the table under “Metadata Guidelines”). All Classes (Class I-Class V) and appropriate fields must be completed. See the following for general guidelines on what to include in each section of the Metadata.
Research Origin Descriptors. The motivation or purpose of your research should appear here. State the questions you sought to answer, and the background of those questions.
Methods. You should provide enough information to allow someone to repeat your work. A clear description of the experimental design, sampling procedures, and statistical procedures is especially important in metadata describing field studies, simulations, or experiments. If you list a product (e.g., animal food, analytical device), supply the name and location of the manufacturer. Give the model number for equipment used. Supply complete citations, including author (or editor), title, year, publisher, and version number, for computer software mentioned in the metadata.
Data Documentation. Particular attention should be paid to providing comprehensive documentation of the physical structure of the data, known data anomalies, and quality assurance and quality control procedures employed. Contributors are encouraged to provide comprehensive documentation of supplemental descriptors that would facilitate secondary data use and interpretation. Before submitting the Data Paper, contributors should thoroughly review the metadata and verify that physical structure descriptors are enough to permit secondary usage of the data.
Statistical Analysis. Statistical analysis can appear in the metadata, but it should be kept to a minimum. More detailed analyses of data sets could, however, form the core of a companion paper submitted to an ESA print journal.
Metadata content standards using the “Class” system
Class I. Data set descriptors
- Data set identity: Title or theme of data set
- Data set identification code: Database accession numbers or site-specific codes used to uniquely identify data set
- Data set description
- Originator(s): Names and addresses of principal investigator(s) associated with data set
- Abstract: Descriptive abstract summarizing research objectives, data contents (including temporal, spatial, and thematic domain), context and potential uses of data set
- Key words: Location (spatial scale), time period and sampling frequency (temporal scale), theme or contents (thematic scale)
Class II. Research origin descriptors
- Overall project description: [Note: this section may be essential if data set represents a component of a larger or more comprehensive database; otherwise, relevant items may be incorporated into II.B.]
- Identity: Project title or theme
- Originator(s): Name(s) and address(es) of principal investigator(s) associated with project
- Period of study: Date commenced, date terminated, or expected duration
- Objectives: Scope and purpose of research program
- Abstract: Descriptive abstract summarizing broader scientific scope of overall research project
- Source(s) of funding: Grant and contract numbers, names and addresses of funding sources
- Specific subproject description
- Site description
- Site type: Descriptive (e.g., short-grass prairie, blackwater stream, etc.)
- Geography: Location (e.g., latitude/longitude), size
- Habitat: Detailed characteristics of habitats sampled
- Geology, landform: Soils, slope/elevation/aspect, terrain/physiography, geology/lithology
- Watersheds, hydrology: Size, boundaries, receiving streams, etc.
- Site history: Site management practices, disturbance history, etc.
- Climate: Descriptive summary of site climatic characteristics
- Experimental or sampling design
- Design characteristics: Description of statistical/sampling design
- Permanent plots: Dimension, location, general vegetation characteristics (if applicable).
- Data collection period, frequency, etc.: Information necessary to understand temporal sampling regime
- Research methods
- Field/laboratory: Description or reference to standard field/laboratory methods
- Instrumentation: Description and model/serial numbers
- Taxonomy and systematics: References for taxonomic keys, identification and location of voucher specimens, etc.
- Permit history: References to pertinent scientific and collecting permits
- Legal/organizational requirements: Relevant laws, decision criteria, compliance standards, etc.
- Project personnel: Principal and associated investigator(s), technicians, supervisors, students
Class III. Data set status and accessibility
- Latest update: Date of last modification of data set
- Latest archive date: Date of last data set archival
- Metadata status: Date of last metadata update and current status
- Data verification: Status of data quality assurance checking
- Storage location and medium: Pointers to where data reside (including redundant archival sites)
- Contact person(s): Name, address, phone, fax, electronic mail
- Copyright restrictions: Whether copyright restrictions prohibit use of all or portions of the data set
- Proprietary restrictions: Any other restrictions that may prevent use of all or portions of data set
- Release date: Date when proprietary restrictions expire
- Citation: How data may be appropriately cited
- Disclaimer(s): Any disclaimers that should be acknowledged by secondary users
- Costs: Costs associated with acquiring data (may vary by size of data request, desired medium, etc.)
Class IV. Data structural descriptors
- Data set file
- Identity: Unique file names or codes
- Size: Number of records, record length, total number of bytes, etc.
- Format and storage mode: File type (e.g., ASCII, binary, etc.), compression schemes employed (if any), etc.
- Header information: Description of any header data or information attached to file [Note: may include elements related to variable information (IV.B.); if so, could be linked to appropriate section(s)]
- Alphanumeric attributes: Mixed, upper, or lower case
- Special characters/fields: Methods used to denote comments, flag modified or questionable data, etc.
- Authentication procedures: Digital signature, checksum, actual subset(s) of data, and other techniques for assuring accurate transmission of data to secondary users
- Variable information
- Variable identity: Unique variable name or code
- Variable definition: Precise definition of variables in data set
- Units of measurement: Units of measurement associated with each variable
- Data type
- Storage type: Integer, floating point, character, string, etc.
- List and definition of variable codes: Description of any codes associated with variables
- Range for numeric values: Minimum, maximum
- Missing value codes: Description of how missing values are represented in data set
- Precision: Number of significant digits
- Data format
- Fixed, variable length
- Columns: Start column, end column
- Optional number of decimal places
- Data anomalies: Description of missing data, anomalous data, calibration errors, etc.
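Several of the variable descriptors in Class IV (storage type, range for numeric values, missing values) can be drafted directly from a data file and then corrected by hand. The sketch below is one possible approach, not a prescribed procedure: the function names, the `"NA"` missing-value code, and the three storage-type labels are all assumptions made for illustration.

```python
import csv
import io


def _parse_all(values, cast):
    """Return every value converted with `cast`, or None if any conversion fails."""
    try:
        return [cast(value) for value in values]
    except (ValueError, TypeError):
        return None


def summarize_variables(csv_text, missing_code="NA"):
    """Draft per-variable descriptors from the text of a comma-separated file."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    if not rows:
        return summary
    for name in rows[0]:
        values = [row[name] for row in rows]
        present = [value for value in values if value != missing_code]
        entry = {"n_missing": len(values) - len(present)}
        numbers = None
        # Try the narrower type first: integer, then floating point.
        for storage_type, cast in (("integer", int), ("floating point", float)):
            numbers = _parse_all(present, cast)
            if numbers is not None:
                break
        if numbers:
            entry.update(storage_type=storage_type, min=min(numbers), max=max(numbers))
        else:
            # Non-numeric columns, and columns with no non-missing values, land here.
            entry["storage_type"] = "character"
        summary[name] = entry
    return summary
```

The output is only a starting point: variable definitions, units of measurement, and code lists still have to be written by someone who knows the data.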
Class V. Supplemental descriptors
- Data acquisition
- Data forms or acquisition methods: Description or examples of data forms, automated data loggers, digitizing procedures, etc.
- Location of completed data forms
- Data entry verification procedures: Procedures employed to verify that digital data set is error free
- Quality assurance/quality control procedures: Identification and treatment of outliers, description of quality assessments, calibration of reference standards, equipment performance results, etc.
- Related materials: References and locations of maps, photographs, videos, GIS data layers, physical specimens, field notebooks, comments, etc.
- Computer programs and data-processing algorithms: Description or listing of any algorithms used in deriving, processing, or transforming data
- Archival procedures: Description of how data are archived for long-term storage and access
- Redundant archival sites: Locations and procedures followed
- Publications and results: Electronic reprints, lists of publications resulting from or related to the study, graphical/statistical data representations, etc.
- History of data set usage
- Data request history: Log of who requested data, for what purpose, and how data set was actually used
- Data set update history: Description of any updates performed on data set
- Review history: Last entry, last researcher review, etc.
- Questions and comments from secondary users: Questionable or unusual data discovered by secondary users, limitations or problems encountered in specific applications of data, unresolved questions or comments
Acknowledgments should be brief.
Literature Cited
- Check each citation in the text against the Literature Cited to see that they match exactly.
- Format references to conform in style to the ESA print journals.
Tables and figures
- Tables and figures should be embedded in the metadata where appropriate.
- Tables should be in HTML.
- Figures should be embedded .JPG, .GIF, or .PNG files.
Appendices are not acceptable parts of Data Papers.
The files associated with a Data Paper can be separated into three categories.
- Data files which are published as Supporting Information
- These files should be consistently formatted in non-proprietary file formats (such as comma-separated values [.csv] or tab-delimited ASCII text [.txt]).
- Use file names descriptive of the contents and save them within a compressed folder (.zip or .rar) named “DataS1”.
- A metadata document that is published as Supporting Information
- This file should be named “MetadataS1”
- This document is not copy edited or typeset and thus should be formatted for publication by the author.
- Use our guidelines for compiling the Metadata.
- The abstract document for journal publication
- When uploading your files to ScholarOne, use the following file designations:
- Data files and the metadata document should be uploaded as “Supporting Information for review and publication”.
- The abstract document should be uploaded as the “Main Document”.
All Data Papers will receive at least two independent reviews. Final acceptance of Data Papers is by the Data Editor or the Editor-in-Chief.
Instructions for reviewers
The following instructions are sent to each Data Paper reviewer, along with directions for how to access the data and metadata:
Confidentiality. This Data Paper is a privileged communication. Please do not show it to anyone or discuss it, except to solicit assistance with a technical point. If you feel a colleague is more qualified than you to review the Data Paper, do not pass this responsibility on to that person without first requesting permission to do so from the Data Editor. Your review and your recommendation should also be considered confidential.
Time. In fairness to the author(s), you should return your review within 3 weeks. If it seems likely that you will be unable to meet this deadline, please e-mail the Data Editor today.
Conflicts of interest. If you feel you might have any difficulty writing an objective review, please contact the Data Editor. If your previous or present connection with the authors, data compilers, or an author’s institution might be construed as creating a conflict of interest, but no actual conflict exists, please discuss this issue in the cover letter that accompanies your review.
Comments for the authors. What is the major contribution of the Data Paper? What are its major strengths and weaknesses, and its suitability for publication? Please include both general and specific comments bearing on these questions and emphasize your most significant points.
- Importance and interest to users and readers.
- Scientific and technical soundness of the database.
- Degree to which metadata fully describe the content, context, quality, and structure of the database.
Specific comments. Support your general comments with specific evidence in “Comments for the Author(s)”. Comment on any of the following matters that significantly affected your judgment of the database:
Metadata presentation. Are the metadata logically organized and do they adhere to the Metadata Content Standard (see recent examples of Data Papers)? Do the title, abstract, and key words accurately and consistently reflect the major point(s) of the database? Is the writing concise, easy to follow, interesting?
Metadata completeness. Are the metadata complete and sufficient to facilitate interpretation and secondary use of the data? What portions of the metadata should be expanded? Condensed? Deleted?
Data organization. Are the data logically and consistently organized? Is the data format consistent with the format defined in the metadata?
Data quality. Were suitable methods employed to maintain the integrity of the original data and datasets? Are all data anomalies well-documented? Are the metadata sufficient to allow a secondary user to determine how outliers were identified and treated?
Data integrity. Have adequate procedures been employed to allow a secondary user to determine whether errors may have been introduced during data transmission (e.g., checksum techniques, file size)?
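As an illustration of the checksum technique mentioned above, a data originator could publish a digest for each file in the metadata, and a secondary user could recompute it after download. The sketch below uses SHA-256 from Python's standard library; the choice of SHA-256 and the function name are assumptions, since the guidelines do not prescribe a particular checksum algorithm.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals the end of the file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the digest computed by the secondary user matches the one listed in the metadata, the file was transmitted without alteration.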
Methods. Are they appropriate? Current? Described clearly enough so that the work could be repeated by someone else?
Study design. Is the design appropriate and correct? Can the reader readily discern which measurements or observations are independent of which other measurements or observations? Are replicates correctly identified? Are significance statements justified?
Errors. Point out any errors in technique, fact, calculation, interpretation, or style. (For style we follow the “CBE Style Manual, Fifth Edition,” and the ASTM Standard E380-93, “Standard Practice for Use of the International System of Units.”)
Citations. Are all (and only) pertinent references cited?
Fairness and objectivity. If the research premise for the database is flawed, criticize the science, not the scientist. Harsh words in a review will cause the reader to doubt your objectivity; as a result, your criticisms will be rejected, even if they are correct! Comments directed to the authors should demonstrate that:
- You have carefully and thoroughly reviewed the data and metadata.
- Your criticisms are objective and correct, are not merely differences of opinion, and are intended to help the data originator improve his or her Data Paper.
- You are qualified to provide an expert opinion about the research that served as the impetus for the Data Paper.
- If you fail to win the data originator’s respect and appreciation, your efforts will have been wasted.
Anonymity. You may sign your review if you wish. If you choose to remain anonymous, avoid comments to the authors that might serve as clues to your identity.