Data Acquisition, Management, Sharing and Ownership
Good data management begins with creating a record of research that thoroughly, accurately, and clearly documents the work and evidence that went into creating a scholarly product, such as a paper, book, patent, or computer program. Beyond data collection, good data management also includes recognizing who owns data, when and how data should be shared, and when data can be destroyed.
What is Data?
Most definitions of data are very broad. For example, the National Institutes of Health (NIH) uses the following definition in its grants manual in connection with rules on the availability of research results:
For this purpose, "data" means recorded information, regardless of the form or media on which it may be recorded, and includes writings, films, sound recordings, pictorial reproductions, drawings, designs or other graphic representations, procedural manuals, forms, diagrams, work flow charts, equipment descriptions, data files, data processing or computer programs (software), statistical records, and other research data (NIH Grants Policy Statement (03/01)).
Further, NIH requires that researchers who receive its funds make available not only data, but also "unique resources," to other scholars. This includes a wide range of information and biologicals, such as "synthetic compounds, organisms, cell lines, viruses, cell products, and cloned DNA" (NIH Grants Policy Statement (03/01)).
Data Collection Guidelines
The most rigorous standards for data collection come from industry and human subjects research. Since Congress passed the Bayh-Dole Act in 1980, which gives universities control over intellectual property created by researchers with federal grants, patenting has significantly increased on university campuses and, with this trend, universities have moved towards industrial standards for data collection. These standards focus on what should be recorded in a laboratory notebook and how a notebook should be kept. Guidelines often recommend that notebooks include:
- Descriptions of reasons for experiments
- Experimental protocols
- Observations, measurements, and other experimental results
- Printouts, photographs, and other machine-generated data
- Mathematical calculations performed on raw data
- Brief interpretations of the results
The following style conventions are widely recognized for laboratory notebooks:
- Permanent binding
- Consecutively numbered pages
- Tables of contents
- Explanations of abbreviations
- Dated entries
- Recording both dates when the date of the experiment differs from the date of recording
- Dating and initialing all changes
- Keeping legible and clear records in permanent ink
- Periodic review and signing of notebooks by someone not directly involved in the research
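For groups that keep an electronic record alongside a bound notebook, the conventions above can be sketched as a small data structure. This is only an illustrative sketch, not a format prescribed by NIH or any other agency: the class names (`Notebook`, `NotebookEntry`, `Amendment`) and all field names are hypothetical, chosen to mirror the items in the two lists.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Amendment:
    """A change to an entry, dated and initialed (hypothetical structure)."""
    changed_on: date
    initials: str
    note: str


@dataclass
class NotebookEntry:
    """One numbered page; fields mirror the recommended notebook contents."""
    page: int
    experiment_date: date   # when the work was done
    recorded_date: date     # when it was written up (may differ)
    purpose: str            # reason for the experiment
    protocol: str
    results: str            # observations, measurements, calculations
    interpretation: str = ""
    amendments: List[Amendment] = field(default_factory=list)

    def amend(self, changed_on: date, initials: str, note: str) -> None:
        """Append a change; the original text is never overwritten."""
        if not initials.strip():
            raise ValueError("every change must be initialed")
        self.amendments.append(Amendment(changed_on, initials, note))


@dataclass
class Notebook:
    abbreviations: Dict[str, str] = field(default_factory=dict)
    entries: List[NotebookEntry] = field(default_factory=list)
    reviews: List[Tuple[date, str]] = field(default_factory=list)

    def add_entry(self, **fields) -> NotebookEntry:
        """Add an entry on the next consecutively numbered page."""
        entry = NotebookEntry(page=len(self.entries) + 1, **fields)
        self.entries.append(entry)
        return entry

    def table_of_contents(self) -> List[Tuple[int, str]]:
        """Return (page, purpose) pairs in page order."""
        return [(e.page, e.purpose) for e in self.entries]

    def sign_review(self, reviewed_on: date, reviewer: str) -> None:
        """Record a periodic review by someone outside the project."""
        self.reviews.append((reviewed_on, reviewer))
```

In this sketch, pages are numbered automatically so they stay consecutive, and corrections are added as separate dated, initialed amendments rather than edits to the original text, which is the electronic analogue of writing in permanent ink.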
A traditional value of academic communities has been the sharing of research results. The federal government requires that data and unique resources created with its funding be shared and encourages timely dissemination of results through publication and presentation in academic venues. In the last two decades, the growing emphasis on industrial collaboration and patenting in medicine and the life sciences has challenged this tradition of open sharing. For more on this topic, see the section on collaborative research.
Guidelines for data retention range from three years (NIH regulations) to twenty-three years (Patent Office). When no other concerns supersede, state regulations may apply.
What Rules Apply?
The National Institutes of Health has one of the most complete policies on data ownership, data sharing, and data retention, set out in the NIH Grants Policy Statement, and is a good place to begin to understand the issues. Other funding agencies may have their own policies concerning data management, so investigators should check with the relevant funding agency.