The Responsible Collection, Retention, Sharing, and Interpretation of Data
Author(s): Caroline Whitbeck
Background and Module Contents
"Research data include detailed experimental protocols, primary data from laboratory instruments, and the procedures applied to reduce and analyze primary data." Responsible Science, Volume I: Ensuring the Integrity of the Research Process, Panel on Scientific Responsibility and the Conduct of Research, National Academy of Sciences, National Academy of Engineering, Institute of Medicine, 1992.P. 138
The integrity of research results is the sine qua non of scientific research. To ensure the integrity of research results, data must be treated with a scrupulousness that exceeds the care with which we treat most information in daily life. Fabrication or falsification of data is, of course, unacceptable, but there are many other matters of responsible data collection, retention, sharing and interpretation that bear on the integrity of data or on other matters of research ethics, such as the fair treatment of collaborators (including fair apportionment of credit) and the preservation of the confidentiality of research subjects or of the proprietary knowledge of sponsors and collaborators. There are also prudential reasons for many of the same norms for the collection, retention, sharing and interpretation of data, such as preservation of one's own claims to priority in discovery or invention, although the ethical matters will be the focus of this module.
Some general norms for the responsible management of data can be simply stated and are found in various forms in recent policy statements of many research institutions and funding agencies. Among them are:
- The primary data, the methods used to obtain them, and the procedures applied to the primary data to create compilations of them or derivations from them must be accurately reported.
- If the primary data are based on human observation, those observations should be recorded promptly and accurately and in sufficient detail to preserve the record of factors that might turn out to be significant, and in a way that minimizes doubt about the time of the occurrence or the time at which it was recorded.
- The data record should be kept reasonably free from risk of damage. Where practicable, copies may be made of the data either to facilitate sharing among collaborators or to further safeguard it.
- Necessary research materials should be made available to others who attempt to replicate your work.
- Institutions that are the recipients of research grants own the data from those research projects. The Principal Investigator (PI) for a project has custody of that data and primary responsibility for maintenance of the data record and such matters as preserving the confidentiality of sensitive information about human subjects, if any, in the data record. Collaborators on the research project for which the data was collected, including trainee collaborators, have the right of access to the data.
- The research data should be preserved for a reasonable number of years after the appearance of final reports or publications resulting from the research. The amount of time varies with the nature of the research.
Good data management practices depend on the character of the data. Data may be in the form of a written record, photographs, gels, or, as in high energy physics, in the form of a computer record of masses of data, which, though primary, is filtered as it is collected to exclude what is judged to be "noise". Students and other trainees need to have field-specific criteria for data management made explicit in order to understand the specific actions required of them.
This module informs participants about the emerging literature on responsible data management. Through the use of scenarios that bring out many areas of potential confusion or conflict, the module helps groups to develop agreement on more specific field-specific and institution-specific norms for supervisors and trainees in their relationship with one another. Topics include data collection, compilation of primary data, interpretation of data, ownership of, custody of, and access to the data, and issues of the dissemination of research results.
Method and Scenarios
- Distribution of scenarios and related discussion questions to the students and faculty.
Below are scenarios from other modules that also raise issues of responsible data management
- Panel discussion based on those scenarios and questions and any others that students or faculty wish to add.
Readings (recommended preparation for the discussion of scenarios)
If you do biomedical research, it is useful to read the following brief sections of the International Committee of Medical Journal Editors' (ICMJE) "Uniform Requirements for Manuscripts Submitted to Biomedical Journals." This statement was published in 1997 in the New England Journal of Medicine 336: 309-315, and was updated in May 2000.
If you are in the physical sciences or engineering, read either the detailed advice given by the ICMJE above or the less detailed statement from a source more familiar to engineers and chemists, the Ethical Guidelines to Publication of Chemical Research by the American Chemical Society (ACS). These guidelines were first created in 1985 and have served as a model for many other societies, including the Optical Society of America and the American Geological Society. The latest version (January 2000) is available as a PDF file. Notice that although the two sources agree on most points that both address, the ACS Guidelines, in their final section, "Ethical Obligations of Scientists Publishing outside the Scientific Literature," take a more cautious view of the implications for later scientific publication of first publishing one's findings in another way.
If you do not read the ICMJE section, also read:
- An open and candid discussion of data management, including difficult issues and those on which investigators may differ.
- Improved understanding of the data management practices that are appropriate to one's field.
- Establishment of some agreed-upon methods of resolving conflicts and misunderstandings when they arise.
Selected Bibliography (for further reading)
Bailar, John C. 1997. "Science, Statistics and Deception", Research Ethics: A Reader, Deni Elliott and Judy E. Stern, eds., Hanover: University Press of New England.
Bird, Stephanie J., and Housman, David E. 1995. "Trust and the collection, selection, analysis and interpretation of data: A scientist's view." Science and Engineering Ethics 1 (October): 371-82.
Carlson, Adam. 2001. "Data Mining: Finding Nuggets of Knowledge in Mountains of Data", Northwest Science & Technology.
Grinnell, Frederick. 1992. The scientific attitude. 2nd ed. New York: The Guilford Press.
IOM (1989). The responsible conduct of research in the health sciences. Washington DC: National Academy Press.
Jones, Anne Hudson and Faith McLellan (Editors). (2000) Ethical Issues in Biomedical Publication. Baltimore: Johns Hopkins Press.
Macrina, Francis L. 1995. Scientific integrity: An introductory text with cases. Washington, DC: ASM Press.
Marshall, Eliot. 1991. "Fight over Data Disrupts Michigan State Project", Science 251, 23-24.
Marshall, Eliot. 1993. "MSU Officials Criticized for Mishandling Data Dispute", Science 259, 592-594.
Marshall, Eliot. 1993. "Court orders 'sharing' of data", Science 261 (16 July), 284.
Mishkin, Barbara. (1995). "Urgently needed: Policies on access to data by erstwhile collaborators." Science, 270. 927-928.
Rennie, Drummond; V. Yank and Linda Emanuel. (1997) "When authorship fails: A proposal to make contributors accountable." J Amer. Med. Assoc. 278: 579-585. A proposal for a policy change to make investigators less likely to seek or accept credit through the mechanism of undeserved authorship.
Resnik, David. 2000. "Statistics, Ethics, and Research: An Agenda for Education and Reform", Accountability in Research, Vol. 8.
See also Carl Djerassi's novel Cantor's Dilemma (New York: Doubleday, 1989), which describes the agony of an investigator who finds that others are not able to replicate an important finding in a paper he co-authored.
Relevant Web Resources
- The Endocrine Society Ethics Advisory Committee, Ethical Aspects of Conflicts of Interest
- Will download a PDF. This review includes information on conflicts of interest both for the organization at large as well as for individual clinician and researcher members.
- Responsible Use of Statistical Methods
- Part of an online Ethics Series from North Carolina State University. Although this article begins with an emphasis on the question of when a failure of responsible treatment rises to the level of misconduct, it goes on to discuss many subtle issues about data and statistical treatment of them.
- Jack Fry's Interview
- This case raises two primary issues: data sharing and recognition of the contributions of others. The first issue concerns when it is appropriate to share the work of one's colleagues. If the standards for sharing the work of a colleague are not explicitly stated, the door is open for abuse.
- Avoiding Self-Deception in Science by Terry Ann Krulwich
- We are always influenced by our working hypothesis or preconceived notions. How do we avoid letting those notions influence how rigorously we test them and how we interpret data?
- "Publication Ethics: Rights and Wrongs"
- Stephen K. Ritter, Chemical & Engineering News, Washington. Science & Technology, November 12, 2001, Volume 79, Number 46, CENEAR 79 46 pp. 24-31, ISSN 0009-2347. Balancing obligations and interests surrounding dissemination of research is an arduous task. This article explores how objectivity relies on integrity and trust, the hallmarks of the scientific discovery and publication process.
- NIH Data Sharing Information
- This is an extension of NIH policy on sharing research resources, and reaffirms NIH support for the concept of data sharing.
- Responsible Conduct in Data Management
- The purpose of this online module is to give a brief introduction to integrity issues related to data management and to increase researchers' awareness of such issues. The module is intended for self-paced learning, especially by those in the early stages of their research careers, to help them become aware of data management issues that can be encountered when dealing with research data.
Cite this page:
"The Responsible Collection, Retention, Sharing, and Interpretation of Data"
Online Ethics Center for Engineering
National Academy of Engineering
Accessed: Thursday, July 24, 2014