Data Reuse Practices among Zoologists: When Materiality Matters

Elizabeth Yakel, Ph.D.

Zoologists regularly reuse digital data about specimens as well as the specimens themselves. However, we do not yet understand the circumstances under which zoologists rely on digital data rather than view the actual specimen. For example, is the digital data missing or unclear, is the needed data absent from the required standard metadata, or does the research require reexamination of the actual specimen? These questions are important for the development of zoological databases and metadata standards, as well as for museums deciding what data to share with national and international repositories, such as HerpNet, GenBank, FishNet, or the Global Biodiversity Information Facility (GBIF).

Ilerbaig (2010) argues that specimens are records, but so are the databased records representing specimens held in natural history databases. These representations range widely, from abbreviated descriptions to entries with rich metadata, sometimes paired with images or longer DNA sequencing information. Although we have some evidence that social expectations surrounding these databases differ (Bowker 2005; Bourne 2005; Costello et al. 2013), we know relatively little about the dynamics of data reuse (McLaughlin et al. 2001; Pereira 2013; Stoltzfus et al. 2012; Wickett et al. 2012), particularly how and when zoological researchers move between databases and actual specimens when reusing data.

This research project will enhance our knowledge of natural history collections in museums and of the strengths of the data practices of repositories holding information about the specimens in these collections, and it will advance understanding in multiple domains. Findings will contribute to the social study of science, focusing on why data reuse practices differ among zoologists. They will also enrich knowledge about how well and when surrogates can stand in for the actual biological specimens. Finally, the findings will contribute to our knowledge of cyberinfrastructure in zoology by informing the design of tools and services to better support research and data curation practices.

Student Role:

The student will serve as a research assistant for the project and be involved in research meetings with my other students working on other projects. In this way s/he will peripherally participate in a research team and be exposed to a variety of projects. For the “When Materiality Matters” project, the student will reanalyze 33 interviews with and observations of zoologists originally collected by the Dissemination Information Packages for Information Reuse (DIPIR) project. Using NVivo, a qualitative data analysis application, the student will be guided by the mentor in developing a coding scheme, coding the data and establishing interrater reliability, and analyzing the results to better understand how decisions about information selection, particularly databased information versus actual specimens, are made. Even though the student will largely be analyzing existing data, I will provide the opportunity for the student to collect some additional data on this project (interviews and observations) so that the student can develop data collection (particularly interview) skills.
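As an illustration of what establishing interrater reliability can involve, the sketch below computes Cohen's kappa, one common agreement statistic for two coders. The project description does not specify which measure will be used, so the choice of kappa is an assumption; the function name, the code labels ("database", "specimen", "both"), and the sample codes are hypothetical and are not drawn from the DIPIR data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must code the same non-empty set of segments.")

    n = len(coder_a)

    # Observed agreement: share of segments where both coders chose the same code.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: for each code, the product of the two coders' usage rates.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical code throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to six interview segments.
coder_a = ["database", "specimen", "database", "both", "specimen", "database"]
coder_b = ["database", "specimen", "both", "both", "specimen", "database"]

print(round(cohens_kappa(coder_a, coder_b), 2))
```

Kappa corrects raw percent agreement for the agreement expected by chance, so a value near 1 indicates that the coding scheme is being applied consistently, while a value near 0 suggests the code definitions need refinement before analysis proceeds.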

Contribution to Student Academic and Professional Development:

The student will gain experience with collecting, analyzing, and presenting qualitative data and will learn what is required to develop a publishable scholarly paper.

Mentoring Plan:

I will meet with the student several times each week throughout the project. Additionally, I will monitor the student closely during several key phases of the project, including coding, interviewing, and observing data (specimen) reuse in museums. I will give the student constructive and iterative feedback on all aspects of the research. Finally, I will involve the student in the development of the final article on this topic.

References:

Bourne, P. E. (2005). Will a Biological Database Be Different from a Biological Journal? PLoS Computational Biology, 1(3), 179–181. doi:10.1371/journal.pcbi.0010034

Bowker, G. C. (2005). Databasing the World: Biodiversity and the 2000s. In Memory Practices in the Sciences (pp. 107–136). Cambridge, MA: MIT Press.

Costello, M. J., Michener, W. K., Gahegan, M., Zhang, Z.-Q., & Bourne, P. E. (2013). Biodiversity Data Should Be Published, Cited, and Peer Reviewed. Trends in Ecology & Evolution. doi:10.1016/j.tree.2013.05.002

Ilerbaig, J. (2010). Specimens as Records: Scientific Practice and Recordkeeping in Natural History Research. American Archivist, 73(2), 463–482.

McLaughlin, R. L., Carl, L. M., Middel, T., Ross, M., Noakes, D. L. G., Hayes, D. B., & Baylis, J. R. (2001). Potentials and Pitfalls of Integrating Data From Diverse Sources: Lessons from a Historical Database for Great Lakes Stream Fishes. Fisheries, 26(7), 14–23.

Pereira, S. (2013). Motivations and Barriers to Sharing Biological Samples: A Case Study. Journal of Personalized Medicine, 3(2), 102–110. doi:10.3390/jpm3020102

Stoltzfus, A., O’Meara, B., Whitacre, J., Mounce, R., Gillespie, E. L., Kumar, S., & Vos, R. A. (2012). Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis. BMC Research Notes, 5(1), 574. doi:10.1186/1756-0500-5-574

Wickett, K. M., Sacchi, S., Dubin, D., & Renear, A. H. (2012). Identifying content and levels of representation in scientific data. Proceedings of the American Society for Information Science and Technology, 49(1), 1–10. doi:10.1002/meet.14504901199