About DataSheets

A DataSheet concisely describes a particular scientific dataset in a way that is useful to people who are interested in learning from or teaching with the data. It provides educationally relevant metadata to facilitate exploration of the data by educators and students.

DataSheets highlight the connections between datasets and specific topics in science. They also explicate how to acquire, interpret, and analyze the data. Information is presented at a level appropriate for those who don’t have specialized knowledge of the discipline in which the data are commonly used. The sheets are designed to support novice or out-of-field data users by providing them with the knowledge necessary to obtain and use data appropriately for scientific explorations. DataSheets also provide the meanings for acronyms and other jargon that users are likely to encounter, and include links to journal articles and educational resources that cite or use the data.

DataSheets have a number of content fields, each with a well-defined structure. The goal of this structure is to ensure consistency across the range of DataSheets, enabling users to explore a wide variety of data in an efficient manner. A growing collection of DataSheets is available at http://serc.carleton.edu/usingdata/browse_sheets.html

Generating DataSheets

This document describes the fields of a DataSheet and shows an example entry for each one. Please enter information into the template for a single dataset. Complete as many fields as possible, leaving those that are outside your experience or expertise for others. Save the completed template document by appending the dataset name to the current file name.

DataSheet Template

Author(s)

Indicate who prepared the DataSheet and acknowledge experts who were consulted in the process.

Example:

This DataSheet was created by Heather Rissler of SERC in consultation with Bryan Dias of the Reef Environmental Education Foundation.

Author(s) / This DataSheet was created by Holli Riebeek of NASA Goddard Space Flight Center/SSAI

DataSheet title

Enter the title for the DataSheet in one of the following formats:

A.  Exploring ‘x’ data (where x is the data source and/or type)

Example: Exploring USGS streamflow data

B.  Exploring ‘x’ using ‘y’ data (where x is a topic and y is the source or type of data).

Example: Exploring Population Dynamics using National Marine Mammal Laboratory Data.

DataSheet Title / Exploring vegetation index data from the MODIS sensor on NASA's Terra satellite

URLs

List 2 URLs and link text for each:

1)  link to the homepage of the data site and

2)  direct link to the data access point

Example:

Homepage URL / http://www.ncdc.noaa.gov/paleo/education.html
Link text / Homepage for World Data Center for Paleoclimatology Data
Data Access URL / http://www.ncdc.noaa.gov/paleo/pubs/linsley2000/linsley2000.html
Link text / Access Coral Radioisotope Data
Homepage URL / http://neo.sci.gsfc.nasa.gov/
Link text (generally the name of the page) / NASA Earth Observations
Data access URL / http://neo.sci.gsfc.nasa.gov/Search.html?group=24
Link text (generally “Access x data” where x is the data source or type of data) / Vegetation Index [NDVI] (MODIS)

Data Description

Give a brief description of the data including how they are presented and their geospatial and/or temporal extent. Give enough information for users to decide whether they are interested in exploring the data.

Example:

The site provides processed data in graphical form illustrating salinity, temperature, fluorescence, and density of ocean water for a transect station in the Gulf of Mexico near Sarasota Springs, FL.

Data Description / NASA Earth Observations (NEO) provides global data in a variety of image formats (jpg, png, GeoTiff, Google Earth) and in a data table that can be exported to Excel.

Graphic Representation of Data

When possible, give the URL of a non-copyrighted graphic that shows what the data product at the data access URL looks like. If no graphic is readily available, list simple directions for producing a visible picture of the data.

Example:

Image URL / http://nwis.waterdata.usgs.gov/nwis/peak/?site_no=02037500
Image Credit / Map of annual peak streamflow for the James River near Richmond, VA. Map generated using USGS historical streamflow data.
Image URL / http://neo.sci.gsfc.nasa.gov/Search.html?group=24
Image Caption and Credit / Regularly updated global maps of monthly vegetation conditions from 2000 through at least 2007 displayed in the NEO interface.

Use and relevance

This section should discuss the importance of the data, using as little jargon as possible. It should concisely describe how scientists use these data, including what questions they help answer, and how. It should describe why those questions are important to science as well as their relationship to issues affecting society more broadly.

Example:

The Mote Marine Laboratory Phytoplankton Ecology Program focuses on microscopic plants in the oceans, many of which produce harmful toxins. The program has a particular focus on the marine dinoflagellate Karenia brevis, which is responsible for the Florida red tide. Eating shellfish contaminated by red tide can be fatal to humans. Red tides are controlled by a variety of factors including nutrient availability and viral infections (see Review). Scientists use data generated from the Phytoplankton Ecology Program to better understand conditions under which red tide blooms develop.

Use and relevance / Our lives depend upon plants and trees. They feed us and give us clothes. They absorb carbon dioxide and give off oxygen we need to breathe. Plants even provide many of our medicines and building materials. So when the plants and trees around us change, these changes can affect our health, our environment, and our economy. For these reasons, and more, scientists monitor plant life around the world. Today, scientists use NASA satellites to map the "greenness" of all Earth's lands. These vegetation index maps show where and how much green leaf vegetation was growing for the time period shown.
Scientists routinely produce global NDVI maps to help them monitor and investigate shifts in plant growth patterns that occur in response to climate changes, environmental changes, and changes caused by humans. Farmers and resource managers also use NDVI maps to help them monitor the health of our forests and croplands. These maps are useful for both scientific research and societal benefit.

Data type

Describe the nature of the data (e.g. raw, processed, modeled) and how the data are presented (e.g. graphically, tab-delimited text file).

Example:

Raw data are processed and represented as graphic images in GIF format. Annual images for each measured parameter are available for the years 1998 to 2004.

Data type / Raw satellite data are processed, composited, and projected in a cylindrical map projection to provide monthly images in jpg, png, GeoTIFF, and kmz formats for the years 2000 through the present. Data are also provided in a comma-separated values (CSV) file.

Accessing data

Explain how to obtain the data. This should include specific guidance on how to find the data within the site and what exactly will be available when users reach the data. As necessary (if guidance is not provided by the data access interface), include descriptions of the fields to address and what the default values will produce.

Example:

Users select dates for which they want data and click links to access a GIF file. The GIF images show processed data as maps that illustrate transects and vertical profiles.

Accessing data / Users select the Land tab on the NEO Web site, and then select Vegetation Index [NDVI] (MODIS) from the list of datasets. Select a month to display a global map in the preview box. Click on "analyze this image" to use NEO's analysis tools, or select an image or data format and click on "get image" to download an image or data.

Acronyms, Initials, and Jargon

List and define acronyms, initials, or discipline-specific jargon users will encounter.

Example: RAMP = Radarsat Antarctic Mapping Project

Acronyms, initials, or jargon / NEO = NASA Earth Observations
NDVI = normalized difference vegetation index
MODIS = Moderate Resolution Imaging Spectroradiometer

Data tools

List and briefly describe data manipulation tools (software) that can be used to work with the data, including any tools that are integrated into the data access site. When possible, provide information on obtaining the tools and links to relevant tutorials and tool documentation.

Example (for Data tools)

The USGS site does not provide tools for data manipulation. Raw data can be downloaded and imported into a spreadsheet application for further processing.

(Seems like simply including links to tutorials (like above), and listing them again in the Ed. Resources area might work here)

The Starting Point site provides a tutorial for using Excel. Surf your Watershed is an example from Integrating Research and Education that guides users through the EPA's Surf your Watershed tool, which incorporates data from multiple sites, including USGS streamflow data.

Data Tools / The analysis tool embedded in NEO allows for simple analysis. A guide is available at http://earthobservatory.nasa.gov/Laboratory/ICE/ice_user_guide.html
Data may also be analyzed in a spreadsheet in programs such as Excel.

Visualizing data

Suggest ways in which users might manipulate the data to generate visualizations. To leave the door open for innovative exploration, be explicit that each suggestion is only ‘one way’ to visualize the data (unless the nature of the data is such that only one process will work).

Example:

One way that users can process these data is to create graphs from the raw data. The raw data are provided in HTML tabular format and tab-delimited text files; these can be imported into a spreadsheet application such as Excel. Graphs could be used to visualize changes in streamflow over time and to display the relationship between gage height and streamflow. This dataset could be combined with precipitation datasets to create graphical representations of streamflow-precipitation relationships.

Visualizing data / One way that users can process these data is to use NEO's analysis tool to find an average vegetation value for a given area for a series of months, and then chart that value in a line graph to find out how vegetation changes over time. This dataset could be compared to precipitation, temperature, and cloudiness data to chart the influence of these variables on plant growth.
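
For users who download the data table rather than using NEO's analysis tool, the averaging step can be sketched in a few lines of Python. This is only one way to do it. It assumes each month has been saved as a comma-separated grid of vegetation values. The no-data value of 99999.0, the function name mean_of_grid, and the tiny sample grids are illustrative assumptions, not details taken from NEO; check the downloaded file for its actual no-data value.

```python
import csv
import io

# Assumed placeholder for cells with no measurement (for example, water or
# cloud cover). Confirm the real value in the downloaded file.
FILL_VALUE = 99999.0


def mean_of_grid(csv_text, fill_value=FILL_VALUE):
    """Average every valid cell in a comma-separated grid of values."""
    values = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            cell = cell.strip()
            if not cell:
                continue  # skip empty cells
            number = float(cell)
            if number != fill_value:
                values.append(number)
    if not values:
        return None  # no valid measurements in this grid
    return sum(values) / len(values)


# Hypothetical 2 x 2 grids standing in for two monthly downloads.
monthly_grids = {
    "2005-01": "0.20,0.30\n99999.0,0.10\n",
    "2005-07": "0.60,0.70\n99999.0,0.50\n",
}

# One average per month, ready to chart as a line graph.
series = {month: mean_of_grid(text) for month, text in monthly_grids.items()}
for month, value in series.items():
    print(month, round(value, 2))
```

The resulting month-by-month values can be pasted into a spreadsheet or plotted with any charting tool. Note that this simple average treats every grid cell equally, which matters for regions spanning a wide range of latitudes.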

Collection methods

This section should provide an overview of how the data are collected (including information on instrumentation, transmission of data, and post-processing of data).

Example:

Collection methods have varied historically. The U.S. Geological Survey uses stream-gaging systems to measure water height, with data being transmitted to stations via telephone or satellite. Manual methods for directly measuring or inferring streamflow (discharge) data from gage height have been replaced by Acoustic Doppler current profilers that use sound waves to measure velocity, depth, and path (which are used to calculate streamflow rates).

Collection Methods / As can be seen through a prism, many different wavelengths make up the spectrum of sunlight. When sunlight shines on objects, certain wavelengths are absorbed and other wavelengths are reflected. The pigment in plant leaves -- chlorophyll -- strongly absorbs visible light for use in photosynthesis. The cell structure of the leaves, on the other hand, strongly reflects near-infrared light. The more leaves a plant has, the more these wavelengths of light are affected. Scientists exploit this knowledge of plants' interactions with light to map the density of green vegetation across Earth's landscapes by designing satellite sensors to measure the wavelengths of red and near-infrared light that are absorbed and reflected by plants all over the world.
Subtracting plants' reflectance of red light from their reflectance of near-infrared light, and then dividing that difference by the sum of the red and near-infrared reflectances, produces a value that scientists call the Normalized Difference Vegetation Index (NDVI). The NDVI maps were made using data collected by the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard NASA's Terra satellite, which has been collecting data since late 1999.
A detailed description of the NDVI data product is available at http://modis.gsfc.nasa.gov/data/atbd/atbd_mod13.pdf
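
The NDVI calculation described above can be written out as a short Python sketch. The function name ndvi and the sample reflectance values are hypothetical and chosen only for illustration; they are not MODIS measurements.

```python
def ndvi(red, nir):
    """Return (nir - red) / (nir + red) for one pair of reflectance values."""
    total = nir + red
    if total == 0:
        return None  # undefined when nothing is reflected in either band
    return (nir - red) / total


# Hypothetical reflectances (fractions of incoming light that are reflected).
dense_vegetation = ndvi(red=0.08, nir=0.50)   # leafy surface: low red, high near-infrared
sparse_vegetation = ndvi(red=0.30, nir=0.40)  # mostly bare surface: similar in both bands

print(round(dense_vegetation, 2), round(sparse_vegetation, 2))
```

Because the difference is divided by the sum, the result always falls between -1 and 1 for non-negative reflectances, with higher values indicating denser green vegetation.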

Sources of error

This section should give an overview of the sources of error related to data collection and processing. It should also discuss limits inherent in any underlying model or representation and indicate how these limits circumscribe the applicability of the data set and conclusions drawn from it. When applicable, provide a link to a section of the data site or a reference to a paper discussing error in the particular data set.

Example:

Limits to the accuracy of these data vary historically: current methods for directly measuring discharge are generally more accurate than the historical inference of this parameter. The article ‘Stream Flow Measurement and Data Dissemination Improve’ (link) discusses issues related to streamflow data quality.

Sources of Error / Satellite-based vegetation values can only be measured in cloud-free areas. Persistent cloud or aerosol cover results in limited measurements, which may be a source of error.
Reflected sunlight (glare) may mask vegetation values.
The maps in NEO are projected in such a way that areas near the poles appear larger, relative to their true size, than areas near the equator. This means that if students use NEO to find average vegetation values for a wide area that includes both polar and equatorial regions, the value will be skewed towards polar values.
The monthly composite averages vegetation values for an entire month. Small changes that occur over just a few days may disappear in the monthly composite.
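
One common way to reduce the projection skew described above is to weight each value by the cosine of its latitude before averaging. The sketch below is a minimal illustration, assuming the values come from an equal-angle latitude/longitude grid, where the true area of a grid cell is proportional to the cosine of its latitude. That grid assumption, the function name area_weighted_mean, and the sample values are all illustrative and should be checked against the actual data.

```python
import math


def area_weighted_mean(samples):
    """Average (latitude_degrees, value) pairs, weighting each by cos(latitude)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for latitude, value in samples:
        weight = math.cos(math.radians(latitude))  # relative true area of the cell
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0:
        raise ValueError("no samples to average")
    return weighted_sum / weight_total


# Hypothetical values: one cell at the equator and one at 60 degrees north.
samples = [(0.0, 0.8), (60.0, 0.2)]

simple = sum(value for _, value in samples) / len(samples)
weighted = area_weighted_mean(samples)
print(round(simple, 2), round(weighted, 2))
```

In this example the cell at 60 degrees covers half the true area of the equatorial cell, so it counts half as much, and the weighted average moves towards the equatorial value.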

Scientific resources

List up to 5 known scientific resources that refer to the data set. Include review articles or research articles that discuss topics and/or concepts related to the data. These articles should be relevant to users who are working with the data set and need additional background on the related science.

Example:

·  'Earthquake prediction: A seismic shift in thinking' is an article from Nature that discusses the debate regarding accuracy in predicting earthquakes.

·  'Mantle Convection and Plate Tectonics: Toward an Integrated Physical and Chemical Theory' is an article from Science that reviews the physics of plate tectonics.

Scientific Resources / --MODIS Data product description: http://modis.gsfc.nasa.gov/data/dataprod/dataproducts.php?MOD_NUMBER=13
--Goetz, S.J., Bunn, A.G., Fiske, G.J., and Houghton, R.A. (2005) Satellite-observed
photosynthetic trends across boreal North America associated with climate and fire
disturbance. Proceedings of the National Academy of Sciences, 102(38), 13521-13525.
--Arctic Climate Impact Assessment. (2004). Cambridge University Press.
http://www.acia.uaf.edu/
--Boston University maintains a list of scientific papers about the use of satellite measurements
of vegetation to monitor change:
http://cybele.bu.edu/greeningearth/ge.articles.html
--The Global Inventory Modeling and Mapping Studies (GIMMS) group at NASA
Goddard Space Flight Center maintains a list of additional scientific
studies about vegetation index data:
http://gimms.gsfc.nasa.gov/publications/publicationsGIMMS.htm

Heading for Use in Teaching and Learning