
Sample Size and Database Requirements for

Resampling and Reporting

BLM's Attainment of the National Riparian PFC Goal

for:

USDI Bureau of Land Management

Washington, DC

by

Paul R. Adamus

Dynamac International, Inc.

Rockville, MD

September 1998

Background

As part of the national Riparian-Wetland Initiative (USDI Bureau of Land Management 1996), the BLM intends to implement a goal of improving, by the year 2005, 90 percent of the "Functioning -- At Risk" riparian miles in the 48 conterminous states. Consequently, BLM needs to know if at least 90 percent of these miles have shifted from the "Functioning -- At Risk" (F/AR) category to the "Proper Functioning Condition" (PFC) category. These categories and the protocol used to determine them site-specifically are explained in Pritchard (1995).

It is projected that of all the riparian miles assessed in the 48 contiguous states by the end of FY99, there will be 17,300 miles that fall in the F/AR category (E. Luse, pers. comm.). BLM's need to know if at least 90% have shifted to the PFC category should not be interpreted to mean that 90% of all assessed miles will be in the PFC category by 2005, but rather that 90% (15,570) of the 17,300 miles that were F/AR in 1999 will have shifted to the PFC category by the year 2005. For example, the number of miles in the PFC category might shift from 14,680 (in 1999, the recognized base year) to 30,250 (in 2005), but this still would represent only 76% of all 39,905 miles assessed by the end of FY99.
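The arithmetic behind these figures can be checked with a short calculation. The sketch below uses only the numbers quoted in the paragraph above; the constant and function names are illustrative and do not come from any BLM system.

```python
# Figures quoted in the text (projected for the end of FY99).
FAR_MILES_1999 = 17_300       # miles rated Functioning -- At Risk
PFC_MILES_1999 = 14_680       # miles already rated PFC
ALL_ASSESSED_MILES = 39_905   # all riparian miles assessed
GOAL_FRACTION = 0.90          # share of F/AR miles that should shift to PFC


def goal_summary():
    """Return (miles that must shift, PFC miles in 2005, PFC share of all miles)."""
    miles_to_shift = round(FAR_MILES_1999 * GOAL_FRACTION)
    pfc_miles_2005 = PFC_MILES_1999 + miles_to_shift
    pfc_share_2005 = pfc_miles_2005 / ALL_ASSESSED_MILES
    return miles_to_shift, pfc_miles_2005, pfc_share_2005


shift, pfc_2005, share = goal_summary()
print(shift, pfc_2005, f"{share:.0%}")
```

The goal is therefore a statement about the 1999 F/AR pool only; the share of all assessed miles in PFC is a separate, smaller percentage.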

The primary purpose of this report is to present a pilot assessment of the number of riparian miles that would need to be resampled to achieve various hypothetical goals for precision in estimating PFC trends.

Results of Pilot Assessment

The number of miles that would need to be resampled depends partly on how certain one needs to be that 90% (not, say, 40% or 60%) of the miles have indeed changed -- in other words, what confidence interval (margin of error) do we desire?

Also, BLM may be interested in focusing on a subset of the 17,300 miles that have been subject to new grazing management strategies since 1999 -- hypothetically, 5000 miles that were F/AR as of 1999.

Finally, BLM may be interested in determining whether the 90% goal has been attained in particular states or river basins, so another subset (say, all 1000 miles rated F/AR in a particular state) might be requested.

Table 1 shows results of our calculations, assuming each of 2 confidence interval widths (.20 and .10, i.e., margins of error of +/- 10% and +/- 5%, described here as 90 and 95% certainty of estimate) and 3 population sizes (17,300, 5000, and 1000 miles). Appendix A contains documentation of the equations and references used to produce Table 1.


Table 1. Recommended sample sizes for PFC reassessments of Functioning - At Risk (F/AR) streams on BLM lands, 90% goal

Desired confidence interval around 90% goal (i.e., allowable margin of error) / Number of F/AR miles anticipated to exist in 1999 base period / Required number of F/AR miles that should be resampled
+/- 5% / 17,300 (1999 anticipated) / 140 (1% of all F/AR)
+/- 10% / 17,300 (1999 anticipated) / 35 (<1%)
+/- 5% / 5000 (management scenario) / 135 (3%)
+/- 10% / 5000 (management scenario) / 34 (1%)
+/- 5% / 1000 (state scenario) / 120 (12%)
+/- 10% / 1000 (state scenario) / 33 (3%)
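For readers who want to see how numbers of this size arise, the sketch below applies the textbook normal-approximation sample size for a proportion, followed by a finite population correction. This is not the method documented in Appendix A. The 95% confidence level, the expected proportion of 0.90, and the function name `sample_size` are all assumptions made for illustration. Under those assumptions the results fall at or near the Table 1 values (for example, 135 and 34 miles for the 5000-mile pool), though not every cell matches exactly.

```python
from statistics import NormalDist


def sample_size(population, margin, p=0.90, confidence=0.95):
    """Miles to resample to estimate a proportion within +/- margin.

    Assumed approach (not taken from Appendix A): normal-approximation
    sample size for a proportion, then a finite population correction
    for a pool of `population` miles.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 at 95%
    n_infinite = z ** 2 * p * (1 - p) / margin ** 2       # effectively unlimited pool
    n_finite = n_infinite / (1 + (n_infinite - 1) / population)
    return round(n_finite)


for pool in (17_300, 5_000, 1_000):
    for margin in (0.05, 0.10):
        print(pool, margin, sample_size(pool, margin))
```

Because the required sample depends far more on the margin of error than on the size of the pool, halving the margin roughly quadruples the sample, while shrinking the pool from 17,300 to 1000 miles changes it only slightly.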


We also calculated sample sizes needed to answer a more open-ended question: What percent of previously assessed sites either:

(a) have changed from F/AR to PFC, or

(b) are still F/AR but are reported to be showing an improving trend?

Table 2. Recommended sample sizes for PFC reassessments of Functioning - At Risk (F/AR) streams on BLM lands, determination of percent of sites that are improving at all.

Desired confidence interval (i.e., allowable margin of error) / Number of F/AR miles anticipated to exist in 1999 base period / Required number of F/AR miles that should be resampled
+/- 5% / 17,300 (1999 anticipated) / 147 (1% of all F/AR)
+/- 10% / 17,300 (1999 anticipated) / 37 (<1%)
+/- 5% / 5000 (management scenario) / 144 (3%)
+/- 10% / 5000 (management scenario) / 37 (<1%)
+/- 5% / 1000 (state scenario) / 129 (13%)
+/- 10% / 1000 (state scenario) / 36 (4%)

From Tables 1 and 2 it is apparent that remarkably few reaches would need to be reassessed, either to measure BLM's 90% goal or simply to determine what percent of the reaches had improved. The phrasing of the statistical question in this case has rather little effect on the predicted sample sizes.

Assuming the usual 3-person team can complete a PFC assessment of a stream at a cost of approximately $200 - $600 per stream mile (W. Elmore, pers. comm.), from Table 1 it is apparent that the field work component needed to achieve goals of a national resampling of PFC sites would cost BLM nationally approximately $7,000 - $216,600, depending on the precision desired and the ease of accessing the stream reaches selected for resampling. Data entry, database management, analysis, mapping, and reporting would add to these costs. Costs to an individual BLM state office for assessing the PFC trend just in their state would be approximately $6,600 - $162,000. Sample sizes and costs would be considerably greater if users needed to test a hypothesis pertaining to a specific range of magnitude of change, e.g., "How certain are we that between 75 and 85% of the sites have improved?"
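The field-work cost is simply the number of miles resampled multiplied by the cost per mile. The sketch below uses a hypothetical helper name and reproduces only the lower ends of the quoted ranges, which follow from the smallest Table 1 samples at $200 per mile. It does not attempt the upper ends, which cannot be derived from the Table 1 sample sizes alone.

```python
COST_PER_MILE_LOW = 200    # dollars per stream mile, low end quoted in the text
COST_PER_MILE_HIGH = 600   # dollars per stream mile, high end quoted in the text


def field_cost_range(miles_resampled):
    """Return (low, high) field-work cost in dollars for a given sample."""
    return (miles_resampled * COST_PER_MILE_LOW,
            miles_resampled * COST_PER_MILE_HIGH)


# Smallest national and state samples in Table 1 (+/- 10% margin of error).
national_low, national_high = field_cost_range(35)   # 17,300-mile pool
state_low, state_high = field_cost_range(33)         # 1000-mile pool
print(national_low, state_low)
```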

Recommendations for Statistical Design of a Resampling Strategy

If BLM decides to conduct a nationwide resampling of the F/AR riparian miles, we recommend that the revisited reaches[1] be selected in a statistically random manner without replacement, rather than specifying a fixed number to be selected (even if randomly) for resampling in each state. Random selection is most defensible because spatial correlation affecting the estimates of the mean and variance ceases to be an issue. Furthermore, simple random sampling from a finite population guarantees statistical independence, and estimates of the mean and variance from the sample are unbiased when making inference to the originally sampled reaches.

The pool of candidate reaches from which the reaches for revisitation will be randomly drawn should include all reaches categorized as F/AR in 1999. It should not include all reaches regardless of their previous categorization. If BLM wishes to determine trends at a state level, then the resampling reaches should be selected randomly from the pool of all F/AR reaches in the state. Similarly, if BLM wishes to determine the change in F/AR reaches that have recently been subject to new management practices, then the revisitation reaches should be randomly selected just from the pool of all F/AR reaches that have been subject to the new management practices, rather than possibly confounding the analysis by including untreated F/AR reaches.
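The selection rules above amount to filtering the inventory down to the relevant F/AR pool and then drawing a simple random sample without replacement. The following is a minimal sketch of that logic, assuming each reach is one sampling unit and is stored as a record with the hypothetical fields `reach_id`, `state`, `rating_1999`, and `new_management`. None of these names comes from an actual BLM database, and the small inventory is invented for illustration.

```python
import random


def draw_revisit_sample(reaches, sample_size, state=None,
                        treated_only=False, seed=None):
    """Simple random sample, without replacement, of 1999 F/AR reaches."""
    # Only reaches rated F/AR in the 1999 base year are candidates.
    pool = [r for r in reaches if r["rating_1999"] == "F/AR"]
    if state is not None:          # state-level question
        pool = [r for r in pool if r["state"] == state]
    if treated_only:               # management-scenario question
        pool = [r for r in pool if r["new_management"]]
    rng = random.Random(seed)
    # random.sample never returns the same element twice.
    return rng.sample(pool, sample_size)


# Tiny invented inventory, for illustration only.
inventory = [
    {"reach_id": i,
     "state": "ID" if i % 2 else "NV",
     "rating_1999": "F/AR" if i % 3 else "PFC",
     "new_management": i % 4 == 0}
    for i in range(60)
]
sample = draw_revisit_sample(inventory, 10, seed=1)
print(sorted(r["reach_id"] for r in sample))
```

Passing `state="ID"` or `treated_only=True` narrows the pool before the draw, so that untreated or out-of-state reaches can never enter the sample.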

Whenever possible, reaches should be reassessed by the same persons who did the initial assessment. That way, any errors of judgment (misclassifications) will likely be systematic and have minimal effect on the validity of the reassessment conclusions. It will be particularly important to revisit only those reaches whose upstream and downstream boundaries are known precisely. Otherwise, BLM crews may unknowingly assess a different reach than they intend, invalidating any comparison with the original assessment. If there is any uncertainty about being able to relocate some of the reaches that were originally assessed, those reaches should be dropped and a substitute reach selected randomly from the pool.
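The substitution rule in the last sentence can be sketched as follows. The function name and the integer reach identifiers are hypothetical. The sketch assumes substitutes are drawn only from pool members that are neither already in the sample nor themselves flagged as unlocatable.

```python
import random


def replace_unlocatable(sample_ids, pool_ids, unlocatable_ids, seed=None):
    """Swap reaches that cannot be relocated for random substitutes."""
    rng = random.Random(seed)
    dropped = set(unlocatable_ids)
    already_sampled = set(sample_ids)
    kept = [rid for rid in sample_ids if rid not in dropped]
    # Eligible substitutes: in the pool, not yet sampled, and locatable.
    candidates = [rid for rid in pool_ids
                  if rid not in dropped and rid not in already_sampled]
    substitutes = rng.sample(candidates, len(sample_ids) - len(kept))
    return kept + substitutes


pool_ids = list(range(100))
sample_ids = [3, 17, 42, 58, 90]
revised = replace_unlocatable(sample_ids, pool_ids,
                              unlocatable_ids=[17, 90], seed=7)
print(revised)
```

Where boundary records can be checked in the office, screening the whole pool for locatability before the first draw would make substitutions in the field unnecessary.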

Effects of PFC Misclassification

As with any sampling program, different field crews -- regardless of experience and training -- will sometimes classify the same reach differently. Or sometimes, the recommended PFC category will be recorded incorrectly on field sheets or in a database. Collectively, all of these sources can be considered the "misclassification error."

The typical magnitude and probability of misclassification error among PFC practitioners with various degrees of training and experience have seldom been measured. Nonetheless, some attention should be given to misclassification error because of its potential effect on the validity of the sample size estimates for resampling. During this pilot assessment we conducted some computer simulations in search of generalities about the likely effect on sample size of various magnitudes and probabilities of misclassification errors. From the preliminary analysis each situation seemed unique, and we conclude only that no generalizations can yet be made. However, given more specific details about an actual application, the effects of hypothetical errors could be modeled.
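As an illustration of how hypothetical errors might be modeled, the sketch below runs a simple Monte Carlo experiment. It is not the simulation conducted for this pilot assessment. The two error rates, the function name, and the simplification of treating reaches as independent draws are all assumptions.

```python
import random


def simulate_observed_share(true_share, n_sampled, p_miss, p_false,
                            trials=2000, seed=0):
    """Mean share of sampled reaches recorded as having shifted to PFC.

    true_share : actual fraction of F/AR reaches that shifted to PFC
    p_miss     : chance a reach that shifted is recorded as not shifted
    p_false    : chance a reach that did not shift is recorded as shifted
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        recorded = 0
        for _ in range(n_sampled):
            shifted = rng.random() < true_share
            if shifted:
                if rng.random() >= p_miss:    # correctly recorded as shifted
                    recorded += 1
            elif rng.random() < p_false:      # wrongly recorded as shifted
                recorded += 1
        total += recorded / n_sampled
    return total / trials


no_error = simulate_observed_share(0.90, 140, p_miss=0.0, p_false=0.0)
with_error = simulate_observed_share(0.90, 140, p_miss=0.10, p_false=0.10)
print(round(no_error, 2), round(with_error, 2))
```

In this simplified model the recorded share is pulled toward true_share * (1 - p_miss) + (1 - true_share) * p_false, so even a modest error rate can make a goal that was actually met appear unmet. The size and direction of the bias depend on both rates, which is consistent with the finding that each situation must be examined on its own terms.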

Current Status of PFC Data

As noted above, BLM aims to complete PFC assessments of all 39,905 stream miles on BLM lands in the conterminous United States by October 1999. From Table 3, it is apparent that states differ widely in the amount of assessment they have yet to complete in 1998-99.

Table 3. Completion status of initial PFC assessments as of September 1998, as reported to E. Luse of BLM Washington Office

State / % complete, stream miles (Lotic) / % complete, standing water acres (Lentic)
Arizona / 87 / 82
California / 82 / 97
Colorado / 93 / 87
Eastern States / not reported / not reported
Idaho / 58 / 19
Montana / 99 / 9
Nevada / 63 / 25
New Mexico / 95 / 8
Oregon & Washington / 48 / 44
Utah / 73 / 73
Wyoming / 79 / 63


Current Status of PFC Data Management

Data management includes entering the PFC data into a database (electronic format), adding locational identifiers, and interfacing the database with a GIS. Table 4 summarizes the current status of data entry.

Table 4. Completion status of PFC data entry as of March 1998, as reported to E. Luse of BLM Washington Office

State / Data Entry, % complete (mean of reporting offices)
Arizona / 25
California / 7
Colorado / 59
Eastern States / not reported
Idaho / 68
Montana / 99
Nevada / 17
New Mexico / 77
Oregon & Washington / 41
Utah / 48
Wyoming / 28

With regard to locational identifiers, most BLM offices identify the PFC assessments by Township-Range-Section or Allotment numbers. A few offices (e.g., California) are using GPS units extensively to enter data directly in the field, but many offices have not yet assigned any locational identifiers to their PFC data. Where they have, the identifiers often are not precise enough to associate the PFC ratings with stream reaches on available digital maps. Such precision is a prerequisite for use with a state- or district-wide GIS, and for relocating reaches precisely during future resampling.

BLM offices are most often entering and maintaining their data using commercial software applications such as Lotus, dBase, WordPerfect, Excel, Applix, or Informix. A few BLM offices report they have developed some customized databases specifically for PFC data:

Colorado: San Juan, Uncompahgre Resource Area offices

Montana: can be accessed and queried via an Internet web page

New Mexico: statewide Riparian Database

Oregon: Prineville office

Utah: Cedar City, Kanab, Vernal offices

Several BLM offices indicated plans or a desire to link their PFC data with their GIS. Two BLM offices indicated they were holding off developing their own PFC data management system because they are hoping BLM will soon develop some data management system standards. If standards are to be developed for PFC data management and reporting, consideration should be given to integrating with data management systems already developed by the above offices.

Data Management Concerns

A potential concern is the sensitivity of some stakeholders and BLM staff to the public release of location-specific data. In some instances staff have spent considerable effort enlisting the support (and sometimes the participation) of stakeholders in the PFC assessment process. Occasionally stakeholders have responded with a growing sense of stewardship and commitment to long-term improvement of the resource, but with simultaneous concern about unintended use of the data for regulatory purposes by state and federal (non-BLM) agencies. All PFC data are public, and a determined person could associate most completed PFC assessments to within about 0.5 mile of a point on a map. Even so, consideration should be given to designing PFC data reporting systems that can easily summarize and display PFC ratings and checklist items only at a geographic scale coarse enough to conceal the identity of individual leaseholders and adjoining landowners.
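One way such coarse-scale reporting might work is sketched below. The record fields (`basin`, `rating`), the function name, and the rule of omitting units with fewer than a minimum number of reaches are illustrative assumptions, not features of any existing BLM reporting system.

```python
from collections import Counter, defaultdict


def summarize_by_unit(assessments, unit_key="basin", min_reaches=5):
    """Count PFC ratings per coarse reporting unit.

    Units with fewer than `min_reaches` assessed reaches are omitted,
    on the assumption that very small counts could reveal individual sites.
    """
    counts = defaultdict(Counter)
    for record in assessments:
        counts[record[unit_key]][record["rating"]] += 1
    return {unit: dict(tally) for unit, tally in counts.items()
            if sum(tally.values()) >= min_reaches}


# Invented records, for illustration only.
records = (
    [{"basin": "Basin A", "rating": "PFC"}] * 4
    + [{"basin": "Basin A", "rating": "F/AR"}] * 3
    + [{"basin": "Basin B", "rating": "F/AR"}] * 2
)
summary = summarize_by_unit(records)
print(summary)
```

The summary carries no reach identifiers or coordinates, and the two-reach unit is left out of the report entirely.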

Consideration also should be given to ensuring that BLM offices enter all the data from the PFC Standard Checklist, not just the summary ratings, and that reporting systems are designed to summarize such data. In particular, it will be crucial to include the data for the checklist item pertaining to whether the condition of an assessed reach is the result of factors that can or cannot be controlled by BLM. Without such information, tabular and cartographic summaries of riparian resource condition will be difficult to interpret accurately.

The Future

This pilot assessment has identified a workable and statistically valid design for nationwide resampling and reporting of riparian functional condition. As BLM offices work to complete the remaining reach assessments before October 1999 and enter their data electronically, simultaneous efforts to design and demonstrate a standard reporting system, including GIS components, should progress. Also, before resampling is initiated in any state, consideration should be given to quantifying measurement error (especially misclassification error) under a variety of conditions. Attempts should also continue to simulate the effects such nonsystematic error may have on projections for sample sizes needed to accurately assess trend in riparian condition.