Statistical Design

MEMORANDUM

136 NW 40th St.

Seattle, WA 98107

(206) 632-4635

 

DATE: April 13, 2003

TO: Ed Chadd and Hannah Merrill, Streamkeepers of Clallam County

FROM: Leska S. Fore, Statistical Design

RE: Scoring criteria for averaged metrics

 

Clallam County’s current B-IBI protocol assumes three replicate samples for each stream site. Two of B-IBI’s 10 metrics are calculated on the basis of the cumulative number of taxa across all three replicates. In approximately 5% of the replicates, fewer than 250 macroinvertebrate individuals were identified.

*** Does the fact that you used the cutoff number of 250 indicate that we can use it as well?  See the revised “Level III Macroinvertebrate Analysis” text, attached in email.

Where to draw the line is rather arbitrary; between 250 and 300 is where I have seen the greatest difference. So, yes, let’s go with 250 as a cutoff.

For these cases, the preference would be to exclude a low replicate, because taxa richness increases as a function of the number of individuals identified, and a similar effort for all three replicates is needed for an accurate assessment of the macroinvertebrate assemblage. A replicate with a low number of individuals should be excluded, particularly if there is reason to suspect, for example, that the substrate was not typical of the site.
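The exclusion rule can be sketched in a few lines of Python. This is only an illustration: the function name and the example counts are hypothetical, and treating a replicate with exactly 250 individuals as acceptable is an assumption, since the memo does not address that edge case.

```python
MIN_INDIVIDUALS = 250  # cutoff agreed on in this memo


def usable_replicates(counts, min_individuals=MIN_INDIVIDUALS):
    """Return the indices of the replicates that have enough individuals.

    counts: number of individuals identified in each replicate.
    Assumption: a replicate with exactly `min_individuals` is kept.
    """
    return [i for i, n in enumerate(counts) if n >= min_individuals]


# Hypothetical counts for one site-visit with three replicates.
kept = usable_replicates([412, 180, 305])
print(kept)  # indices of the replicates to keep
```

A site-visit for which this returns fewer than three indices is one where the averaged metrics discussed below would apply.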

If one or two of the replicates are excluded, a problem arises because the cumulative taxa richness calculated from two replicates will not necessarily be comparable to the taxa richness calculated from three replicates. To address this problem, I evaluated data from 84 site-visits for which all three replicate samples had more than 250 individuals identified. Long-lived and intolerant taxa richness were summarized as the number of cumulative taxa across all three replicates (the current protocol) and as the average of three replicates. Plotting the two different measures for each metric revealed that the average taxa richness was typically lower than the cumulative taxa richness, as expected (Figure 1). Using the regression line, I translated the scoring boundaries for cumulative taxa richness measures into the corresponding scoring boundaries for average taxa richness (Figure 2).
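To make the two summaries concrete, here is a minimal Python sketch of how cumulative and average taxa richness differ, and how a fitted line could carry a boundary from one scale to the other. The function names and taxon labels are hypothetical, and the ordinary least-squares fit of average on cumulative richness is an assumption about the direction of the regression; this is not the code used for the analysis.

```python
def cumulative_richness(replicates):
    """Number of distinct taxa across all replicates combined."""
    return len(set().union(*replicates))


def average_richness(replicates):
    """Mean number of distinct taxa per replicate."""
    return sum(len(set(r)) for r in replicates) / len(replicates)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def translate_boundary(cumulative_boundary, slope, intercept):
    """Map a scoring boundary on the cumulative scale to the average scale."""
    return intercept + slope * cumulative_boundary


# Hypothetical site-visit: each replicate is the set of taxa found in it.
site = [{"taxon_a", "taxon_b"}, {"taxon_b", "taxon_c"}, {"taxon_a"}]
print(cumulative_richness(site), average_richness(site))
```

Because taxa shared between replicates are counted once in the cumulative total but once per replicate in the average, the average can never exceed the cumulative value, which matches the pattern described for Figure 1.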

*** Would we get even less scatter in the regression line if we plotted the cumulative totals against the number of taxa in the greatest rep rather than the average of all the reps?  I’m still wondering if we wouldn’t be better off with that metric.

I am not concerned with a perfect agreement in the regression line. Statistically, the maximum or greatest value is not a recommended measure because it has some unpleasant properties in terms of variability. It is not robust or reliable. So I would go with the average.

For long-lived taxa richness, the current scoring boundaries for the 5-3-1 metric scores are at 4 and 8; for average taxa richness, the boundaries would be 2.5 and 5.5. For intolerant taxa richness, the current scoring boundaries for the 5-3-1 metric scores are at 2 and 4; for average taxa richness, the boundaries would be 1 and 3. These values can be used to calculate metric scores when fewer than three replicates are available and will not bias the final B-IBI value.
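As an illustration of how these boundaries might be applied, the Python sketch below scores one metric using the cumulative boundaries when all three replicates are available and the average boundaries otherwise. The function and dictionary names are hypothetical, and the handling of a value that falls exactly on a boundary (it is placed in the lower class here) is an assumption; the published protocol should be checked for the exact convention.

```python
# Scoring boundaries (lower, upper) taken from this memo.
CUMULATIVE_BOUNDS = {"long_lived": (4, 8), "intolerant": (2, 4)}
AVERAGE_BOUNDS = {"long_lived": (2.5, 5.5), "intolerant": (1, 3)}


def metric_score(richness, lower, upper):
    """Return the 5-3-1 score for a taxa richness value.

    Assumption: a value exactly on a boundary falls in the lower class.
    """
    if richness <= lower:
        return 1
    if richness <= upper:
        return 3
    return 5


def score_taxa_richness(metric, richness, n_replicates):
    """Score one metric for a site-visit.

    richness: cumulative taxa richness if n_replicates == 3,
              otherwise the average taxa richness of the usable replicates.
    """
    bounds = CUMULATIVE_BOUNDS if n_replicates == 3 else AVERAGE_BOUNDS
    lower, upper = bounds[metric]
    return metric_score(richness, lower, upper)


# Hypothetical site-visits: three good replicates vs. only two.
print(score_taxa_richness("long_lived", 9, 3))
print(score_taxa_richness("long_lived", 3.0, 2))
```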

*** Should we go ahead and use the average rather than the cum when analyzing ALL of our data (even site-visits that have 3 good reps)?  Would this yield more consistent and/or accurate results?

You could. I don’t think it’s going to make a difference. But to be consistent with published papers by Karr et al. it might be better to stay as close to the published protocol as possible, just for credibility. I don’t think it makes a difference in the ‘answer’, that is the B-IBI value, but it takes too long to explain why and how you’re changing the protocol for it to be practical.