Determining Boundaries between Abundance Biozones Using Minimal Equipment

The areal extent of a biological community is usually determined using statistical techniques that give reliable results only where samples contain similar and high numbers of specimens. This paper presents a simple, inexpensive method for determining the geographical limits of biological communities that is applicable where adjacent samples contain widely differing numbers of specimens. The method is a development of SHE Analysis, which discerns boundaries between adjacent abundance biozones (ABs), an AB being an area with a distinct community structure. As originally conceived, SHEbi (SHE Analysis for the identification of Biozones) commences with species' absolute abundances and works best with large samples of equal sizes. If the variance in N (the number of specimens per sample) is high, SHEbi may place AB boundaries in unexpected locations. A modification is developed here that instead uses species' proportional abundances ($p_i = n_i/N$) for each sample, where $n_i$ is the number of specimens in the ith species in the sample. For intertidal foraminifera from the Caroni Swamp, Trinidad, where N fluctuates widely between samples, the modification (SHEbip) gives ecologically more sensible results than does traditional SHEbi.


Introduction
". . . a statistical analysis or test is not endowed with metaphysical properties; it cannot create good results from bad data!" [20, page 9]
A biological community is a group of interdependent organisms that lives and interacts within a habitat, such as fishes on a coral reef, birds in a forest canopy, or foraminifera within a mangrove swamp. The development of robust quantitative methods for grouping similar samples taken from the same biological community is vital for the recognition of biological communities that are real and not mere statistical artefacts. The boundaries between adjacent biological communities are detected using variations in assemblages of species across an area. The different communities contain either different dominant species or different species altogether. The programmes used to determine these boundaries are usually useful only in limited circumstances where sample sizes are uniform and large. A primary goal of both ecology and paleoecology is to understand the patterns by which groups of species are associated and distributed on the biosphere both at present and through geologic time. This paper presents a novel technique for discerning the boundaries between biological communities that requires only Microsoft Excel, or a similar spreadsheet programme, and can be applied to data where the variance in sample size is high.

The Basis of SHEbi
SHE Analysis for Biozone Identification (hereafter abbreviated as SHEbi) is a relatively new technique that groups samples within an abundance biozone (AB) by accumulating species' abundance data one sample at a time along a transect [1], an AB being an area within which the proportional abundance of a particular species or group of species differs significantly from that in adjacent areas.
International Journal of Ecology
To demonstrate how SHEbi is conducted, this paper first defines some basic terms (see also Glossary of Symbols). It uses as an example the steps followed in this study of intertidal foraminifera in the Caroni Swamp, Trinidad, a small island developing state located in the SE Caribbean Sea.
The intertidal area in this swamp supports a population of foraminifera, the population being the set of all the foraminifera living within the study area. First, a cupful of intertidal sediment is collected from the study area. It is washed to remove clay, silt, and large fragments of organic matter to leave sand and a sample of intertidal foraminifera, the sample being a subset of the population. One specimen (a single foraminifer) is picked from the sample and identified. A second specimen is then picked and identified. Biological communities usually comprise a number of species. The second specimen could thus belong either to the same species as the first or to a new one. Consequently, the number of species S of foraminifera identified from the sample will increase as the number of specimens N increases. As further foraminifera are picked and identified, a total of N specimens is accumulated from the one sample, for which $N = \sum_{i=1}^{S} n_i$, where $n_i$ is the number of specimens in the ith species in the sample. Species richness S has been considered a measure of diversity (e.g., [2]). Unfortunately, when comparing samples of different sizes, the number of species S identified is not a helpful parameter [3,4] because within a sample S typically increases with N.
A better measure of diversity is Shannon's [5] information function H, which is based on proportional rather than absolute abundances. To obtain H for a sample, the proportional abundance $p_i$ of the ith species in the sample is first calculated from $p_i = n_i/N$. H is then defined by $H = -\sum_{i=1}^{S} p_i \ln p_i$. This term is also known as the Shannon-Wiener Index [6,7]. Values of H are typically 1.5-3.5 and rarely >4.5 [8]; only when $S > 10^5$ species (which would require an extremely large sample size N) is H > 5.0 [9].
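As a minimal sketch of this calculation, H can be computed directly from a sample's species counts. The function name `shannon_h` and the specimen counts used below are hypothetical and serve only as an illustration.

```python
import math

def shannon_h(counts):
    """Shannon information function H = -sum(p_i * ln p_i) for one sample.

    counts: specimen counts n_i, one entry per species (hypothetical input).
    """
    n_total = sum(counts)  # N, the number of specimens in the sample
    # Species with zero specimens are skipped because they contribute nothing to H.
    return -sum((n / n_total) * math.log(n / n_total) for n in counts if n > 0)

# Hypothetical sample of N = 100 specimens in S = 4 species.
h = shannon_h([50, 30, 15, 5])
print(round(h, 3))
```

Because only the proportions $p_i$ enter the sum, doubling every count leaves H unchanged.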
Once H has been obtained for the single sample, $e^H$ (the exponential of H) can be calculated. Jost [10] termed $e^H$ the "number equivalent" or "effective number of elements" of the information function. It gives the number of equally abundant species that would be needed to produce the calculated value of H [11]. Thus, when all species in a sample are equally abundant, $p_i$ is constant across all species and $e^H = S$. In practice, species vary in abundance within a population, such that some are common and some are rare in any particular sample and $e^H < S$. The extent to which a few species dominate the sample (thus decreasing $e^H$) or, conversely, the degree to which species abundances are equitably distributed within it (thus increasing $e^H$), is termed the sample's population structure [1]. The value of $e^H/S$ gives a measure of the degree to which one or a few species dominate, and is termed the equitability index E [12,13]. This index, $E = e^H/S$, ranges from 0 to 1, with lower values indicating greater dominance by a few species.
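The three quantities for a single sample can be sketched together as follows. This is an illustrative sketch only; the function name `she` and the example counts are assumptions, not part of the original method description.

```python
import math

def she(counts):
    """Return (S, H, E) for one sample of species counts (hypothetical helper)."""
    present = [n for n in counts if n > 0]   # species actually recovered
    n_total = sum(present)                   # N
    s = len(present)                         # species richness S
    h = -sum((n / n_total) * math.log(n / n_total) for n in present)
    e = math.exp(h) / s                      # equitability E = e^H / S
    return s, h, e

# Equally abundant species give e^H = S, so E is 1 (up to rounding).
print(she([25, 25, 25, 25]))
# One dominant species pulls e^H, and hence E, well below that.
print(she([97, 1, 1, 1]))
```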
Thus we have SHE: S (species), H (information function), and E (equitability index). Figure 1 gives a cartoon of the entire process outlined above.
Taking the natural logarithm of the equitability index, we get $\ln E = H - \ln S$. This shows that ln E (which will be negative because 0 < E < 1) is the residual remaining when ln S is subtracted from H. Sheldon [14] has shown that, for any one sample, E is dependent on the number of species S and becomes progressively smaller as N (and S) increase. It follows that, as ln N increases, the increase in ln S must be balanced by changes in either H, ln E, or both. Buzas and Hayek [15] outline possible behaviours of H and ln E.
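The decomposition can be followed through a short worked example. The sample used here (counts of 50, 30, 15, and 5 specimens, so N = 100 and S = 4) is hypothetical, and the numerical values are rounded to three decimal places.

```latex
\begin{align*}
E &= \frac{e^{H}}{S}
  \;\Longrightarrow\; \ln E = H - \ln S
  \;\Longrightarrow\; H = \ln S + \ln E \\[4pt]
H &= -\left(0.50\ln 0.50 + 0.30\ln 0.30 + 0.15\ln 0.15 + 0.05\ln 0.05\right) \approx 1.142 \\
\ln S &= \ln 4 \approx 1.386 \\
\ln E &= H - \ln S \approx 1.142 - 1.386 = -0.244 \\
E &= e^{-0.244} \approx 0.783
\end{align*}
```

Here H falls short of its maximum possible value ln S by 0.244, and that shortfall is exactly ln E.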
If a graph of S against N is plotted, within a single sample the relationship between increasing S and N is usually so strong that under ideal circumstances the plot is asymptotic (e.g., [16]). Thus, a plot of ln S against ln N forms a straight line [1,17]. In actuality, the world is a noisy place and some deviation from a straight line usually occurs. Also, it is frequently necessary to accumulate several hundred specimens before the values of $p_i$ become almost constant.
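The way S grows as specimens are picked one at a time can be sketched with a small simulation. This is only an illustration under stated assumptions: the function name `accumulation_curve`, the fixed random seed, and the toy list of specimens are all hypothetical, and random picking is assumed.

```python
import random

def accumulation_curve(specimens, seed=0):
    """Pick specimens in random order and record (N, S) after each pick.

    specimens: list of species labels, one entry per specimen (hypothetical input).
    """
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    order = list(specimens)
    rng.shuffle(order)          # assume specimens are picked at random
    seen = set()
    curve = []
    for n, species in enumerate(order, start=1):
        seen.add(species)
        curve.append((n, len(seen)))   # N specimens picked, S species so far
    return curve

# Toy sample: one common, one moderately common, and one rare species.
sample = ["a"] * 5 + ["b"] * 3 + ["c"]
for n, s in accumulation_curve(sample):
    print(n, s)
```

Plotting the logarithms of the two columns against each other gives the ln S versus ln N graph described above.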
Buzas [18] hypothesised that most populations have a logarithmic series population structure. Hayek and Buzas [1] demonstrated that within a population with a logarithmic series structure, H becomes constant beyond a critical but variable value of ln N (see also [19]). So if, as an increasing number of specimens N is accumulated, a graph of H versus ln N is plotted, it will not be horizontal throughout, but will slope upwards until this critical value of ln N is attained (see [17, Figure 37, Station 1]). Practically, it is found that beyond this critical value, most additional species encountered are usually singletons (i.e., represented by single specimens only). The addition of a singleton to a large sample has negligible impact on H, the singleton having a very low proportional abundance [4]. Buzas [18] suggested that the logarithmic series population structure should be used as the null model for determining population structures.
Where a sample is large, usually only an aliquot (a fraction of the total N specimens) is picked. It is nevertheless assumed that these specimens have been taken randomly from an effectively infinite population [20] so that the sample, or an aliquot of it, is statistically representative. Where the population being studied comprises a taxonomically related set of species (e.g., foraminifera) within a community that includes other organisms such as birds, gastropods, and mangrove trees, the taxonomically restricted population (in this case limited to foraminifera) is termed a taxocene [21].

SHEbi
With the above in mind, SHEbi may now be introduced. It is a statistical technique used to identify the point at which the population structure of a taxocene changes as a linear transect of sequential samples crosses a boundary between adjacent abundance biozones (ABs); that is, crosses the boundary between two areas supporting populations with differing structures (i.e., with species present at differing proportional abundances) or compositions (with new species added in significant quantities). SHEbi can be used to define either modern (ecological [15]) or ancient (both paleoecological and ecostratigraphic) AB boundaries based on changes in population structure or composition. That SHEbi is not used widely may in part arise from the tediousness of calculating successive measures using spreadsheets [15]. It may also arise, however, from confusion engendered by a failure by previous workers to distinguish statistical measures obtained from a single sample from those derived from ≥2 accumulated but discrete samples. To overcome this confusion, several symbols are introduced here. N ($= \sum_{i=1}^{S} n_i$) is used to show the number of specimens in a single sample, L to denote the number of samples in the series, and M to indicate the number of specimens in the L accumulated samples. $S_A$, $H_A$, and $E_A$ are used to distinguish (a) values of these measures as computed from accumulated samples from (b) their values S, H, and E as calculated using single samples.
In SHEbi, samples are accumulated along a line across the study area (a line transect) and $\ln N_A$, $\ln S_A$, $H_A$, and $\ln E_A$ are recalculated as each new sample is added. Buzas and Hayek ([15, Figure 1]) showed using graphs of $\ln S_A$, $H_A$, and $\ln E_A$ versus $\ln N_A$ that all three measures can change within an area with a uniform population structure (i.e., within an abundance biozone). $H_A$ will vary until a sufficient number of specimens has been accumulated to exceed the critical value of M in an area with a logarithmic series population structure. This possibility notwithstanding, $\ln S_A$, $H_A$, and $\ln E_A$ change more markedly at the point where the line transect moves between ABs having different population structures. Buzas and Hayek [15] concluded $\ln E_A$ versus $\ln N_A$ to be the most sensitive indicator of such a transition. The graph of $\ln E_A$ versus $\ln N_A$ is essentially linear within an AB but shows a marked change in slope at an AB boundary where either (a) sufficient species have joined the accumulated samples to disturb the values of $p_i$ for at least some species markedly, (b) species proportions $p_i$ within the accumulated assemblage have changed markedly without new species joining the community, or (c) both have occurred.
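The accumulation step described above can be sketched as follows. This is a minimal illustration and not the authors' spreadsheet: the function name `shebi` and the toy counts are hypothetical, samples are assumed to be listed in transect order with species in the same column order, and the first sample is assumed to contain at least one specimen. Identifying the break in slope itself is left to inspection of the resulting graph.

```python
import math

def shebi(samples):
    """Accumulate raw counts sample by sample.

    samples: list of per-sample count lists (hypothetical input layout).
    Returns one row (ln N_A, ln S_A, H_A, ln E_A) per accumulation step.
    """
    totals = [0] * len(samples[0])
    rows = []
    for counts in samples:
        # Add the new sample's absolute abundances to the running totals.
        totals = [t + n for t, n in zip(totals, counts)]
        present = [t for t in totals if t > 0]
        n_a = sum(present)      # accumulated number of specimens
        s_a = len(present)      # accumulated number of species
        h_a = -sum((t / n_a) * math.log(t / n_a) for t in present)
        ln_e_a = h_a - math.log(s_a)   # from ln E = H - ln S
        rows.append((math.log(n_a), math.log(s_a), h_a, ln_e_a))
    return rows

# Toy transect: the third sample introduces a new, abundant species.
for row in shebi([[10, 10, 0], [10, 10, 0], [0, 0, 20]]):
    print([round(v, 3) for v in row])
```

The first and last columns of the output give the points for the graph of $\ln E_A$ versus $\ln N_A$.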
SHEbi uses the successive addition of samples in a series, recalculating the information function $H_A$ and related measures (species richness $S_A$ and the equitability index $E_A$) as samples are accumulated. Where an additional sample is the same as the previous samples, there is no significant change in the value of H. This contrasts with raw species richness, which increases with the greater overall sample size and is balanced by a decrease in the equitability index [7,10,22]. Crossing an AB boundary results in sampling of a new community, with sharp jumps in S, H, and E indicating significant changes in the composition and structure of the population sampled. One challenge facing this cumulative approach is that eventually the accumulating list becomes so large that even the addition of a sample with a substantially different composition need not have a large effect on H and E [23]. This paper introduces a method termed SHEbip for use where the standard deviation of the sample size is high (>75% of the mean).
In SHEbip, the information function for the accumulated samples is $H_A = -\sum_{i=1}^{S_A} (p_i/L) \ln (p_i/L)$, where $p_i$ is the sum of the proportional abundances of the ith species across all the samples accumulated, and L is the number of samples accumulated. As successive samples are accumulated, $H_A$ is recalculated using each species' mean proportional abundances in those samples. Where N varies widely from sample to sample, this will induce differences in $H_A$ as compared with $H_A$ computed using SHEbi (which uses raw abundance data). Nevertheless, because $\ln S_A$ is the same for both methods, the relation $\ln S_A = H_A - \ln E_A$ holds true whether SHEbi or SHEbip is employed, and it follows that any differences in $H_A$ between SHEbi and SHEbip must be matched by differences in $\ln E_A$. Whereas in SHEbi an AB boundary is drawn wherever a graph of $\ln E_A$ versus $\ln N_A$ shows a break in slope, in SHEbip it is drawn where there is a break in slope on a graph of $\ln E_A$ versus $\ln N_S$. The difference is illustrated here using two model data sets (Table 1, Figure 2) that show how the calculations are made. We used Microsoft Excel for our calculations. In Data Set 1, N is constant at 375 specimens per sample and M across all four samples is 1500. The addition of abundant Species E in sample S3 marks the move from one AB to another. This is reflected by a change in slope (here an increase) in the graph of $\ln E_A$ versus $\ln N_A$, no matter whether $\ln E_A$ is calculated using SHEbi or SHEbip (Figures 2(a) and 2(b), respectively). This will not be the case, however, where there are insufficient specimens in the added sample S3. In Data Set 2, sample S3 yielded only N = 25 specimens but marked the first proportionally abundant occurrence of Species F.
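The recalculation of $H_A$ from mean proportional abundances can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' spreadsheet: the function name `shebip` and the toy counts are hypothetical, every sample is assumed to contain at least one specimen, and the first column is taken to be the logarithm of the number of samples accumulated, following the caption of Figure 7.

```python
import math

def shebip(samples):
    """Accumulate per-sample proportions instead of raw counts.

    samples: list of per-sample count lists, in transect order, with
    species in the same column order (hypothetical input layout).
    Returns one row (ln L, ln S_A, H_A, ln E_A) per accumulation step.
    """
    prop_sums = [0.0] * len(samples[0])
    rows = []
    for n_samples, counts in enumerate(samples, start=1):
        n = sum(counts)   # N for this sample
        # Convert the sample to proportions before adding it to the running sums.
        prop_sums = [ps + c / n for ps, c in zip(prop_sums, counts)]
        # Mean proportional abundance of each species recovered so far.
        means = [ps / n_samples for ps in prop_sums if ps > 0]
        s_a = len(means)
        h_a = -sum(m * math.log(m) for m in means)
        ln_e_a = h_a - math.log(s_a)   # from ln E = H - ln S
        rows.append((math.log(n_samples), math.log(s_a), h_a, ln_e_a))
    return rows

# Toy transect: the third sample is small (N = 10) but holds a new species.
for row in shebip([[100, 100, 0], [100, 100, 0], [0, 0, 10]]):
    print([round(v, 3) for v in row])
```

Because each sample contributes proportions that sum to one, the small third sample carries the same weight as the two large ones, which is the property the modification relies on.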
When examined using SHEbi (Figure 2(c)), there is only a slight step between samples S2 and S3 that may be dismissed as being too subtle to be significant (cf. [15, page 237]). The significance of this break can be tested using simultaneous confidence intervals [24], but this can be tedious where a large number of species is involved, simultaneous confidence intervals having to be calculated for every species. Re-examination with SHEbip instead reveals a marked step between S2 and S3 indicative of an AB boundary (Figure 2(d)).

Materials and Methods
Wilson et al. [25] provide a description of the study area, which lies near the mouth of the Blue River in Caroni Mangrove Swamp, Trinidad. Samples of 75 mL each were taken along three line transects (Figure 3), each sample comprising the top centimetre of sediment. Samples from transects T1 and T2 were taken at 1 m horizontal intervals, while from the less steeply shelving transect T3 they were collected at 2 m horizontal intervals. Sample altitudes relative to annual mean sea level (AMSL) were determined using levelling and GPS. Transect T1 lay ∼1 m south of transect C1A of Wilson et al. [25]. Within 48 hours of collection all samples were washed and sieved over a 1 mm mesh to remove coarse organic fragments, and a 63 µm mesh to remove mud and silt. Because this study examined total (live + dead) foraminiferal assemblages, the washed sample residues were stored in fresh water but not stained with rose Bengal.
Foraminifera were picked from the wet residues. An attempt was made to pick ∼250 specimens from all residues, but some yielded considerably fewer. Specimens were identified to species level using especially Todd and Bronnimann [26], Saunders [27,28], and Boltovskoy and Hincapié de Martínez [29]. Wilson et al. [25] gave brief taxonomic details.
The aim of this paper being to compare how SHEbi and SHEbip behave where N varies markedly between samples, and not to document how AB boundaries differed between the three line transects, all three were spliced on the basis of increasing altitude relative to AMSL only. (Other splicing methods, such as ordering samples using detrended canonical analysis, might indicate different AB boundaries.) SHEbi and SHEbip were conducted for the three spliced transects and the results compared. ABs discerned by SHEbi were distinguished using italicised uppercase letters, and those indicated by SHEbip using italicised numerals.

General Characteristics of the Fauna.
A total of 34 samples were recovered from the three transects and yielded a total of 3638 specimens of benthonic foraminifera in 33 species. The altitudes of the individual samples relative to AMSL ranged from −1.18 m to 0.34 m. For the 34 samples, N varied from 0 to 377 foraminifera (mean = 107, standard deviation [S.D.] = 120.7).
Further analyses were, therefore, restricted to those L = 23 samples (∼68% of those collected) that yielded ≥20 specimens (Table 2) on the grounds that within these samples H was not correlated with N. Within these samples the total number of specimens recovered (M) was 3547 foraminifera. The most abundant species were Ammonia sp. (31% of total recovery from these 23 samples), Arenoparrella mexicana (20%), Trochammina advena (22%), and T. inflata (10%). Ammonia sp. dominated the four samples farthest below AMSL (T1-11 through T1-8), which collectively yielded ∼30% of the total specimens recovered from the 23 samples analysed.
There was no significant difference between the mean yields of samples from transect T1, with the highest mean, and T2, with the lowest (Student's t-test; $t_{obs}$ = 1.255, $t_{crit}$ = 2.201, d.f. = 11). Thus, the observed variations in N have not arisen from amalgamating transects with differing mean population densities. For all 23 samples, the mean N was 154 and S.D. 121, the S.D. being ∼78% of the mean. N was insignificantly correlated with S (r = 0.365, P = .087) and H (r = 0.237, P = .277) but significantly correlated with E (r = −0.755, P = .001). S did not show any trend throughout these samples, but H and E were markedly lower in those four samples near the base of the merged transects that were dominated by Ammonia sp. (Figure 3). For the 23 samples, per sample N as a percentage of the total recovery (i.e., total M) varied between 1% and 11% (mean 4.4%, S.D. 3.4%; Figure 4).

SHEbi and SHEbip
Both SHEbi and SHEbip were applied to the 23 samples with ≥20 specimens. They indicated complex but markedly different patterns of abundance biozones (ABs), SHEbi suggesting that there were eight ABs and SHEbip nine (Table 3, Figures 6 and 7). The number of samples per AB as indicated by SHEbi ranged from two to five, whereas from SHEbip the number ranged from one to five. Sample T3-9, although comprising >10% of the recovery, was not differentiated as a separate AB by either SHEbi or SHEbip. Only four AB boundaries indicated by SHEbi coincided with those from SHEbip, and only two ABs were identical between the two methods (AB1 from SHEbip with ABA from SHEbi, and AB9 from SHEbip with ABH from SHEbi).

Discussion
The results from both SHEbi and SHEbip reflect complex fluctuations in the proportional abundances of species (Figure 5). Examination of the raw data shows, however, that, due to fluctuations in N, use of SHEbi induced spurious placement of AB boundaries along the merged transects T1 through T3.
The statistical validity of the changes in the proportional abundances of Ammonia sp. and T. advena between T1-8 and T1-7 was tested using simultaneous confidence intervals [24], using a value of z = 2.12 to avoid a Type II statistical error. (This test could not be applied to A. mexicana, S. lobata, and M. fusca because these were not recovered from T1-8.) The results indicate that the decrease in the proportional abundance of Ammonia sp. between T1-8 and T1-7 is statistically significant, but that the change in T. advena is not. The decrease in Ammonia sp. being coupled with the appearance of A. mexicana, S. lobata, and M. fusca, it is concluded that there is a change in population structure and composition between samples T1-8 and T1-7.
In line with the above, both SHEbi and SHEbip placed an AB boundary after the first three samples. However, whereas SHEbip placed a boundary (between ABs 2 and 3) after the first four samples and coincident with the fall in the proportional abundance of Ammonia sp., SHEbi did not, but instead placed the succeeding AB boundary between the fifth (T1-7) and sixth (T3-12) samples. With SHEbi it is not until after data from the fifth sample have been accumulated that a sufficient number of specimens of other species has amassed to overpower the high numbers and proportions of Ammonia sp. in the fourth sample. Thus, this difference in boundary placement is due to a coupling of the dominant Ammonia sp. in sample T1-8 with the relatively small N for samples T1-7 and T3-12. Only SHEbip was able to overcome the impact of the difference in sample sizes N and delineate an AB boundary at this point.
In the preceding example, per sample N decreased markedly across the AB boundary detected using SHEbip.
A second example shows that SHEbi may also miss AB boundaries across which per sample N increases. Both SHEbip and SHEbi placed a boundary between samples T3-4 and T3-7 (between ABs 5/6 and E/F, respectively). Above this boundary, SHEbi grouped the next five samples as ABF. In contrast, SHEbip grouped the succeeding two samples T3-7 (N = 67) and T3-5 (N = 38) as AB6, and then distinguished the succeeding T3-3 (N = 355) as a separate AB7. The samples in AB6 contained means of ∼25% T. advena, ∼18% Ammotium distinctum, ∼12% T. inflata, and 12% Triloculina oblonga, together with 13%-24% Ammonia sp. and 0%-24% Trochammina inflata. The assemblage in the single sample AB7, in contrast, contained ∼60% T. advena, 18% T. oblonga, and 0% each of Ammonia sp., T. inflata, and A. distinctum. Wright and Hay [30] estimated that a sample size of N = 300 is needed to ensure with 95% confidence that all species with an abundance of >1.0% have been detected. Given that N in T3-3 exceeds this, it is concluded that the disappearance of Ammonia sp., T. inflata, and A. distinctum from AB7 is a statistically significant phenomenon. Simultaneous confidence intervals showed that the difference in the proportional abundances of T. advena in ABs 6 and 7 was statistically significant. There thus occurred a distinct change in the assemblage between AB6 and AB7 that warrants the placement of the AB boundary between them, as given by SHEbip, even though this was not detected by SHEbi.
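The N = 300 threshold can be checked against a simple binomial argument. This argument is an assumption made here for illustration and is not necessarily the derivation used in [30]: if specimens are picked at random, the chance of missing a species with proportional abundance p in every one of N picks is $(1 - p)^N$. The function name `detection_probability` is hypothetical, and the calculation treats one species at a time.

```python
def detection_probability(p, n):
    """Chance that a species with proportional abundance p appears at least
    once among n randomly picked specimens (simple binomial assumption)."""
    return 1.0 - (1.0 - p) ** n

# A species at 1% abundance, for the threshold N = 300 and for T3-3 (N = 355).
for n in (300, 355):
    print(n, round(detection_probability(0.01, n), 3))
```

Under this assumption a species at 1% abundance is detected with a probability of just over 0.95 when N = 300, and with a higher probability for the 355 specimens of T3-3.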
It might be argued that SHEbip inserts an AB boundary wherever there is a large change in per sample N. One final example demonstrates that this is not the case. Both SHEbi and SHEbip place sample T3-9 (∼11% of total recovery) within an AB with the preceding sample, despite the fact that in the underlying sample (T2-5) N was only 28 (<1% of total recovery).
The above examples demonstrate that SHEbip is useful where N varies significantly between samples. It must be stressed, however, that SHEbip is not intended to replace SHEbi, but rather to allow the recognition of AB boundaries under marginal circumstances where sample quality is poor and SHEbi cannot function fully. This ability to re-examine poor quality data is surely to be welcomed (just as medical patients with rare diseases welcome any advances made in their treatment despite it being based on studies with small sample sizes). SHEbip must not, however, be used indiscriminately and seen as a correction for SHEbi to be applied under all circumstances. Whereas the values of M, $H_A$, $S_A$, and $E_A$ from SHEbi can be further analysed using SHE Analysis for Community Structure Identification (SHECSI; see [18]), those from SHEbip cannot.

Conclusions
If the number of specimens in the samples taken along a line transect varies markedly, SHEbi may place an AB boundary at an unexpected position. In these cases SHEbi may be modified by using a table of proportional abundances as a starting point, the new method being termed SHEbip. Thus, abundance biozone boundaries can now be detected with confidence in situations where specimen recovery from samples is highly variable. Although SHEbip was here applied to intertidal foraminifera, it can be applied to any community in which N (per sample) fluctuates markedly. Both SHEbi and SHEbip can be conducted using spreadsheet programmes that come ready-installed on any new computer. SHEbip is especially useful in situations where the number of specimens varies markedly from sample to sample.

Glossary of Symbols
N: the number of specimens picked from a sample
$n_i$: the number of specimens of the ith species in a sample
$p_i$: the proportional abundance of the ith species in a sample, $n_i/N$
M: the number of specimens in an accumulated series of samples
L: the number of samples in an accumulated series of samples
S: the number of species present in a single sample
$S_A$: the number of species in an accumulated series of samples
H: the value of the information function for a single sample, $H = -\sum p_i \ln p_i$
$H_A$: the value of the information function for an accumulated series of samples
E: the value of the equitability index for a single sample, $E = e^H/S$
$E_A$: the value of the equitability index for an accumulated series of samples
SHEbi: SHE Analysis for Biozone Identification conducted using a matrix of species absolute abundances
SHEbip: SHE Analysis for Biozone Identification conducted using a matrix of species proportional abundances

Figure 2: Graphs of SHEbi and SHEbip analyses of the model data sets. (a) SHEbi applied to Model Data Set 1. (b) SHEbip applied to Model Data Set 1. (c) SHEbi applied to Model Data Set 2. (d) SHEbip applied to Model Data Set 2.

Figure 3: The location of the study area and sampled transects. (a) Trinidad, showing the location of the Caroni Swamp (asterisk). (b) The area around the mouth of the Blue River, showing the location of the three transects.

Figure 4: Per sample recovery of foraminifera as a percentage of total recovery across all samples in which N > 20.

Figure 5: The distribution of the most abundant species across the three transects, showing the locations of the abundance biozone boundaries as revealed by SHEbi and SHEbip.

Figure 7: Plots of $\ln E_A$ versus ln L from SHEbip with serial deletions of abundance biozones during the accumulation procedure. Dashed lines indicate the positions at which abundance biozone (AB) boundaries are identified. (a) Grouping of the first 4 samples as AB1. (b) Following deletion of the first four samples, grouping of the next two as AB2. Figures 7(c) through 7(i) continue this sequential procedure until all samples are accounted for.
When L = 1, S, H, and E do not differ for that sample whether SHEbi or SHEbip is used. When a table of proportional abundances is used for SHEbip, however, L becomes 1, 2, 3, . . ., x, where x is the total number of samples accumulated.

Table 1: Two model data sets illustrating SHEbi versus SHEbip. (a) Data set in which N = 350 for all samples. (b) Data set in which N = 550 for all samples except S3, in which N = 25.

Table 2: Intertidal benthonic foraminiferal recovery for samples from the Caroni Swamp in which N > 20, samples arranged in order of increasing altitude relative to annual mean sea level. The table also shows per sample values of S, H, and E.

Table 3: Comparison of the placing of abundance biozone boundaries when the data from Table 2 are analysed using SHEbi versus SHEbip.