The Spotted Handfish 1999-2001 Recovery Plan

Caution: archived content

This content may have been superseded, or served a particular purpose at a particular time. It may contain references to activities or policies that have no current application. Many archived documents may link to web pages that have moved or no longer exist, or may refer to other documents that are no longer available.

BD Bruce and MA Green
Spotted Handfish Recovery Team, March 1998
ISBN 0 643 061657

Appendix A

Power of tests for changes in density of handfish

For Mark Green and Barry Bruce, CSIRO Division of Marine Research, Hobart

By Kathy Haskard, CSIRO Mathematical and Information Sciences

Background

You have performed a survey in a small specified area near Hobart in order to estimate the density and abundance of handfish (B. hirsutus). You will repeat this survey, possibly annually, or more often if this seems desirable, in order to examine whether the density changes. With the information so far obtained, you wish me to estimate power for detecting changes in density.

Survey Design

Following previous discussions, your survey design identifies an approximately rectangular region, with one long side at approximately the 5 m depth contour, and extending out at right angles to near or beyond the 10 m contour. Twenty-five equally spaced parallel transect lines were described, running across the short dimension of the rectangle, so running from depth 5 m to 10 m. For an individual survey 12 of these were selected at random, and a strip 2 m wide, centred on each selected transect, was examined by two divers swimming together, each carrying a 1 m pole to delimit their search area. Handfish seen greater than 1 m either side of the transect line were not included in the estimates.

Rather than covering a true rectangle, each transect was defined to run from the 5 m depth contour, or from deeper than 5 m where rocky reef (believed to be unsuitable habitat for handfish) extends below that depth, out to the 10 m depth contour. The area covered by the survey is therefore not a true rectangle, and the transects have unequal lengths.

For the purpose of calculating an index of abundance, it would be best to examine the same transects at each repeated survey. However for simply obtaining the best overall estimate of abundance (for example if the density did not change over time) it is desirable to randomly select the transects separately each survey, so that a bigger part of the region is ultimately examined. The procedure used here is a compromise between these two. There will be partial overlap between surveys.

Estimated density and standard error

From a previous consultation, I provided you with the following formulae. Because the twelve individual transects are of different lengths, I recommended a weighted average, with the transect lengths as weights.

Suppose there are k transects, transect i has length li metres, and ni handfish are observed on transect i, i = 1, ..., k. You wish to estimate D = average density of handfish in the survey area. Let L be the total of the lengths of the k (= 12) transects. Each transect i covers area 2li square metres, and the total area sampled is 2L. Writing di = ni/(2li) for the density observed on transect i, the estimate of D is the weighted average

D̂ = Σ wi di = N/(2L),

where N = total number of handfish seen, N = Σ ni, and wi = li/L.

The estimated variance of D̂ is the weighted form

var(D̂) = [k/(k − 1)] Σ wi² (di − D̂)²,

summing over the k transects. The standard error (SE) of D̂ is simply the square root of the variance.

This will give estimated density in units of numbers per square metre, which will be very small numbers to work with. You might prefer to use, for example, numbers per 100 m². To do this, simply multiply the estimated density and its SE by 100, and multiply the variance by 10,000.
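The weighted-average calculation can be sketched in a few lines of Python. The transect lengths and counts below are hypothetical, purely for illustration; the variance line follows the weighted (ratio-estimator) form commonly used with unequal transect lengths, so the exact finite-sample factor is an assumption here rather than a quotation of the original formulae:

```python
import math

def density_estimate(lengths, counts):
    """Weighted-average density D-hat = sum(w_i * d_i) = N/(2L) for strip
    transects 2 m wide, with weights w_i = l_i / L, plus variance and SE."""
    k = len(lengths)
    L = sum(lengths)                                      # total transect length (m)
    N = sum(counts)                                       # total handfish seen
    d = [n / (2.0 * l) for n, l in zip(counts, lengths)]  # per-transect densities
    w = [l / L for l in lengths]                          # weights l_i / L
    D = sum(wi * di for wi, di in zip(w, d))              # equals N / (2L)
    # weighted (ratio-estimator) variance with a k/(k-1) finite-sample factor
    var = (k / (k - 1.0)) * sum(wi ** 2 * (di - D) ** 2 for wi, di in zip(w, d))
    return D, var, math.sqrt(var)

# hypothetical example: three transects of 100, 120 and 80 m
D, var, se = density_estimate([100.0, 120.0, 80.0], [2, 1, 0])
# D = N/(2L) = 3/600 = 0.005 handfish per square metre
```

Multiplying D and se by 100 (and var by 10,000) converts to numbers per 100 m², as described above.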

Your question

Having performed the survey once and done the above calculations, you have asked me to estimate the power of tests to determine if handfish abundance has significantly changed from one time to another. As I explained at our meeting, for a particular statistical test, power depends on several inter-related things:

  • the significance level at which the test will be declared significant, commonly 0.05
  • the variability within the data, e.g. between transects
  • sample sizes, in this case number of transects or total area of transects
  • the size of the true difference
  • the probability with which a test will be significant given a difference of that size - this is the power of the test. Power of around 80% or more is generally regarded as useful.

Less variability, greater sample sizes, and larger true differences all increase the power of a test, as does a larger significance level α, e.g. α = 0.10. We will consider only α = 0.05. The more variable data are, the greater the sample size will need to be, or the larger the underlying difference would have to be, to achieve a given power. This is because with more variability in the data, we need more evidence, or a bigger effect, to have reasonable confidence that an observed difference is not simply due to chance.

What test will we use?

Before we can estimate power, we must decide what statistical test will be used. If data can be assumed to approximately follow a normal distribution, this is usually straightforward. However, the data from your first survey using this design reveal very low numbers of handfish, between 0 and 3 per transect, with a total of 15 handfish from the 12 transects of total length 1506 m, covering area 3012 m², giving estimated density of 0.00498 handfish m⁻² with variance 1.412×10⁻⁶, and SE 0.001188 handfish m⁻².

Converting to more convenient, though unconventional, units, this is 49.8 handfish per hectare (10,000 m²), with variance 141.2 and SE 11.88.

The study area to which this applies has size 66,270 m², giving an estimated abundance of 330 handfish, with estimated standard error 70.

With such small numbers, we are counting occurrences of a rare event, and the Poisson distribution may be appropriate here. This assumes that the fish are essentially independently located, with no clustering nor the opposite which would lead to more uniform spreading out of the individuals. This appears to be a reasonable assumption in this case, and it enables us to form a test.

We assume that the number observed on a given transect has a Poisson distribution with mean equal to the overall density times the area of that transect, i.e. 2liD. Assuming each transect is independent, this would imply the total number of handfish N would have a Poisson distribution with mean 2LD. A property of the Poisson distribution is that the variance is equal to the mean, i.e. var(N) should equal 2LD, and the variance of the density estimate D̂ = N/(2L) would be var(N)/(2L)² = D/(2L). Using our estimate D̂ = 0.00498 we obtain D̂/(2L) = 1.653×10⁻⁶ as the estimated variance of D̂, compared with 1.412×10⁻⁶ calculated directly from the weighted average, without using the Poisson assumption. These are very consistent, and support the Poisson assumption.
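This consistency check is easy to reproduce: under the Poisson assumption var(D̂) = D/(2L), estimated by D̂/(2L). A minimal sketch with the survey-1 figures quoted above:

```python
# survey 1 figures from the text: N = 15 handfish over total transect length 1506 m
N, L = 15, 1506.0
area = 2 * L                    # strips are 2 m wide, so area sampled = 3012 m^2
D_hat = N / area                # about 0.00498 handfish per m^2
var_poisson = D_hat / area      # Poisson-based variance D/(2L), about 1.653e-6
var_direct = 1.412e-6           # variance calculated directly from the weighted average
# the two variance estimates differ by only about 15%, supporting the Poisson model
```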

Thus, I proceeded on the assumption that the total number N of handfish seen in traversing transects of lengths totalling L follows a Poisson distribution with mean 2LD. We now must devise a procedure to test whether two surveys suggest a difference in density D between them. Two different approaches lead to the same test.

Derivations of test

One approach is to simply estimate the expected number of handfish seen in each of two surveys, given the total of the two surveys, proportionately according to the total transect lengths in each, and compare these to the actual numbers in each survey, using the well-known chi-square statistic

X² = Σ (observed − expected)² / expected,

where the sum is over two cells corresponding to the two surveys. Provided the two expected values are not too small (greater than 5 is sufficient) X² has approximately a χ² distribution on 1 df.

Given total numbers N1 and N2 seen in the two surveys respectively, and total transect lengths L1 and L2 in the two surveys respectively, and writing N = N1 + N2 and L = L1 + L2, under the null hypothesis that both densities are the same the estimated common density is D̂ = N/(2L). Then, given the total N, the expected number seen for survey 1 is NL1/L, and for survey 2 the expected number is NL2/L. (Observed − expected)² is the same for both cells,

(N1 − NL1/L)² = (N2 − NL2/L)² = [(N1L2 − N2L1)/L]²,

and

X² = (N1L2 − N2L1)² / (N L1 L2),

and in the special case when L1 = L2, this simplifies to

X² = (N1 − N2)² / N.

For example, suppose the next survey found a total of 10 handfish in transects totalling 2000 m in length, giving an estimated density of 10/(2*2000) = 0.0025 handfish m⁻², half that of survey 1. Would we conclude the density had changed from the previous survey with estimated density 15/(2*1506) = 0.00498? Given a total of 15+10 = 25 handfish, and total transect lengths 1506 and 2000 m, totalling 3506 m, if the densities were the same we would have expected to observe 25*1506/3506 = 10.74 handfish in the first survey and 25*2000/3506 = 14.26 handfish in the second survey. Compared with the observed values of 15 and 10 respectively, we obtain X² = 4.26²/10.74 + 4.26²/14.26 = 2.964 and comparing this to a chi-square distribution with 1 df, we obtain p = 0.085, not significant at the 5% level.
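The worked example can be checked with the direct formula X² = (N1L2 − N2L1)²/(N L1 L2); the p-value for a χ² variable on 1 df is available in closed form through the complementary error function:

```python
import math

def chisq_two_surveys(N1, L1, N2, L2):
    """X^2 = (N1*L2 - N2*L1)^2 / (N * L1 * L2), compared to chi-square on 1 df."""
    N = N1 + N2
    X2 = (N1 * L2 - N2 * L1) ** 2 / (N * L1 * L2)
    # survival function of chi-square(1 df): Pr(X^2 > x) = erfc(sqrt(x/2))
    p = math.erfc(math.sqrt(X2 / 2.0))
    return X2, p

# worked example from the text: 15 fish over 1506 m, then 10 fish over 2000 m
X2, p = chisq_two_surveys(15, 1506.0, 10, 2000.0)
# X2 is about 2.96, p about 0.085: not significant at the 5% level
```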

The alternative approach uses the fact that a Poisson distribution with sufficiently large mean can be approximated by a normal distribution with mean and variance equal to the Poisson mean. So writing N1 and N2 for the total numbers seen in surveys 1 and 2 respectively, and L1 and L2 for the total length of transects in the two surveys respectively, we have

N1 ~ approximately N(2L1D1, 2L1D1),

N2 ~ approximately N(2L2D2, 2L2D2)

and hence

D̂1 = N1/(2L1) ~ approximately N(D1, D1/(2L1)),

D̂2 = N2/(2L2) ~ approximately N(D2, D2/(2L2)),

and

D̂1 − D̂2 ~ approximately N(D1 − D2, D1/(2L1) + D2/(2L2)),

which provides an approximate test based on the normal distribution, namely that under the null hypothesis H0: D1 = D2 = D,

D̂1 − D̂2 ~ approximately N(0, D[1/(2L1) + 1/(2L2)]),

Z = (D̂1 − D̂2) / √(D[1/(2L1) + 1/(2L2)]) ~ approximately N(0, 1).

This would be our ideal test statistic. However, we do not know the true D, and we substitute our best estimate D̂ = N/(2L), giving our test statistic

T = (D̂1 − D̂2) / √(D̂[1/(2L1) + 1/(2L2)]) ~ approximately N(0, 1) (a poorer approximation than for Z).

Now the square of a standard unit normal variable has a chi-square distribution, and the square of T is

T² = (D̂1 − D̂2)² / (D̂[1/(2L1) + 1/(2L2)]),

which can be shown to be algebraically equal to X² in the first approach above (my notes, 28/10/97, workbook p. 55), with the approximate χ² distribution with 1 d.f. Both these expressions can be written more directly for calculation, based on numbers of handfish seen and transect lengths, as

X² = T² = (N1L2 − N2L1)² / (N L1 L2).

Note that if L1 = L2, this simplifies even further to

X² = (N1 − N2)² / N

and the corresponding T-statistic becomes

T = (N1 − N2) / √(N1 + N2).

The test

To test H0: D1 = D2 against the alternative hypothesis H1: D1 ≠ D2, calculate

X² = (N1L2 − N2L1)² / (N L1 L2)

and compare with a χ² distribution with 1 d.f., i.e. declare significant at the 5% level if X² is greater than 3.84. This gives a two-sided test, i.e. you are looking for differences and don't care whether they are increases or decreases in density.

For a one-sided test, e.g. if you don't believe density would ever increase (which could be a dangerous assumption), and are only concerned with detecting a decrease, test H0: D1 ≤ D2 against the alternative hypothesis H1: D1 > D2: calculate

T = (D̂1 − D̂2) / √(D̂[1/(2L1) + 1/(2L2)])

and compare to a standard normal distribution, rejecting H0 only if T is large and positive. That is, reject H0 in favour of H1 (conclude that the density has decreased from survey 1 to survey 2) at the 5% level if T is greater than 1.645. This will be a stronger test if you are happy to neglect the possibility of increases in density.

If you reject when T is bigger than 1.96 or smaller than -1.96, then this is equivalent to the two-sided test above (because X2 = T2 and 3.84 = 1.962).
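A minimal sketch of both procedures, using only the survey totals and transect lengths; T is recovered here as the signed square root of X², which agrees with the T-statistic because T² = X²:

```python
import math

def two_sided_test(N1, L1, N2, L2, crit=3.84):
    """Two-sided 5% test of H0: D1 = D2 via X^2 against chi-square(1 df)."""
    N = N1 + N2
    X2 = (N1 * L2 - N2 * L1) ** 2 / (N * L1 * L2)
    return X2, X2 > crit

def one_sided_test(N1, L1, N2, L2, crit=1.645):
    """One-sided 5% test of H0: D1 <= D2 vs H1: D1 > D2 (density decreased).
    T is the signed square root of X^2, positive when D1-hat > D2-hat."""
    X2, _ = two_sided_test(N1, L1, N2, L2)
    T = math.copysign(math.sqrt(X2), N1 / (2 * L1) - N2 / (2 * L2))
    return T, T > crit

# example from the text: surveys of 15 fish / 1506 m and 10 fish / 2000 m
X2, sig2 = two_sided_test(15, 1506.0, 10, 2000.0)   # X2 about 2.96, not significant
T, sig1 = one_sided_test(15, 1506.0, 10, 2000.0)    # T about 1.72, significant one-sided
```

Note how the same data fail the two-sided test (2.96 < 3.84) but reject in the one-sided test (1.72 > 1.645), illustrating the extra power bought by ruling out increases in advance.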

Calculating power

We ask: for a given change in density, and using the information we have from the first survey which gives us some information about variability of these data, what is the probability of rejecting the null hypothesis that the densities are equal at each time?

Some further assumptions must be made. In particular, we don't know in advance what the total length of transects in the next survey will be. We would expect it to be similar to L1, and calculate the power on this basis. We must also decide how we will specify a change in density. It is convenient to express D2 as a fraction or multiplier of D1, i.e. suppose D2 = cD1, where c ≥ 0, with c = 1 for no change, c < 1 for a decrease, and c > 1 for an increase.

The question is: if D2 = cD1, for a specified value of c, what is the probability that X² will be greater than 3.84?

It is convenient to work through the theory again, beginning with the assumed Poisson distributions for N1 and N2 and using L1 = L2. We assume that N1 ~ Poisson with mean m1 and N2 ~ Poisson with mean m2. Since L1 = L2, m1 = 2L1D1 and m2 = 2L1D2 = 2L1cD1 = cm1. Then we say

N1 ~ approximately N(m1, m1) and N2 ~ approximately N(cm1, cm1) and hence

N1 − N2 ~ approximately N(m1(1 − c), m1(1 + c)), and

Z = (N1 − N2) / √(m1(1 + c)) ~ approximately normally distributed with mean m1(1 − c)/√(m1(1 + c)) and variance 1.

Z is not exactly the test statistic used; rather we use T, which has N1 + N2 in the denominator as an approximation to m1(1 + c). However, it is difficult or impossible to obtain the distribution of T, so for calculating power we use this 'idealised' test statistic Z. Further, in the expression for the mean of Z, we see m1, which we do not know, so we calculate power using N1 as a substitute for m1 in the mean, another approximation.

This enables us to calculate the power for any specified value of c.

Probability of rejecting H0

= probability that T > 1.96 or T < −1.96

≈ probability that Z > 1.96 or Z < −1.96

= probability that (N1 − N2)/√(m1(1 + c)) > 1.96 or (N1 − N2)/√(m1(1 + c)) < −1.96

= probability that a N(0,1) variable is > 1.96 − m1(1 − c)/√(m1(1 + c)) or < −1.96 − m1(1 − c)/√(m1(1 + c))

≈ probability that a N(0,1) variable is > 1.96 − N1(1 − c)/√(N1(1 + c)) or < −1.96 − N1(1 − c)/√(N1(1 + c))

Power results

We can calculate these probabilities for a range of values of c. I have done this in a Microsoft Excel spreadsheet which I can provide to you. Some results are presented in Table 1. The formulas are shown in Table 2. All tests are assumed to be at the 5% significance level.

Recall that c = 1 means no change in true density between survey 1 and survey 2; c = 0.5 means density in survey 2 was half that of survey 1; and c = 1.333 means the density increased by one third. When c = 1 the power is 0.05, as we would expect with a test at the 5% significance level - if there is truly no difference, there is a 5% chance of declaring a significant difference.

Table 1. Power of two-sample 5%-level test of equal densities vs densities not equal, and for one-sided 5% tests. We assume that D2 = cD1, and m1 is the expected number of handfish seen in survey 1. For this table m1 is set to the estimate 15, the number observed in the first survey.

For the two-sided test, the upper tail is the power contribution Pr(T > 1.96) and the lower tail is Pr(T < −1.96). Total power is the sum of these two.

Two columns show power for the one-sided tests. The first, Pr(T > 1.645), applies when density decreases between surveys 1 and 2 (0 ≤ c < 1), and the second, Pr(T < −1.645), is for one-sided tests when the density increases between surveys 1 and 2 (c > 1).

In all these expressions, T is treated as approximately a standard unit normal variable, and all tests are assumed to be at the 5% significance level. Note that when there is no difference in density (c = 1), power equals the significance level, 0.05.

m1 = 15

          Two-sided test                          One-sided tests
  c     Upper tail  Lower tail  Total power   vs D2 < D1   vs D2 > D1
  0        0.972       0.000       0.972         0.987        0.000
  0.1      0.887       0.000       0.887         0.936        0.000
  0.2      0.733       0.000       0.733         0.826        0.000
  0.3      0.550       0.000       0.550         0.670        0.000
  0.4      0.382       0.000       0.382         0.506        0.000
  0.5      0.252       0.001       0.252         0.362        0.002
  0.6      0.161       0.002       0.162         0.249        0.004
  0.7      0.101       0.004       0.105         0.168        0.010
  0.8      0.063       0.008       0.071         0.112        0.019
  0.9      0.040       0.015       0.055         0.075        0.032
  1        0.025       0.025       0.050         0.050        0.050
  1.1      0.016       0.038       0.054         0.034        0.072
  1.2      0.010       0.054       0.064         0.023        0.098
  1.3      0.007       0.073       0.080         0.016        0.127
  1.4      0.005       0.094       0.099         0.011        0.159
  1.5      0.003       0.118       0.121         0.008        0.192
  1.6      0.002       0.143       0.145         0.006        0.226
  1.7      0.002       0.170       0.171         0.004        0.261
  1.8      0.001       0.197       0.198         0.003        0.295
  1.9      0.001       0.224       0.225         0.002        0.329
  2        0.001       0.252       0.252         0.002        0.362
  2.1      0.000       0.279       0.279         0.001        0.393
  2.2      0.000       0.306       0.306         0.001        0.424
  2.3      0.000       0.332       0.332         0.001        0.453
  2.4      0.000       0.357       0.358         0.001        0.480
  2.5      0.000       0.382       0.382         0.000        0.506
  2.6      0.000       0.406       0.406         0.000        0.530
  2.7      0.000       0.428       0.428         0.000        0.553
  2.8      0.000       0.450       0.450         0.000        0.575
  2.9      0.000       0.471       0.471         0.000        0.596
  3        0.000       0.491       0.491         0.000        0.615
  3.1      0.000       0.509       0.510         0.000        0.633
  3.2      0.000       0.527       0.527         0.000        0.649
  3.3      0.000       0.544       0.544         0.000        0.665
  3.4      0.000       0.561       0.561         0.000        0.680
  3.5      0.000       0.576       0.576         0.000        0.694
  3.6      0.000       0.591       0.591         0.000        0.707
  3.7      0.000       0.604       0.604         0.000        0.719
  3.8      0.000       0.618       0.618         0.000        0.730
  3.9      0.000       0.630       0.630         0.000        0.741
  4        0.000       0.642       0.642         0.000        0.751
  5        0.000       0.733       0.733         0.000        0.826
  6        0.000       0.790       0.790         0.000        0.869
  7        0.000       0.828       0.828         0.000        0.896

Table 2. Microsoft Excel formulas used for Table 1.

The spreadsheet holds m1 in cell B1 and the values of c in column B (0.5 in B3, 2.0 in B4); the formulas below are those of row 3, and row 4 repeats them with B4 in place of B3.

Cell B1 (m1): 15
Cells B3, B4 (c): 0.5, 2.0

Two-sided test
Upper tail (col C):   =1-NORMSDIST(1.96-(1-B3)*SQRT(B$1)/(1+B3))
Lower tail (col D):   =NORMSDIST(-1.96-(1-B3)*SQRT(B$1)/(1+B3))
Total power (col E):  =SUM(C3:D3)

One-sided tests
Testing D2 = D1 vs D2 < D1 (col F):  =1-NORMSDIST(1.645-(1-B3)*SQRT(B$1)/(1+B3))
Testing D2 = D1 vs D2 > D1 (col G):  =NORMSDIST(-1.645-(1-B3)*SQRT(B$1)/(1+B3))
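The Table 2 spreadsheet formulas translate directly to Python (NORMSDIST is the standard normal CDF, written here with math.erf); this sketch uses the same shift term (1−c)√m1/(1+c) as the spreadsheet and reproduces the m1 = 15, c = 0.5 row of Table 1:

```python
import math

def normsdist(x):
    """Standard normal CDF, the equivalent of Excel's NORMSDIST."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def two_sided_power(m1, c):
    """Total power of the two-sided 5% test, mirroring the Table 2 formulas."""
    shift = (1.0 - c) * math.sqrt(m1) / (1.0 + c)
    upper = 1.0 - normsdist(1.96 - shift)   # probability of rejecting in the upper tail
    lower = normsdist(-1.96 - shift)        # probability of rejecting in the lower tail
    return upper + lower

def one_sided_power(m1, c):
    """Power of the one-sided 5% test against a decrease (D2 < D1)."""
    shift = (1.0 - c) * math.sqrt(m1) / (1.0 + c)
    return 1.0 - normsdist(1.645 - shift)

p2 = two_sided_power(15, 0.5)   # about 0.252, as in Table 1
p1 = one_sided_power(15, 0.5)   # about 0.362, as in Table 1
```

Looping either function over a range of c values and of m1 values regenerates Tables 1 and 3.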

Table 1 also shows power for one-sided tests. The first column is for a test of H0: D2 ≥ D1 vs H1: D2 < D1 (sensible if you only expect density to decrease, i.e. only expect values of c ≤ 1) and the second is for tests of H0: D2 ≤ D1 vs H1: D2 > D1 (sensible if you only expect density to increase, i.e. only expect values of c ≥ 1).

Discussion

You can see that the power is quite low unless there are massive changes in density. This is partly because so few handfish were observed. With such small numbers, seeing just a few individuals less or more could happen quite easily, but could produce quite different density estimates. In other words, the density estimates are not very precisely estimated with such small numbers of handfish seen. If you don't know densities very precisely, you can't be very confident that two densities are different; thus power is low.

It is commonly accepted that powers of 80% or more are desirable. To reach this power with the current level of sampling (area covered by transects), your density would have to decrease to less than one sixth of the density at survey 1, or increase more than 6-fold.

To increase power you would need to increase the area you sample. Table 3 shows some other examples of power if m1 was 30 (obtained by approximately doubling the area covered by transects), 45, 60, 75, 90 or 105, rather than 15. Clearly very much more effort is required to substantially improve the power, i.e. so that moderate changes in density have a good chance of being detected. Even to have a good chance of detecting a 50% change (c=0.5 or c=2) in density from its current estimated value, you would require about 5 times the sampling area, to get m1, the expected number of individuals seen, = 75.
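Inverting the same approximate power expression gives the expected count m1 needed for a target power directly, rather than reading it off the tables. This sketch uses the Table 2 shift term and ignores the negligible wrong-tail contribution, so treat its output as a rough planning figure only:

```python
from statistics import NormalDist

def required_m1(c, target_power=0.80, z_alpha=1.96):
    """Approximate expected survey-1 count m1 needed to detect a change to
    density c*D1 with the target power (two-sided 5% test for z_alpha = 1.96,
    one-sided 5% for z_alpha = 1.645). Solves
        |1-c| * sqrt(m1) / (1+c) = z_alpha + z_beta
    for m1, with z_beta the normal quantile for the target power."""
    z_beta = NormalDist().inv_cdf(target_power)
    return ((z_alpha + z_beta) * (1.0 + c) / abs(1.0 - c)) ** 2

# halving of density (c = 0.5): m1 of about 71 is needed for 80% two-sided power,
# i.e. roughly 5 times the observed count of 15, consistent with the text
m1_needed = required_m1(0.5)
```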

Bear in mind that these power calculations are only approximate. Although you observed N1 = 15 in the first survey, the true density and hence m1 may have been larger or smaller than indicated by this. If larger, the power will be a little greater, if smaller the power will be even less.

Table 3. The same information as Table 1, but with larger values of m1.

m1 = 30

          Two-sided test                          One-sided tests
  c     Upper tail  Lower tail  Total power   vs D2 < D1   vs D2 > D1
  0        1.000       0.000       1.000         1.000        0.000
  0.1      0.994       0.000       0.994         0.998        0.000
  0.2      0.955       0.000       0.955         0.978        0.000
  0.3      0.839       0.000       0.839         0.904        0.000
  0.4      0.651       0.000       0.651         0.759        0.000
  0.5      0.447       0.000       0.447         0.572        0.000
  0.6      0.277       0.000       0.278         0.391        0.001
  0.7      0.160       0.002       0.162         0.249        0.005
  0.8      0.088       0.005       0.093         0.150        0.012
  0.9      0.047       0.012       0.060         0.087        0.027
  1        0.025       0.025       0.050         0.050        0.050
  1.2      0.007       0.072       0.079         0.016        0.126
  1.4      0.002       0.148       0.150         0.005        0.232
  1.6      0.001       0.243       0.244         0.002        0.352
  1.8      0.000       0.346       0.347         0.001        0.468
  2        0.000       0.447       0.447         0.000        0.572
  2.2      0.000       0.537       0.537         0.000        0.659
  2.4      0.000       0.616       0.616         0.000        0.729
  2.6      0.000       0.682       0.682         0.000        0.785
  2.8      0.000       0.737       0.737         0.000        0.829
  3        0.000       0.782       0.782         0.000        0.863
  3.5      0.000       0.861       0.861         0.000        0.919
  4        0.000       0.908       0.908         0.000        0.950

m1 = 45

          Two-sided test                          One-sided tests
  c     Upper tail  Lower tail  Total power   vs D2 < D1   vs D2 > D1
  0        1.000       0.000       1.000         1.000        0.000
  0.1      1.000       0.000       1.000         1.000        0.000
  0.2      0.994       0.000       0.994         0.998        0.000
  0.3      0.951       0.000       0.951         0.975        0.000
  0.4      0.820       0.000       0.820         0.891        0.000
  0.5      0.609       0.000       0.609         0.723        0.000
  0.6      0.389       0.000       0.389         0.513        0.000
  0.7      0.219       0.001       0.220         0.322        0.002
  0.8      0.112       0.003       0.116         0.184        0.008
  0.9      0.054       0.010       0.064         0.098        0.023
  1        0.025       0.025       0.050         0.050        0.050
  1.2      0.005       0.088       0.094         0.012        0.150
  1.4      0.001       0.200       0.201         0.003        0.299
  1.6      0.000       0.340       0.340         0.001        0.461
  1.8      0.000       0.483       0.483         0.000        0.607
  2        0.000       0.609       0.609         0.000        0.723
  2.2      0.000       0.711       0.711         0.000        0.808
  2.4      0.000       0.789       0.789         0.000        0.868
  2.6      0.000       0.846       0.846         0.000        0.909
  2.8      0.000       0.888       0.888         0.000        0.937
  3        0.000       0.918       0.918         0.000        0.956
  3.5      0.000       0.961       0.961         0.000        0.981
  4        0.000       0.981       0.981         0.000        0.991

Table 3 continued. Power for two- and one-sided tests at 5% significance level, with different expected number m1 of handfish seen in survey 1.

 

m1 = 60

  c     Two-sided total   One-sided vs D2 < D1   One-sided vs D2 > D1
  0          1.000              1.000                  0.000
  0.1        1.000              1.000                  0.000
  0.2        0.999              1.000                  0.000
  0.3        0.986              0.994                  0.000
  0.4        0.913              0.953                  0.000
  0.5        0.733              0.826                  0.000
  0.6        0.491              0.615                  0.000
  0.7        0.277              0.390                  0.001
  0.8        0.138              0.216                  0.006
  0.9        0.069              0.108                  0.020
  1          0.050              0.050                  0.050
  1.2        0.108              0.009                  0.173
  1.4        0.252              0.002                  0.362
  1.6        0.432              0.000                  0.557
  1.8        0.600              0.000                  0.715
  2          0.733              0.000                  0.826

m1 = 75

  c     Two-sided total   One-sided vs D2 < D1   One-sided vs D2 > D1
  0          1.000              1.000                  0.000
  0.1        1.000              1.000                  0.000
  0.2        1.000              1.000                  0.000
  0.3        0.997              0.999                  0.000
  0.4        0.960              0.981                  0.000
  0.5        0.823              0.893                  0.000
  0.6        0.581              0.698                  0.000
  0.7        0.333              0.454                  0.001
  0.8        0.161              0.247                  0.005
  0.9        0.074              0.117                  0.018
  1          0.050              0.050                  0.050
  1.2        0.123              0.008                  0.196
  1.4        0.303              0.001                  0.420
  1.6        0.515              0.000                  0.638
  1.8        0.697              0.000                  0.797
  2          0.823              0.000                  0.893

m1 = 90

  c     Two-sided total   One-sided vs D2 < D1   One-sided vs D2 > D1
  0          1.000              1.000                  0.000
  0.1        1.000              1.000                  0.000
  0.2        1.000              1.000                  0.000
  0.3        0.999              1.000                  0.000
  0.4        0.982              0.992                  0.000
  0.5        0.885              0.935                  0.000
  0.6        0.660              0.766                  0.000
  0.7        0.388              0.512                  0.000
  0.8        0.184              0.277                  0.003
  0.9        0.079              0.126                  0.016
  1          0.050              0.050                  0.050
  1.2        0.139              0.006                  0.217
  1.4        0.353              0.001                  0.475
  1.6        0.591              0.000                  0.707
  1.8        0.774              0.000                  0.857
  2          0.885              0.000                  0.935

m1 = 105

  c     Two-sided total   One-sided vs D2 < D1   One-sided vs D2 > D1
  0          1.000              1.000                  0.000
  0.1        1.000              1.000                  0.000
  0.2        1.000              1.000                  0.000
  0.3        1.000              1.000                  0.000
  0.4        0.992              0.997                  0.000
  0.5        0.927              0.962                  0.000
  0.6        0.726              0.820                  0.000
  0.7        0.440              0.565                  0.000
  0.8        0.207              0.306                  0.003
  0.9        0.084              0.134                  0.014
  1          0.050              0.050                  0.050
  1.2        0.154              0.005                  0.238
  1.4        0.401              0.000                  0.525
  1.6        0.657              0.000                  0.764
  1.8        0.833              0.000                  0.900
  2          0.927              0.000                  0.962

Confidence intervals for density and abundance

Another indication of the degree of precision with these low numbers of handfish is given by obtaining a confidence interval for the survey 1 density. We can use either (a) a normal approximation based on the Poisson distribution for N1, so that the CI for m1 is N1 ± 1.96√N1, with D = m1/(2L1), or (b) a t-distribution with 11 degrees of freedom, using the weighted mean of the twelve transect densities and the corresponding variance, which does not require the Poisson assumption: the CI for D is D̂ ± t(0.025, 11 df) × SE(D̂). Both approaches give very similar results. Recall estimated mean density was 0.00498; estimated abundance was 330.

Approximate 95% confidence interval for density at the time of survey 1:

by method (a): 0.00246 to 0.00750

by method (b): 0.00236 to 0.00760

Approximate 95% confidence interval for abundance in the survey region of 66,270 m2, at the time of survey 1:

by method (a): 163 to 497

by method (b): 157 to 503
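Both interval methods reduce to a line or two of arithmetic; this sketch reproduces the survey-1 intervals (2.201 is the t critical value for 11 df with 0.025 in each tail):

```python
import math

# survey 1: N = 15 handfish, total transect length L = 1506 m, area 2L = 3012 m^2
N, L = 15, 1506.0
area, region = 2 * L, 66270.0              # area sampled and study area (m^2)

# method (a): normal approximation to the Poisson count, CI for m1 = N +/- 1.96*sqrt(N)
lo_a = (N - 1.96 * math.sqrt(N)) / area
hi_a = (N + 1.96 * math.sqrt(N)) / area    # density CI, roughly 0.00246 to 0.00750

# method (b): t interval on the weighted mean, D-hat +/- t(0.025, 11 df) * SE
D_hat, se, t11 = 0.00498, 0.001188, 2.201
lo_b, hi_b = D_hat - t11 * se, D_hat + t11 * se   # roughly 0.00236 to 0.00760

# abundance CIs scale the density limits by the study area of 66,270 m^2
lo_abund, hi_abund = lo_a * region, hi_a * region  # roughly 163 to 497
```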

Exact confidence intervals for a Poisson mean

Exact confidence intervals for the mean of a Poisson distribution, given a single observation N, can be obtained very easily using a link between the χ² distribution and the Poisson distribution, which says that, if N is distributed as a Poisson variable with mean m, then for positive integers x,

Pr(N < x) = Pr(a χ² variable with 2x degrees of freedom is > 2m).

Reference: Armitage, P. and Berry, G. (1987) Statistical methods in medical research, 2nd edition. Blackwell, Oxford.

Then a 100(1 − α)% confidence interval for the mean of a Poisson distribution, given a single observation N, is between

mL = (1/2)*(lower 100α/2 % point of the χ²(2N) distribution)

mU = (1/2)*(upper 100α/2 % point of the χ²(2N+2) distribution)

The formulas for use in Microsoft Excel, for 95% confidence intervals, i.e. α = 0.05, are

mL = CHIINV(0.975,2*N)/2

mU = CHIINV(0.025,2*N+2)/2

(Note that the CHIINV function gives upper 'tails' of the χ² distribution.)
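Outside Excel, the same exact interval can be computed without χ² tables by solving the two defining tail equations for the Poisson mean directly, which is equivalent to the CHIINV construction above; a self-contained sketch:

```python
import math

def poisson_cdf(k, m):
    """Pr(X <= k) for X ~ Poisson(m), by summing the probability terms."""
    term = math.exp(-m)
    total = term
    for i in range(1, k + 1):
        term *= m / i
        total += term
    return total

def exact_poisson_ci(n, alpha=0.05):
    """Exact CI for a Poisson mean given a single observation n:
    the lower limit solves Pr(X >= n | m) = alpha/2 and the upper limit
    solves Pr(X <= n | m) = alpha/2, each by bisection on m."""
    def solve(f, lo, hi):
        for _ in range(100):            # f is increasing in m on [lo, hi]
            mid = 0.5 * (lo + hi)
            if f(mid) < 0:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    lower = 0.0
    if n > 0:
        lower = solve(lambda m: (1.0 - poisson_cdf(n - 1, m)) - alpha / 2, 0.0, 3.0 * n + 10)
    upper = solve(lambda m: alpha / 2 - poisson_cdf(n, m), 0.0, 3.0 * n + 20)
    return lower, upper

# N = 15 handfish seen in survey 1: 95% CI for the Poisson mean m
low, high = exact_poisson_ci(15)        # roughly 8.40 to 24.74
```

Dividing the limits by the area sampled (2L) converts them to a density interval, and multiplying by the study area gives the corresponding abundance interval, as in the table referred to below.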

On the next page are calculated values for a range of possible N values, with corresponding confidence intervals for the abundance, based on area sampled = 2*1506 m2 and total area 66270 m2. I have also shown the percentage below and above the estimated mean for this exact method (e) and method (a) of the previous section.

Again you can see that quite large numbers are required to obtain a reasonably narrow confidence interval. You may wish to experiment with using 90% confidence intervals (α = 0.10).

Provided the Poisson assumption is reasonable, these exact confidence intervals are best, although there is little difference between all three in this case. If we do not wish to assume the Poisson distribution, we cannot so easily predict the width of confidence intervals for future surveys with different N, because we need to assume something about how the variance of N, the number observed, changes as N (or, strictly, its expected value or mean) changes. That is, for confidence intervals by method (b) we cannot produce a table such as given overleaf for methods (e) and (a), without making some extra assumption.

We certainly cannot assume that the variance of N would be constant if the mean of N changes, and simply use the estimated variance of 1.412×10⁻⁶ for estimated density. We would expect the variance of N to increase for larger expected values of N. The Poisson assumption automatically answers this question for us; it says the variance of N is proportional (in fact equal) to the mean of N. This means that the standard error as a proportion of the expected mean of N would decrease as N increases, by the square root, SE(N)/E[N] = √E[N]/E[N] = 1/√E[N], so that with larger N we get narrower confidence intervals, and hence more precise estimates, relative to the size of the mean.