Biodiversity Analysis Tool Technical Background

Software

The user requires only a standard web browser such as Internet Explorer, Netscape or Firefox to use BAT.

Behind the scenes, BAT runs in Perl, with data stored in an Oracle database and maps delivered using ArcIMS.

BAT has been designed to run on a wide variety of computer systems, without needing particular commercial software.

The Perl source code and database setup scripts are online and available for download as a zip file.

Data requirements

BAT is able to use an observation/specimen dataset which satisfies the following conditions:

As a demonstration, BAT operates on a pre-defined set of data sources and regions. Only the administrator has the ability to make additions or changes to the list of available data sources and regions.

Assumptions

The analyses used in BAT assume that the data sources used have:

The larger the grid size used, the better BAT copes with data affected by location errors or gaps in spatial coverage.

Analysis

The first step in each of the analyses is to generate a species list for each grid square from the specimens recorded there. Each of the analyses is based on this list. The analyses do not use any measure of abundance.
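This first step can be sketched in a few lines of Python. It is an illustration only, not BAT's Perl code: the record layout (species, latitude, longitude), the function names `grid_cell` and `species_by_cell`, and the use of square cells measured in degrees are all assumptions made for the example.

```python
import math
from collections import defaultdict

def grid_cell(lat, lon, cell_size):
    """Return the (row, col) index of the square cell containing a point.

    Assumes cells are cell_size degrees on a side, aligned to the origin.
    """
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))

def species_by_cell(records, cell_size=1.0):
    """Map each occupied grid cell to the set of species recorded in it."""
    cells = defaultdict(set)
    for species, lat, lon in records:
        # A set ignores repeat records, so abundance plays no part.
        cells[grid_cell(lat, lon, cell_size)].add(species)
    return dict(cells)

# Hypothetical records: (species, latitude, longitude)
records = [
    ("Acacia dealbata", -35.3, 149.1),
    ("Acacia dealbata", -35.7, 149.4),  # same cell, same species
    ("Banksia serrata", -35.2, 149.9),
    ("Banksia serrata", -33.8, 151.2),
]
cells = species_by_cell(records, cell_size=1.0)
```

Because each cell holds a set, two records of the same species in one cell contribute a single entry, which matches the statement that no measure of abundance is used.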

Species Richness

Species richness is the number of different species recorded in each grid cell. This follows the method used by a number of authors (e.g. Crisp et al. 2001; Williams and Humphries 1994; Williams et al. 1997).
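As a minimal sketch, assuming the per-cell species lists are held as Python sets keyed by a cell identifier (a hypothetical layout, not BAT's own data structure), richness is the size of each set:

```python
def species_richness(cells):
    """Return the number of distinct species recorded in each grid cell."""
    return {cell: len(species) for cell, species in cells.items()}

# Hypothetical per-cell species lists
cells = {
    (-36, 149): {"Acacia dealbata", "Banksia serrata"},
    (-34, 151): {"Banksia serrata"},
}
richness = species_richness(cells)
```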

Endemism

BAT uses a weighted endemism algorithm that creates an index of endemism using the following steps:

  1. The range size of each species is estimated as the number of grid cells occupied by records for that species in the data.
  2. An endemism score for each species is calculated as the inverse of its range size, multiplied by 100. This gives a score out of 100: a score of 100 means the species is recorded in only a single grid cell, and the score approaches 0 as the species becomes more widespread.
  3. BAT identifies the species present in a given grid cell and adds their endemism scores together to give a cumulative score for the high-level taxon being examined (family or order).

This analysis, a slight variation of the methods used by others such as Williams and Humphries (1994), calculates endemism as a continuous function of distribution range. The main advantage of this approach is that it does not impose an arbitrary threshold of endemism, such as species recorded only within a country, a region, or a specified range size.
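The three steps of the weighted endemism index can be sketched as follows. This is an illustrative Python version under stated assumptions, not BAT's Perl implementation: the function name `weighted_endemism`, the cell labels and the species names are all hypothetical.

```python
from collections import Counter

def weighted_endemism(cells):
    """Return the summed endemism score for each grid cell.

    cells maps a cell identifier to the set of species recorded there.
    """
    # Step 1: range size = number of cells in which each species occurs.
    range_size = Counter(sp for species in cells.values() for sp in species)
    # Step 2: endemism score = 100 / range size.
    score = {sp: 100.0 / n for sp, n in range_size.items()}
    # Step 3: sum the scores of the species present in each cell.
    return {cell: sum(score[sp] for sp in species)
            for cell, species in cells.items()}

# Hypothetical data: sp1 and sp3 each occupy one cell, sp2 occupies four.
cells = {
    "A": {"sp1", "sp2"},
    "B": {"sp2"},
    "C": {"sp2", "sp3"},
    "D": {"sp2"},
}
endemism = weighted_endemism(cells)
```

Here sp1 and sp3 each score 100 and sp2 scores 25, so cells A and C score 125 while cells B and D score 25. The widespread species contributes something to every cell but never dominates a cell that also holds a narrow-range species.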

Taxonomic Diversity

Taxonomic diversity in BAT is a special case of Faith’s phylogenetic diversity (PD) measure (Faith 1992, 1994). The PD measure proposed by Faith takes the complement of taxa in a specific geographic area and calculates the minimum total length of the branches of the phylogenetic tree required to include all the taxa present. Faith argues that PD is a more effective guide for maximising feature diversity than alternatives such as node counting (Vane-Wright et al. 1991). BAT follows Faith's (1992) recommendation that where actual branch lengths are not known (e.g. there is only a standard taxonomic tree), all branch lengths be assumed equal (i.e. length = 1).

BAT uses a simple algorithm to calculate a taxonomic diversity index based on the set of taxonomic levels in the Darwin Core format for specimen records.  These levels are kingdom, phylum, class, order, family, genus, species and sub-species. Where not all of the levels are present in the source data, or the higher taxonomy is inconsistent, the administrator can apply a locally stored taxonomy to the data for family level and above.

For each grid square, the algorithm then counts the total number of branches needed to join all the taxa in the group being analysed that are present in that grid square.
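A minimal Python sketch of this branch count, with every branch given length 1, is shown below. It is not BAT's Perl code. It assumes each taxon is supplied as a lineage running from the group being analysed down to species, and that branches are counted from that group downwards; the function name `branch_count` and the example lineages are hypothetical.

```python
def branch_count(lineages):
    """Count the distinct branches joining a set of taxa in a taxonomic tree.

    Each lineage is a sequence of names from the group being analysed
    down to the species, e.g. (family, genus, species). A branch links
    a node to its parent, so every prefix of length 2 or more identifies
    exactly one branch. Shared branches are counted once.
    """
    branches = set()
    for lineage in lineages:
        for depth in range(2, len(lineage) + 1):
            branches.add(tuple(lineage[:depth]))
    return len(branches)

# Hypothetical grid square: three species in two genera of one family.
square = [
    ("Myrtaceae", "Eucalyptus", "regnans"),
    ("Myrtaceae", "Eucalyptus", "obliqua"),
    ("Myrtaceae", "Melaleuca", "alternifolia"),
]
diversity = branch_count(square)
```

The example gives 5 branches: two from family to genus and three from genus to species. Two species in the same genus would give 3, whereas two species in different genera would give 4, so a square scores higher when its species are more distantly related.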

Data Completeness and Quality

BAT uses geo-referenced specimen data and does not have a data-audit capacity. This means that the analysis results generated by BAT will reflect the strengths, weaknesses and biases in the source data. There are significant potential biases associated with the use of specimen data (e.g. Crisp et al 2001, Ponder 1999). Three potential problems are outlined here:

  1. Spatial bias: Specimen data has rarely been gathered in a systematic way across a broad region. Typically, because of access restrictions or time available, specimens are collected close to roads or in areas known to yield good results, causing what has been labelled the ‘roadmap effect’ (Crisp et al 2001). Areas closer to population centres and research institutions also tend to be more densely sampled.
  2. Species bias: Specimen data is rarely collected comprehensively at a site. Researchers typically collect ‘interesting’ records and, consequently, common species are typically under-represented in both observation and specimen databases. It is our experience that this is a consistent factor in most datasets.
  3. Specimen data errors: Most specimen datasets include errors of four types: (i) errors in geo-referencing (for example, latitude and longitude recorded incorrectly or reversed); (ii) errors in spelling of the taxon name; (iii) errors in identification of the specimen; and (iv) inclusion of alternative names for the same taxon, so that a single species may appear as two or more. Our experience has been that these errors affect between 5 and 10% of most museum collections and a higher percentage of observational datasets, particularly where inexperienced observers have been heavily involved in collecting data.

It is not possible to correct spatial bias, other than by undertaking additional sampling or resorting to alternative techniques such as modelling. We contend that there are good reasons for wanting to understand the observed distribution of the biodiversity of a region without introducing the uncertainties involved in predictive modelling. BAT offers basic tools for providing this information.

Species bias can affect the BAT endemism analysis by underestimating the range extent of a common but under-sampled species. In our experience of working with Australian datasets, species bias is commonly present but is a serious problem only in small datasets.

As noted under Assumptions, using a larger analysis grid reduces the effect of these two types of sampling bias. Where data are of high quality and density, a fine-scale grid can offer valuable information about more local patterns in biodiversity.

Specimen data errors can affect all the analysis options in BAT. For location and specimen identification errors, the query button available when viewing the map in BAT can be used to examine the list of specimen records contributing to the results for a particular cell (see help). This offers a means of checking the specimen data for incongruent records. Ideally, data should be checked for such errors before being used in BAT.

References

Crisp, M.D., Laffan, S., Linder, H.P. and Monro, A. (2001). Endemism in the Australian Flora. Journal of Biogeography 28, pp. 183–198.

Faith, D.P. (1992). Systematics and conservation: on predicting the feature diversity of subsets of taxa. Cladistics 8, pp. 361–373.

Faith, D.P. (1994). Phylogenetic diversity: a general framework for the prediction of feature diversity. Systematics and Conservation Evaluation. (eds Forey, P.L., Humphries, C.J. and Vane-Wright, R.I.) pp. 251–68, Clarendon Press, Oxford.

Ponder, W.F. (1999). Using museum collection data to assist in biodiversity assessment. The Other 99%. The Conservation and Biodiversity of Invertebrates. (eds Ponder, W.F. and Lunney, D.) pp. 253–256. The Royal Zoological Society of New South Wales.

Vane-Wright, R.I., Humphries, C.J. and Williams, P.H. (1991). What to Protect - Systematics and the Agony of Choice. Biological Conservation 55, pp. 235–254.

Williams, P.H., Gaston, K.J. and Humphries, C.J. (1997). Mapping biodiversity value worldwide: combining higher-taxon richness from different groups. Proceedings of the Royal Society, Biological Sciences 264, pp. 141–148.

Williams, P.H. and Humphries, C.J. (1994). Biodiversity, Taxonomic Relatedness, and Endemism in Conservation. Systematics and Conservation Evaluation. (eds Forey, P.L., Humphries, C.J. and Vane-Wright, R.I.) pp. 269–287, Clarendon Press, Oxford.


© Commonwealth of Australia