National Vegetation Information System Taxonomic Review

National Vegetation Information System
Centre for Plant Biodiversity Research
Department of the Environment and Heritage, 2004

2.0 Methodology

The CPBR was asked to assess the extent and nature of taxonomic inconsistencies found in the NVIS Taxon_Lists dataset and to identify what measures could be taken to address them. ERIN asked for guidance as to whether these inconsistencies could be resolved by a simple name change, re-identification of herbarium vouchers or future field assessment.

To do this, a structured method of assessment needed to be developed. It was decided that the structure of the database would be assessed to determine what additional database columns would be needed; these are outlined in Section 2.1. The types of annotations were standardised and written in the format outlined in Section 2.2.

The APNI/WIN database was agreed as the standard to be used for checking the currency of scientific names. Any difference in the scientific names used in NVIS as compared to APNI/WIN was commented on in the appropriate new fields.

The results of this assessment are presented in Section 3. Taxonomic issues encountered are discussed in Section 3.1, authority problems in Section 3.2 and large group findings in Section 3.3.

2.1 Database preparation and early review

Access was set up to the Restructured NVIS database, and PL/SQL Developer was used to access and annotate this dataset. It was decided in the early stages of the project that CPBR reviewers would not alter the pre-existing NVIS data. Because the Taxon_Lists dataset required taxonomic assessment, and records that displayed anomalies needed annotations, ERIN set up two new columns within the dataset to hold these comments. The new columns added were:

CPBR_PROBLEM
CPBR_SOLUTION

These two columns were set up to allow for preliminary assessment of records in the Taxon_Lists dataset. CPBR staff then assessed records, noting the apparent problem in the CPBR_PROBLEM field and the recommended solution in the CPBR_SOLUTION field. After early analysis it was clear that duplication of names was an issue and that several other columns would be needed to give proper review accountability. These additional columns were:

MASTER
CPBR_WIN_CURR
CPBR_WIN_NAME
CPBR_CHECKED_BY
CPBR_CHECKED

2.2 Database annotation of immediate taxonomic issues

Once the new fields were added, CPBR staff assessed all records for taxonomic accuracy. Plant name authority was also assessed for approximately half the records. The agreed methodology for what would be added to each of the new fields is detailed in the following subsections. A complete listing of all records and their CPBR annotations is shown in Appendix F.

2.21 MASTER column

Initial analysis of the dataset showed that many of the taxa in the list were duplicated. The Master column was set up to help quantify the extent of duplication. Single records for a taxon were left blank and considered to be unique taxon records.

Where multiple records relating to one taxon were present, one record (usually the one with the lowest TAXDSC_ID number) was chosen as the Master record, indicated with an “M”. All other records relating to the same name were considered duplicate records and marked with a “D”. Each duplicate record also carries a reference to the Master record’s TAXDSC_ID number, which was added to the CPBR_SOLUTION field as shown below:

Eg. Refer to 12345
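
The Master/Duplicate marking described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure CPBR staff actually ran: it assumes that records are duplicates when their name strings match exactly, it always picks the lowest TAXDSC_ID as Master (the review says this was only usually the case), and the function name, record layout and ID values are hypothetical.

```python
from collections import defaultdict


def mark_duplicates(records):
    """Assign MASTER flags to (taxdsc_id, taxon_name) records.

    Returns {taxdsc_id: (master_flag, solution_note)} where master_flag is
    "" for a unique taxon, "M" for a Master record and "D" for a duplicate.
    Assumption: duplicates are detected by exact name match.
    """
    ids_by_name = defaultdict(list)
    for taxdsc_id, name in records:
        ids_by_name[name].append(taxdsc_id)

    result = {}
    for ids in ids_by_name.values():
        if len(ids) == 1:
            # Single record for a taxon: MASTER is left blank.
            result[ids[0]] = ("", "")
            continue
        master_id = min(ids)  # assumption: lowest TAXDSC_ID is always Master
        for taxdsc_id in ids:
            if taxdsc_id == master_id:
                result[taxdsc_id] = ("M", "")
            else:
                # Duplicates point back to the Master via CPBR_SOLUTION.
                result[taxdsc_id] = ("D", f"Refer to {master_id}")
    return result


# Hypothetical IDs; the names are borrowed from Table 2.25 for illustration.
example = mark_duplicates([
    (12345, "Acacia burkittii"),
    (20001, "Xanthorrhoea glauca"),
    (30002, "Acacia burkittii"),
])
```

In this sketch, record 30002 would be flagged "D" with the note "Refer to 12345", while record 20001 keeps a blank MASTER value.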

2.22 CPBR_PROBLEM column

Early assessment of records gave a greater understanding of the taxonomic and nomenclatural issues present in the Taxon_Lists dataset. The issues encountered were gradually classified into a number of different categories. Table 2.22 below details all categories encountered. While taxonomic and nomenclatural assessment was provided for all records, the validity of plant name authors was only surveyed for selected records.

Some records had multiple problems; these were displayed in the CPBR_PROBLEM field with a forward slash separating the issues, as shown below:

Eg. Wrong family name/Name misspelt in INFRA_SPECIES

2.23 CPBR_SOLUTION column

For records that had comments added to the CPBR_PROBLEM field, a solution was in most circumstances provided in the CPBR_SOLUTION field. Where a comment was added to the CPBR_SOLUTION field with no comment in the CPBR_PROBLEM field, the record was in most cases a duplicate, as outlined in Section 2.21. Conversely, where a problem was recorded without a solution, the solution is intuitive. Table 2.23 below shows the format for a single-issue entry, for a record that has multiple issues, and for a record with an issue where the solution is intuitive.

Table 2.22 Summary of taxonomic and nomenclatural issues

For all records                | For selected records
Rank misspelt                  | Author in wrong field
Infra_name in wrong field      | Author incorrect
Infra_name missing             | Author incorrect in INFRA_AUTHOR
Nomenclatural synonym          | Author incorrect in SP_AUTHOR
Taxonomic synonym              | Author incorrect in two fields
Alternate family name          | Author missing
Family missing                 | Author missing in INFRA_AUTHOR
Name misspelt in SPECIES       | Author missing in SP_AUTHOR
Name misspelt in INFRA_SPECIES |
Phrase name                    |
Non-plant taxon                |
Double epithet                 |
Status unknown                 |
s.l. qualifier present         |
s.s. qualifier present         |
sensu qualifier present        |
sp. agg. qualifier present     |
complex qualifier present      |

Table 2.23 Taxonomic problem and solution relationships

CPBR_PROBLEM                                   | CPBR_SOLUTION
Rank misspelt                                  | replace with subsp.
Rank misspelt                                  | replace with var.
Rank misspelt/Author incorrect in INFRA_AUTHOR | replace with subsp./L.A.S.Johnson & K.D.Hill
Author in wrong field                          |
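
Because multiple issues and their solutions are recorded as slash-separated lists in the same order, the two fields can be paired up mechanically. The following Python sketch illustrates this; it assumes that a forward slash never occurs inside a single problem or solution, and the function name is hypothetical rather than part of the NVIS database.

```python
from itertools import zip_longest


def pair_issues(problem, solution):
    """Pair each entry in CPBR_PROBLEM with its entry in CPBR_SOLUTION.

    Both fields are split on "/" and matched by position. A missing
    counterpart (for example an intuitive solution left blank) is
    returned as an empty string.
    """
    problems = [p.strip() for p in problem.split("/")]
    solutions = [s.strip() for s in solution.split("/")]
    return list(zip_longest(problems, solutions, fillvalue=""))


# The multiple-issue row from Table 2.23.
pairs = pair_issues(
    "Rank misspelt/Author incorrect in INFRA_AUTHOR",
    "replace with subsp./L.A.S.Johnson & K.D.Hill",
)
```

A record whose solution is intuitive, such as "Author in wrong field" with an empty CPBR_SOLUTION, comes back as a single pair with an empty second element.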

2.24 CPBR_WIN_CURR column

This field relates to the currency of scientific names as compared to the APNI/WIN database. Where the name used in a record matched that shown in APNI/WIN, a “Y” was used; where the name did not match, such as in the case of synonyms, an “N” was used. For duplicate records no value was entered, as the value is given on the duplicate’s master record. A “U” was entered for incomplete names and where the status of the name was unknown. Examples for each category are shown in Table 2.25.

2.25 CPBR_WIN_NAME column

This column is directly dependent on the CPBR_WIN_CURR column (Section 2.24). NVIS names compared to the APNI/WIN database that are found to be non-current have an “N” placed in the CPBR_WIN_CURR field. Where a name is found to be non-current, the current name, as used in APNI/WIN, is added to the CPBR_WIN_NAME field. NVIS names that are found to be current have a “Y” listed in the CPBR_WIN_CURR field and consequently have no need for a name added to the CPBR_WIN_NAME field. Table 2.25 gives examples of NVIS names and how their name currency is displayed.

Table 2.25 Name currency

NVIS name                               | CPBR_PROBLEM          | CPBR_WIN_CURR | CPBR_WIN_NAME
Xanthorrhoea glauca                     |                       | Y             |
Syncarpia glomulifera subsp. glomulifera |                      | Y             |
Acacia burkittii                        | Nomenclatural synonym | N             | Acacia acuminata subsp. burkittii
Elymus scabrus var. scabrus             | Name misspelt         | N             | Elymus scaber var. scaber
Acacia myrtifolia var.                  | Infra_name missing    | U             |
Vittadinia sp.                          | Status unknown        | U             |
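
The relationship between CPBR_WIN_CURR and CPBR_WIN_NAME can be summarised as a small decision rule. The Python sketch below is illustrative only: the set `accepted` and the dictionary `replacements` are hypothetical in-memory stand-ins for an APNI/WIN lookup (the review does not describe any programmatic interface to APNI/WIN), and treating every name found in neither as "U" is a simplifying assumption.

```python
def name_currency(nvis_name, accepted, replacements, is_duplicate=False):
    """Return (CPBR_WIN_CURR, CPBR_WIN_NAME) for one NVIS record.

    accepted:     set of names treated as current in APNI/WIN (stand-in).
    replacements: dict mapping a non-current name to its current name.
    """
    if is_duplicate:
        # Duplicates carry no value; it is given on the master record.
        return ("", "")
    if nvis_name in accepted:
        # Current name: no replacement name is needed.
        return ("Y", "")
    if nvis_name in replacements:
        # Non-current name: record the current APNI/WIN name.
        return ("N", replacements[nvis_name])
    # Assumption: anything else is incomplete or of unknown status.
    return ("U", "")


# Stand-in lookup data taken from the rows of Table 2.25.
accepted = {
    "Xanthorrhoea glauca",
    "Syncarpia glomulifera subsp. glomulifera",
}
replacements = {
    "Acacia burkittii": "Acacia acuminata subsp. burkittii",
    "Elymus scabrus var. scabrus": "Elymus scaber var. scaber",
}
```

With these stand-in lookups, "Acacia burkittii" yields an "N" together with its current name, and the incomplete name "Acacia myrtifolia var." falls through to "U".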

2.26 CPBR_CHECKED_BY column

This column is where the initials of the three CPBR staff involved in record checking were recorded. This field provided assurance that a record had been checked even in cases where no annotations were required.

2.27 CPBR_CHECKED column

This column links with CPBR_CHECKED_BY and relates to the date the record was checked.

2.3 Structural review of the NVIS database

2.31 NVIS Taxon_Lists structural assessment

The second part of the NVIS taxonomic review by the CPBR was to provide an assessment of the structure of the Taxon_Lists dataset. It was agreed that the assessment of immediate taxonomic issues, as discussed in Sections 2.1 and 2.2, would need to be completed before this could happen. Once these results were completed (see Section 3), meaningful comment on the database structure became possible. These comments are presented in Section 4.1.

The main aim of the structural review was to assess the existing Taxon_Lists architecture and the data content. Assessment was expected to result in recommendations to remove some existing columns or possibly to add new ones. It was decided that recommendations on structure and content were too difficult to separate and that they would be dealt with together in Section 4.1.

As part of this, ERIN specifically requested an examination of the costs and benefits of identifying and recording to infraspecies level for vegetation surveys. This is discussed with the recommendations made on the INFRA_SPECIES column in Section 4.12.

2.32 Interlinking NVIS with other databases

ERIN also sought comment on the role that other botanical databases play in relation to NVIS. The CPBR was asked to review the technical options for cooperative linkages with other national and/or Commonwealth databases such as Australia’s Virtual Herbarium (AVH), APNI/WIN and the SPRAT databases. All these databases have different strengths and emphases that could potentially add value to the NVIS database. Section 4.21 discusses the potential linkages between these databases.

2.33 State jurisdictional lists

Separate State jurisdictional lists are maintained outside of the NVIS database. These lists typically hold many more species than are used in NVIS vegetation descriptions. ERIN sought comment as to how these lists should be maintained to support updates of NVIS descriptions. This discussion is presented in Section 4.22.

For many plant groups, differences of taxonomic opinion exist between botanical institutions. These institutions provide the data that, combined, form the NVIS database. Part of the review of this State data was to examine whether the Commonwealth needs to maintain a separate list of these unresolved taxonomic issues to ensure the operational capability of NVIS. The question of finding consensus on these issues is also discussed in Section 4.22.

2.4 Development of test measures for future NVIS assessment

The CPBR review outlined in Section 2.2 was a one-off assessment of the Taxon_Lists dataset. While that assessment was comprehensive, the NVIS database is periodically updated with new data provided by the States. With these updates in mind, the CPBR was asked to develop ways for ERIN managers to periodically assess the quality of taxonomic data.

Periodic assessment would help keep the Taxon_Lists dataset taxonomically and nomenclaturally current. The frequency of these assessments was also to be examined. This frequency is very much tied in with the structural review discussed in Section 4.1 and the interlinking of NVIS to the other national databases as discussed in Section 4.21.

The recommendations developed by the CPBR relating to periodic test measures are discussed in Section 4.3.

2.5 Development of guidance material for NVIS collaborator manuals

The final advice sought from the CPBR related to the provision of guidance material for incorporation in future NVIS collaborator manuals. These manuals provide recommendations to State data contributors, helping them to provide botanical records that are consistent with those of other contributors.

The CPBR was asked to provide comment on the current standards used for data entry and interchange of botanical information. The Herbarium Information Standards and Protocols for Interchange of Data (HISPID) are seen by ERIN to have particular relevance to NVIS and comment on these standards is invited.

Similarly, guidance material was requested relating to the best practice for the taxonomic aspects of vegetation survey, especially the collection of herbarium vouchers. Recommendations relating to future collaborator manuals and collection techniques are detailed in Section 4.4.