
Re: SGML transfer format for discussion



Kate,

Thanks for the good start on this important topic.

I have reformatted the example SGML file that you sent so that it is
human readable. 

SGML parsers are programs that read these files and build data structures
out of them. These parsers can ignore leading whitespace. They also don't
care whether the closing tags are on a line by themselves or if they are
at the end of a line.
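
A minimal sketch of that point, in Python. It assumes that the
standard-library XML parser is an acceptable stand-in for an SGML
parser (the fragment below happens to be well-formed XML as well), and
that the loading program collapses whitespace in the text itself. The
helper name `flatten` is only illustrative.

```python
import xml.etree.ElementTree as ET

# The same ORIGIN element, once on a single line and once indented,
# with a closing tag on a line by itself.
COMPACT = (
    "<ORIGIN><CUSTOD>Bureau of Meteorology</CUSTOD>"
    "<JURISDIC>Australia</JURISDIC></ORIGIN>"
)
INDENTED = """
<ORIGIN>
  <CUSTOD>
    Bureau of Meteorology
  </CUSTOD>
  <JURISDIC>Australia</JURISDIC>
</ORIGIN>
"""

def flatten(elem):
    """Return (tag, text, children) with runs of whitespace collapsed."""
    text = " ".join((elem.text or "").split())
    return (elem.tag, text, [flatten(child) for child in elem])

# Both layouts produce the same data structure.
same = flatten(ET.fromstring(COMPACT)) == flatten(ET.fromstring(INDENTED))
```

So the indentation costs the loading program nothing, provided that
it normalises the whitespace as sketched here.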

So we should format these documents so that we can read them. We can't
afford to frighten off potential metadata filler-outers.

You said:
>SGML is designed primarily for exchanging and not for direct viewing by 
>the user.

No, no. I view these files all the time. Many people create HTML and
SGML files using a simple text editor. So they must be readable. Please
also think of the people who load such data into databases. When
something goes wrong they have to look at the actual data to see where
the problem is.

You will see that I have indented the tag name whenever a new tag is
opened. I think that this helps a lot as it now gives some idea of the
structure of the metadata.

This structure also enables us to refer to a particular element by
concatenating the names of all of its parent elements.

So I can say that I think that the structure of the repeating elements
 ANZMETA_DESCRIPT_THEMEKEY
seems a little inconsistent. Shouldn't all of the THEMEKEY elements be
encompassed by another element called, say, KEYWORDS?
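
The naming scheme can be sketched in Python as follows. This is only
an illustration: it again assumes that the standard-library XML parser
can stand in for an SGML parser, it uses a cut-down fragment of the
record, and the function name `element_paths` is made up for the
example.

```python
import xml.etree.ElementTree as ET

# Cut-down fragment of the record, keeping the original nesting.
SAMPLE = """
<ANZMETA>
  <CITEINFO>
    <UNIQUEID>ANZCW0301000001</UNIQUEID>
  </CITEINFO>
  <DESCRIPT>
    <THEMEKEY>AGRICULTURE Biodiversity</THEMEKEY>
    <THEMEKEY>HAZARDS Pests</THEMEKEY>
  </DESCRIPT>
</ANZMETA>
"""

def element_paths(elem, parents=()):
    """Yield each element's name prefixed by the names of its parents."""
    names = parents + (elem.tag,)
    yield "_".join(names)
    for child in elem:
        yield from element_paths(child, names)

paths = list(element_paths(ET.fromstring(SAMPLE)))
# The repeated THEMEKEY elements all share one concatenated name.
theme_count = paths.count("ANZMETA_DESCRIPT_THEMEKEY")
```

Note that each repeated THEMEKEY gets the same name, so the name
identifies the kind of element and not a single occurrence of it.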

regards
David Crossley <crossley@ozemail.com.au>

--------------- snip here ---------------
<ANZMETA>
  <CITEINFO>
    <UNIQUEID>ANZCW0301000001</UNIQUEID>

    <TITLE>
      AVHRR NDVI fortnightly series covering continental
      Australia at full resolution.
    </TITLE>

    <ORIGIN>
      <CUSTOD>Bureau of Meteorology</CUSTOD>
      <JURISDIC>Australia</JURISDIC>
    </ORIGIN>
  </CITEINFO>

  <DESCRIPT>
    <ABSTRACT>
      <PARAGRPH>
        AVHRR NDVI (Normalized Difference Vegetation Index) 
        fortnightly series covering continental Australia at a 1 
        kilometre resolution. NDVI is a measure of the absorption 
        of red light by plant chlorophyll and the reflection of 
        infrared radiation by water-filled leaf cells. Its 
        values broadly measure the density of active foliage.
      </PARAGRPH>
    </ABSTRACT>

    <THEMEKEY>AGRICULTURE Biodiversity</THEMEKEY>
    <THEMEKEY>ATMOSPHERE Management</THEMEKEY>
    <THEMEKEY>ATMOSPHERE Pressure Monitoring</THEMEKEY>
    <THEMEKEY>CLIMATE AND WEATHER Mapping</THEMEKEY>
    <THEMEKEY>HAZARDS Pests</THEMEKEY>

    <DSGPOLYO>
      <LONG>112.5</LONG>
      <LAT>-10</LAT>
      <LONG>154</LONG>
      <LAT>-10</LAT>
      <LONG>154</LONG>
      <LAT>-44</LAT>
      <LONG>112.5</LONG>
      <LAT>-44</LAT>
      <LONG>112.5</LONG>
      <LAT>-10</LAT>
    </DSGPOLYO>
  </DESCRIPT>

  <BOUNDING>
    <NORTHBC>-10</NORTHBC>
    <SOUTHBC>-44</SOUTHBC>
    <EASTBC>154</EASTBC>
    <WESTBC>112.5</WESTBC>
  </BOUNDING>

  <TIMEINFO>
    <BEGDATE>
      <DAY>01</DAY>
      <MONTH>Apr</MONTH>
      <YEAR>1991</YEAR>
    </BEGDATE>

    <ENDDATE>
      Current
    </ENDDATE>
  </TIMEINFO>

  <STATUS>
    <PROGRESS>
      In Progress
    </PROGRESS>

    <UPDATE>
      Monthly
    </UPDATE>
  </STATUS>

  <DISTINFO>
    <NATIVE>
      DIGITAL Unsigned 8 bit Generic Binary
    </NATIVE>

    <AVLFORM>
      DIGITAL GIF Image
    </AVLFORM>

    <ACCSCONS>
      Environment Australia internal use only.
    </ACCSCONS>
  </DISTINFO>

  <DATAQUAL>
    <LINEAGE>
      <PARAGRPH>
        These are the basic datasets received from Marine
        laboratories.
      </PARAGRPH>
    </LINEAGE>

    <POSACC>
      <PARAGRPH>
        Positional error should not exceed 1km for the vast 
        majority of pixel centres.
      </PARAGRPH>
    </POSACC>

    <ATTRACC>
      <PARAGRPH>
        As determined by CSIRO processing of spectral response
        and computation of NDVI, see 'AVHRR DOCUMENTATION' held by
        Neil Freeman (ERIN).
      </PARAGRPH>
    </ATTRACC>

    <LOGIC>
      <PARAGRPH>
        The method of aggregation employed selects the maximum 
        fortnightly value for output to each monthly pixel, so as 
        to minimize atmospheric contamination - thus the monthly 
        pixel values are cloud free.
      </PARAGRPH>
    </LOGIC>

    <COMPLETE>
      <PARAGRPH>
        All datasets are spatially complete - there are no 
        missing sections. The data is temporally incomplete - 
        March to November 1994 are missing.
      </PARAGRPH>
    </COMPLETE>
  </DATAQUAL>

  <CNTINFO>
    <CNTORG>
      Environmental Resources Information Network (ERIN), 
      Environment Australia
    </CNTORG>

    <CNTPOS>
      Scientific Coordinator - Remote Sensing
    </CNTPOS>

    <ADDRESS>GPO Box 787</ADDRESS>

    <CITY>Canberra</CITY>

    <STATE>ACT</STATE>

    <COUNTRY>Australia</COUNTRY>

    <POSTAL>2620</POSTAL>

    <CNTVOICE>+61 6 274 1203</CNTVOICE>

    <CNTFAX>+61 6 274 1333</CNTFAX>

    <CNTEMAIL>shane@erin.gov.au</CNTEMAIL>
  </CNTINFO>

  <METD>
    <DAY>21</DAY>
    <MONTH>Nov</MONTH>
    <YEAR>1996</YEAR>
  </METD>

  <SUPPLINF>
    <PARAGRPH>
      Documentation of the Normalized Difference Vegetation Index 
      (AVHRR) AVHRR data can be found at:-
      <LIST>
        <ITEM>
          Folio held by Neil Freeman (ERIN)
        </ITEM>

        <ITEM>
          On-line documentation -   
          http://www.environment.gov.au/land/monitoring/ndvi.html
        </ITEM>
      </LIST>
    </PARAGRPH>
  </SUPPLINF>
</ANZMETA>