Distribution data

Distribution data comes in different forms: regions where a species is known to occur, or point data, given either as latitude/longitude coordinates or as named places. At present, Taxonome can only work with region data; support for point data is an area for future development.

Use the taxonome.regions module for working with this data.

Note

The tools for manipulating distribution data are currently available only through Taxonome’s programming interface. To use them, you need to write some simple Python code.

Included data

Taxonome comes with a set of regions covering the world’s land surface, based on the TDWG World Geographic Scheme for Recording Plant Distributions. TDWG divides the land up into successively smaller regions at four levels, following political boundaries where possible to make referencing easier. Each region has a short code. They can be accessed like this:

from taxonome import regions

regions.tdwg["NET"]     # Gets the region tuple for the Netherlands (level 3)

aus = regions.tdwg["50"]  # The region for Australia (level 2)
for name, tdwg in regions.world.subregions(aus):
    print(name)

Regions are also indexed by name, and by ISO two-letter country codes. The following all return the same object:

regions.tdwg["SUR"]
regions.names["Suriname"]
regions.ISO["SR"]

The data also includes a few countries which do not have a directly corresponding TDWG region. These are added so that their subregions are the TDWG regions making up that country, and they are subregions of the relevant continent (except Turkey, which spans Europe and Asia). So, although South Africa forms part of TDWG region 27 (Southern Africa), you can use it like this:

za = regions.names["South Africa"]
for name, tdwg in regions.world.immediate_subregions(za):
    print(name)

Note

Regions are stored as tuples of (name, tdwg_code). For regions not in TDWG, tdwg_code is None.
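
As a small illustration of the tuple layout described in this note, the sketch below separates regions that have a TDWG code from those that do not. The tuples are written out by hand rather than loaded from Taxonome, and split_by_tdwg is a hypothetical helper, not part of the library.

```python
# Hand-written stand-ins in the (name, tdwg_code) shape described above.
# They are not loaded from Taxonome.
example_regions = [
    ("Netherlands", "NET"),
    ("Suriname", "SUR"),
    ("South Africa", None),  # a country with no TDWG region of its own
]


def split_by_tdwg(region_list):
    """Hypothetical helper: split regions into those with and without a code."""
    with_code = [r for r in region_list if r[1] is not None]
    without_code = [r for r in region_list if r[1] is None]
    return with_code, without_code


with_code, without_code = split_by_tdwg(example_regions)
for name, tdwg_code in with_code:
    print(name, tdwg_code)
```

Because regions are plain tuples, they can be unpacked directly in a for loop, as the examples above do.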

Normalising region data

If your data uses different systems for listing regions, Taxonome offers functions to match them to TDWG regions automatically. taxonome.regions.find_tdwg() works for a single region, and taxonome.regions.tdwgise() for a group of regions (such as the distribution of a species).

By default, both functions accept a name and return TDWG level 3 region codes. The level and the index used for lookup can be changed with keyword parameters:

from taxonome.regions import find_tdwg, tdwgise, ISO

find_tdwg("China")
# Gives the 9 level 3 regions in China:
# CHC, CHH, CHI, CHM, CHN, CHQ, CHS, CHT, CHX

find_tdwg("Togo", level=2)
# '22' (West Tropical Africa)

tdwg_codes, notfound = tdwgise(["TJ", "KG", "TM", "UZ"], index=ISO)
# tdwg_codes is {'TZK', 'KGZ', 'TKM', 'UZB'}
# notfound will contain any names/codes that couldn't be matched. In this
# case, there are none.
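
The two-part result (matched codes plus a collection of unmatched inputs) can be illustrated without Taxonome installed. The sketch below is a stand-in only: match_regions and iso_to_tdwg are hypothetical names, the dictionary is written by hand from the codes shown above, and returning notfound as a set is an assumption rather than a description of what tdwgise() does internally.

```python
def match_regions(names, index):
    """Stand-in matcher: return (matched_codes, notfound) for the given names."""
    matched, notfound = set(), set()
    for name in names:
        if name in index:
            matched.add(index[name])
        else:
            notfound.add(name)
    return matched, notfound


# Hand-written stand-in for an ISO-to-TDWG index (assumption, not Taxonome data).
iso_to_tdwg = {"TJ": "TZK", "KG": "KGZ", "TM": "TKM", "UZ": "UZB"}

codes, notfound = match_regions(["TJ", "KG", "XX"], iso_to_tdwg)
if notfound:
    # Report unmatched inputs so they can be corrected by hand.
    print("Could not match:", sorted(notfound))
```

Checking the unmatched collection after each call is a simple way to catch misspelled or unsupported region names before they silently drop out of a distribution.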

Creating your own regions

The TDWG data is held in a simple, pure Python implementation of a directed graph. To make your own regions data, you can instantiate the class taxonome.regions.Map, and call its add_region() method. Regions can be any hashable objects, such as strings or tuples.
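
To show why a directed graph suits this data, here is a minimal, self-contained sketch of one. RegionGraph and the parents argument are hypothetical and written for illustration; the real taxonome.regions.Map class and the signature of its add_region() method may differ, so check the module before relying on this shape.

```python
from collections import defaultdict


class RegionGraph:
    """Hypothetical stand-in for a regions map; not Taxonome's Map class."""

    def __init__(self):
        # Maps each region to the set of its immediate subregions.
        self._children = defaultdict(set)

    def add_region(self, region, parents=()):
        self._children[region]  # register the region even if it has no children
        for parent in parents:
            self._children[parent].add(region)

    def immediate_subregions(self, region):
        return set(self._children.get(region, ()))

    def subregions(self, region):
        """All regions reachable below `region`, at any depth."""
        seen = set()
        stack = [region]
        while stack:
            for child in self._children.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


world = RegionGraph()
world.add_region("World")
world.add_region("Europe", parents=["World"])
world.add_region("Asia", parents=["World"])
# A region may have more than one parent, which a strict tree could not express.
world.add_region("Turkey", parents=["Europe", "Asia"])
```

Strings are used as regions here for brevity; any hashable object, such as a (name, tdwg_code) tuple, would work the same way in this sketch.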

If you’ve created a set of regions that you think would be useful to other people (e.g. for the world’s oceans), please get in touch with me to discuss including it in Taxonome.
