Distribution data
=================
Distribution data comes in different forms: regions where a species is known to
occur, or point data, either as latitude/longitude or named places. At present,
Taxonome is only able to work with region data, but this is an area for
future development.
Use the :mod:`taxonome.regions` module for working with this data.

.. note:: The tools for manipulating distribution data are currently only in
   Taxonome's programmable interface. To use them, you need to write some simple
   Python code.
Included data
-------------
Taxonome comes with a set of regions covering the world's land surface, based on
the `TDWG World Geographic Scheme for Recording Plant Distributions
`_. TDWG divides the land up into successively
smaller regions at four levels, following political boundaries where possible to
make referencing easier. Each region has a short code. They can be accessed like
this::

    from taxonome import regions

    regions.tdwg["NET"]       # Gets the region tuple for the Netherlands (level 3)
    aus = regions.tdwg["50"]  # The region for Australia (level 2)
    for name, tdwg in regions.world.subregions(aus):
        print(name)

Regions are also indexed by name, and by `ISO two-letter country codes
`_. The following all return
the same object::

    regions.tdwg["SUR"]
    regions.names["Suriname"]
    regions.ISO["SR"]

The data also includes a few countries which do not have a directly corresponding
TDWG region. These are added so that their subregions are the TDWG regions making
up that country, and they are subregions of the relevant continent (except
Turkey, which spans Europe and Asia). So, although South Africa forms part of
TDWG region 27 (Southern Africa), you can use it like this::

    za = regions.names["South Africa"]
    for name, tdwg in regions.world.immediate_subregions(za):
        print(name)

.. note::
   Regions are stored as tuples of ``(name, tdwg_code)``. For regions not in TDWG,
   ``tdwg_code`` is ``None``.
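
As a rough illustration of what "the same object" means for these indexes, the sketch below builds three small dictionaries by hand. The dictionary names and the "Example country" entry are made up for this example and are not Taxonome's own data structures. Only the ``(name, tdwg_code)`` tuple layout is taken from the description above.

```python
# Toy stand-ins for the lookup tables; NOT Taxonome's real indexes.
suriname = ("Suriname", "SUR")                # (name, tdwg_code)
example_country = ("Example country", None)   # hypothetical region with no TDWG code

tdwg_index = {"SUR": suriname}
name_index = {"Suriname": suriname, "Example country": example_country}
iso_index = {"SR": suriname}

# Every index points at the very same tuple, so identity checks succeed.
same_object = tdwg_index["SUR"] is name_index["Suriname"] is iso_index["SR"]

# Tuples unpack directly into a name and a code.
name, code = name_index["Suriname"]

# Regions outside TDWG can be picked out by their missing code.
non_tdwg = [n for n, c in name_index.values() if c is None]
```

Because the indexes share one tuple per region, a result obtained through one index can be compared directly with a result obtained through another.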
Normalising region data
-----------------------
If your data uses different systems for listing regions, Taxonome offers functions
to match them to TDWG regions automatically. :func:`taxonome.regions.find_tdwg`
works for a single region, and :func:`taxonome.regions.tdwgise` for a group
of regions (such as the distribution of a species).
Both functions default to accepting a name and returning TDWG level 3 region
codes, but these can be customised using keyword parameters::

    from taxonome.regions import find_tdwg, tdwgise, ISO

    find_tdwg("China")
    # Gives the 9 level 3 regions in China:
    # CHC, CHH, CHI, CHM, CHN, CHQ, CHS, CHT, CHX

    find_tdwg("Togo", level=2)
    # '22' (West Tropical Africa)

    tdwg_codes, notfound = tdwgise(["TJ", "KG", "TM", "UZ"], index=ISO)
    # tdwg_codes is {'TZK', 'KGZ', 'TKM', 'UZB'}
    # notfound will contain any names/codes that couldn't be matched. In this
    # case, there are none.
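
To make the two return values more concrete, here is a minimal, self-contained sketch of the matching pattern: look each key up in an index, collect the codes that are found, and set aside the keys that are not. The helper ``match_codes`` and the ``toy_iso_index`` dictionary are hypothetical and written only for this illustration. They do not reproduce how :func:`taxonome.regions.tdwgise` is implemented.

```python
def match_codes(keys, index):
    """Look up each key in index; return (matched_codes, notfound).

    Hypothetical helper for illustration only, not part of Taxonome.
    """
    matched, notfound = set(), set()
    for key in keys:
        codes = index.get(key)
        if codes is None:
            notfound.add(key)       # keep unmatched input for manual review
        else:
            matched.update(codes)   # one key may map to several region codes
    return matched, notfound


# A tiny hand-written index from ISO country codes to TDWG level 3 codes.
toy_iso_index = {
    "TJ": {"TZK"},
    "KG": {"KGZ"},
    "TM": {"TKM"},
    "UZ": {"UZB"},
}

# "XX" is deliberately not in the index, so it ends up in `missing`.
codes, missing = match_codes(["TJ", "KG", "XX"], toy_iso_index)
```

Returning the unmatched keys alongside the matches, rather than raising an error, lets you process a whole dataset in one pass and check the leftovers by hand afterwards.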
Creating your own regions
-------------------------
The TDWG data is held in a simple, pure Python implementation of a directed
graph. To make your own regions data, you can instantiate the class
:class:`taxonome.regions.Map`, and call its :meth:`add_region` method. Regions
can be any hashable objects, such as strings or tuples.
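
The idea of storing regions in a directed graph can be sketched in a few lines of plain Python. The class below is a toy written for this page, not :class:`taxonome.regions.Map` itself: the name ``ToyRegionGraph``, the ``parent`` keyword and the method bodies are all assumptions, so check the real class before relying on any particular signature.

```python
from collections import defaultdict


class ToyRegionGraph:
    """Minimal directed graph of regions; edges point from parent to child.

    Illustrative only. The signatures here are assumptions, not Taxonome's API.
    """

    def __init__(self):
        self._children = defaultdict(set)

    def add_region(self, region, parent=None):
        self._children[region]  # register the node even if it has no children
        if parent is not None:
            self._children[parent].add(region)

    def immediate_subregions(self, region):
        """Regions directly below `region`."""
        return set(self._children.get(region, ()))

    def subregions(self, region):
        """All regions reachable from `region`, at any depth."""
        found = set()
        stack = [region]
        while stack:
            for child in self._children.get(stack.pop(), ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


# Regions can be any hashable objects; plain strings are used here.
oceans = ToyRegionGraph()
oceans.add_region("World Ocean")
oceans.add_region("Atlantic", parent="World Ocean")
oceans.add_region("North Atlantic", parent="Atlantic")
oceans.add_region("Pacific", parent="World Ocean")

direct = oceans.immediate_subregions("World Ocean")
everything = oceans.subregions("World Ocean")
```

A graph, rather than a strict tree, allows one region to sit under more than one parent by calling ``add_region`` again with a different ``parent``. This is the kind of arrangement needed for a country that spans two continents.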
If you've created a set of regions that you think would be useful to other
people (e.g. for the world's oceans), please get in touch with me to discuss
including it in Taxonome.