History of the `galah-python` package

Overview

Teaching: 10 min
Exercises: 0 min
Questions
  • What is the ALA?

  • What is the history of the galah package?

  • Why is there an extension to Python?

Objectives
  • Understand what the ALA is and its use.

  • Know the history of the galah software package.

  • Know why it has been extended to the Python programming language.

What is the ALA?

The Atlas of Living Australia (ALA) provides everyone, from researchers to citizen scientists, with open access to Australia’s biodiversity data. These data have been used in a wide range of work, from scientific research papers on photographs as an essential biodiversity resource to studies of how citizen science aids the prediction of habitat suitability.

What is the history of the galah package?

galah was not always named galah. In fact, it started out as a package titled ala4r, which was the first attempt at writing and releasing a package that directly downloaded ALA data into R. However, it had some problems:

These took the form of:

These functions then returned one of three things:

The ala4r package was then renamed and rearchitected into galah using the tidyverse, making it more user-friendly and able to query the ALA as well as other national GBIF atlases. This is reflected in the following functions:

Lookup          Narrow a query       Run a query
show_all()      galah_identify()     atlas_counts()
search_all()    galah_filter()       atlas_occurrences()
                galah_select()       atlas_species()
                galah_group_by()     atlas_media()
                galah_geolocate()

Why is there an extension to Python?

The Python programming language is one of the most widely used programming languages in the world. According to the Institute of Electrical and Electronics Engineers (IEEE) and the PopularitY of Programming Language (PYPL) index, Python consistently ranks first in popularity and in demand among employers, and it is a great general-purpose language. It is particularly well suited to data analysis, data visualisation, and machine learning.

To keep the R and Python packages as similar as possible, we structured the Python package to be Pythonic while still behaving in a similar fashion to the R package. This is reflected in the following table, where the R functions that narrow a query appear as keyword arguments in Python:

Lookup                Narrow a query    Run a query
galah.show_all()      taxa=             galah.atlas_counts()
galah.search_all()    filters=          galah.atlas_occurrences()
                      select=           galah.atlas_species()
                      group_by=         galah.atlas_media()
                      polygon=
                      bbox=
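
To make the table concrete, here is a minimal sketch of how the "narrow a query" arguments might be passed to a "run a query" function. The helper names `narrow_query` and `count_records` are hypothetical and written only for this illustration. The sketch assumes that galah-python is installed, that `galah.atlas_counts()` accepts the `taxa=`, `filters=` and `group_by=` keywords listed above, and that a filter can be written as a string such as `"year=2020"`; check the package documentation before relying on any of these.

```python
def narrow_query(taxa=None, filters=None, group_by=None):
    """Collect the 'narrow a query' arguments, dropping any that were not set."""
    candidates = {"taxa": taxa, "filters": filters, "group_by": group_by}
    return {name: value for name, value in candidates.items() if value is not None}


def count_records(**narrowing):
    """Pass the narrowing arguments to galah.atlas_counts() (hypothetical wrapper)."""
    # Imported here so that narrow_query() can be used without galah installed.
    import galah  # assumption: the galah-python package is available

    return galah.atlas_counts(**narrow_query(**narrowing))


# Example call (needs galah-python and network access, so it is left commented out):
# count_records(taxa="Vulpes vulpes", filters="year=2020")
```

In the R package the same narrowing would be expressed by chaining galah_identify() and galah_filter() before atlas_counts(). In Python a single function call carries the narrowing as keyword arguments, which is the design choice the table summarises.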

Key Points

  • Data providers from all over Australia share data with the ALA, including citizen scientists, governments, museums and other collections.

  • The galah package makes it easier to get data from the ALA via a programming language.

  • galah has been extended to Python to increase the user base of the ALA data.