A Python tool for automating the lookup and retrieval of media catalog IDs from the Cornell Lab of Ornithology's Macaulay Library based on species lists and search criteria.
This tool helps researchers and bird enthusiasts automate the process of finding audio and visual media from the Macaulay Library by:
- Taking species lists as input (manual list or from eBird API)
- Looking up species in the eBird taxonomy
- Searching the Macaulay Library using customizable filters
- Extracting catalog IDs and metadata from multiple pages of results (pagination support)
- Exporting results to CSV format
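The taxonomy lookup step can be pictured as resolving a common name to an eBird species code. The following is a minimal sketch, not the tool's actual implementation: the function name `find_species_code`, the in-memory `TAXONOMY` list, and the substring fallback are all assumptions made for illustration. The record keys (`speciesCode`, `comName`, `sciName`) follow the field names used in eBird taxonomy exports.

```python
from typing import Dict, List, Optional

# Tiny in-memory stand-in for the eBird taxonomy (illustrative records only).
TAXONOMY: List[Dict[str, str]] = [
    {"speciesCode": "amerob", "comName": "American Robin", "sciName": "Turdus migratorius"},
    {"speciesCode": "blujay", "comName": "Blue Jay", "sciName": "Cyanocitta cristata"},
    {"speciesCode": "norcar", "comName": "Northern Cardinal", "sciName": "Cardinalis cardinalis"},
]


def find_species_code(name: str, taxonomy: List[Dict[str, str]] = TAXONOMY) -> Optional[str]:
    """Return the species code for a common name, or None if no unique match exists.

    Hypothetical helper: the real tool may resolve names differently.
    """
    wanted = name.strip().casefold()
    # Pass 1: exact, case-insensitive match on the common name.
    for record in taxonomy:
        if record["comName"].casefold() == wanted:
            return record["speciesCode"]
    # Pass 2: accept a substring match only when it is unambiguous.
    partial = [r for r in taxonomy if wanted in r["comName"].casefold()]
    return partial[0]["speciesCode"] if len(partial) == 1 else None
```

With this sketch, a shortened name such as "Cardinal" resolves only because it matches a single record. Against the full taxonomy it would likely be ambiguous, so full common names are the safer input.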
The tool automatically handles pagination to retrieve all available results beyond the first page:
- Fetches multiple pages of results until the requested max_results is reached
- Detects when no more results are available
- Includes rate limiting between page requests to be respectful to servers
- Supports multiple pagination parameter patterns used by the Macaulay Library
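The pagination behavior described above can be sketched as a simple loop. This is an illustration under stated assumptions rather than the tool's own code: `fetch_all`, the `fetch_page` callback, and the `delay` parameter are hypothetical names, and the page fetcher is injected so the example runs without network access.

```python
import time
from typing import Callable, Dict, List


def fetch_all(
    fetch_page: Callable[[int], List[Dict]],
    max_results: int,
    delay: float = 1.0,
) -> List[Dict]:
    """Collect results page by page until max_results is reached or a page is empty.

    fetch_page is a caller-supplied function (hypothetical) that returns the
    records for a 1-based page number, or an empty list past the last page.
    """
    results: List[Dict] = []
    page = 1
    while len(results) < max_results:
        batch = fetch_page(page)
        if not batch:
            # An empty page signals that no more results are available.
            break
        results.extend(batch)
        page += 1
        if len(results) < max_results:
            # Rate limiting: pause before requesting the next page.
            time.sleep(delay)
    # Trim any overshoot from the final page.
    return results[:max_results]


# Usage with a fake three-page data source (placeholder catalog IDs).
fake_pages = {
    1: [{"catalog_id": 1}, {"catalog_id": 2}],
    2: [{"catalog_id": 3}, {"catalog_id": 4}],
    3: [{"catalog_id": 5}],
}
first_three = fetch_all(lambda p: fake_pages.get(p, []), max_results=3, delay=0)
```

Passing the fetcher in as a function keeps the stopping logic separate from whichever pagination parameter pattern the server expects.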
Requirements:
- Python 3.8 or higher
- Internet connection (for API access)
Install from PyPI:
pip install macaulay-library-lookup
Or install from source:
git clone https://github.com/weecology/MacaulayLibraryLookup.git
cd MacaulayLibraryLookup
pip install -e .
Command-line usage:
# Search for American Robin recordings in New York during May
macaulay-lookup --species "American Robin" --region "US-NY" --month 5 --media-type audio --tag song
# Use a species list file
macaulay-lookup --species-file species_list.txt --region "US-CA" --output results.csv
# Get species from eBird hotspot
macaulay-lookup --ebird-hotspot "L12345" --month 4,5,6 --media-type photo
Python usage:
from macaulay_library_lookup import MacaulayLookup
# Initialize the lookup tool
ml = MacaulayLookup()
# Search for a single species
results = ml.search_species(
    common_name="American Robin",
    region="US-NY",
    month=5,
    media_type="audio",
    tag="song"
)
# Search multiple species
species_list = ["American Robin", "Blue Jay", "Cardinal"]
results = ml.search_multiple_species(
    species_list,
    region="US-FL",
    begin_month=3,
    end_month=5
)
# Get species from eBird
results = ml.search_from_ebird_hotspot(
    hotspot_id="L12345",
    days_back=30,
    region="US-CA"
)
# Export to CSV
ml.export_to_csv(results, "macaulay_results.csv")
The tool generates CSV files with the following columns:
- catalog_id: Macaulay Library catalog ID
- species_code: eBird species code
- common_name: Species common name
- scientific_name: Species scientific name
- media_type: Type of media (audio, photo, video)
- region: Geographic region code
- location: Recording location
- date: Recording date
- recordist: Name of recordist
- url: Direct URL to the media
- search_month: Month(s) used in search
- search_tag: Tag used in search (if any)
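The column layout above can be reproduced with the standard library's `csv.DictWriter`. This is a sketch of one way to write such a file, not a description of how `export_to_csv` works internally. The helper name `write_results` is hypothetical, and the sample record uses placeholder values.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# Column order as documented for the tool's CSV output.
COLUMNS = [
    "catalog_id", "species_code", "common_name", "scientific_name",
    "media_type", "region", "location", "date", "recordist", "url",
    "search_month", "search_tag",
]


def write_results(records: Iterable[Dict[str, str]], handle: TextIO) -> None:
    """Write records as CSV with a fixed header (hypothetical helper).

    Missing fields are written as empty strings; unknown keys are ignored.
    """
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="", extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)


# Usage: write one placeholder record to an in-memory buffer.
buffer = io.StringIO()
write_results(
    [{
        "catalog_id": "12345",          # placeholder ID, not a real catalog entry
        "species_code": "amerob",
        "common_name": "American Robin",
        "media_type": "audio",
        "region": "US-NY",
        "unexpected_key": "dropped",    # ignored because it is not in COLUMNS
    }],
    buffer,
)
```

When writing to a real file instead of a buffer, open it with `newline=""` so the `csv` module controls line endings.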
# Advanced filtering
results = ml.search_species(
    common_name="Wood Thrush",
    region="US-NY",
    begin_month=4,
    end_month=8,
    media_type="audio",
    tag="song",
    quality="A",          # High quality recordings only
    background="0",       # No background noise
    recordist="Jane Doe"  # Specific recordist
)
# Process multiple regions
regions = ["US-NY", "US-CT", "US-MA"]
species = ["Wood Thrush", "Hermit Thrush"]
all_results = []
for region in regions:
    results = ml.search_multiple_species(
        species,
        region=region,
        begin_month=5,
        end_month=7
    )
    all_results.extend(results)
ml.export_to_csv(all_results, "northeast_thrushes.csv")
Some features require eBird API access:
- Get an API key from eBird API
- Set it as an environment variable:
export EBIRD_API_KEY="your_api_key_here"
- Or pass it directly:
ml = MacaulayLookup(ebird_api_key="your_api_key")
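The two ways of supplying the key can be combined in one small resolver. The sketch below assumes that an explicitly passed key takes precedence over the `EBIRD_API_KEY` environment variable. That ordering is an assumption, not something the source confirms, and `resolve_api_key` is a hypothetical helper name.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the eBird API key to use, or None if none is configured.

    Assumed precedence: explicit argument first, then EBIRD_API_KEY.
    """
    if explicit:
        return explicit
    # Treat an empty environment value the same as an unset one.
    return env.get("EBIRD_API_KEY") or None
```

Taking the environment as a parameter makes the function easy to test without modifying the real process environment.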
MacaulayLibraryLookup/
├── macaulay_library_lookup/
│   ├── __init__.py
│   ├── core.py              # Main lookup functionality
│   ├── ebird_api.py         # eBird API integration
│   ├── taxonomy.py          # eBird taxonomy handling
│   ├── parsers.py           # HTML parsing utilities
│   └── cli.py               # Command line interface
├── tests/
│   ├── test_core.py
│   ├── test_ebird_api.py
│   └── test_taxonomy.py
├── examples/
│   ├── basic_usage.py
│   ├── batch_processing.py
│   └── species_lists/
├── docs/
├── .github/
│   └── workflows/
│       ├── tests.yml
│       └── publish.yml
├── setup.py
├── requirements.txt
├── README.md
└── LICENSE
Run the test suite:
# Install development dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/
# Run with coverage
pytest tests/ --cov=macaulay_library_lookup
To contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments:
- Cornell Lab of Ornithology for the Macaulay Library
- eBird for taxonomy and species data
- The open source community
If you encounter any issues or have feature requests, please open an issue on GitHub.
See CHANGELOG.md for version history and updates.
The examples/advanced_ebird_example.py script demonstrates how to:
- Query the eBird API for species at a specific hotspot (Cajas National Park)
- Get media for each species using the Macaulay Library API
- Save detailed results to CSV, including:
  - Species information (code, common name, scientific name)
  - Observation details (location, date, coordinates)
  - Media catalog IDs
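The hotspot query in the first step can be sketched as building a request for recent observations. This example assumes the public eBird API 2.0 "recent observations" endpoint, with its `back` parameter (1 to 30 days) and `X-eBirdApiToken` header. Whether the example script uses this exact endpoint is an assumption, and `recent_observations_request` is a hypothetical helper. The function only builds the URL and headers, so it runs without network access or a real key.

```python
from typing import Dict, Tuple
from urllib.parse import quote, urlencode

EBIRD_BASE = "https://api.ebird.org/v2"


def recent_observations_request(
    location_id: str,
    api_key: str,
    days_back: int = 14,
) -> Tuple[str, Dict[str, str]]:
    """Build the URL and headers for a recent-observations query (hypothetical helper).

    location_id is an eBird location or region code such as "L12345" (placeholder).
    """
    if not 1 <= days_back <= 30:
        raise ValueError("days_back must be between 1 and 30")
    query = urlencode({"back": days_back})
    url = f"{EBIRD_BASE}/data/obs/{quote(location_id)}/recent?{query}"
    headers = {"X-eBirdApiToken": api_key}
    return url, headers
```

The returned pair can be handed to any HTTP client. The response would then supply the species codes used for the media lookup in the second step.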
When run, the script reports the species observations and media records it finds, and it retrieves media for many of the species.
To run the example:
export EBIRD_API_KEY='your-api-key-here'
python examples/advanced_ebird_example.py
