weecology/MacaulayLibraryLookup


MacaulayLibraryLookup

A Python tool for automating the lookup and retrieval of media catalog IDs from the Cornell Lab of Ornithology's Macaulay Library based on species lists and search criteria.

🎯 Overview

This tool helps researchers and bird enthusiasts automate the process of finding audio and visual media from the Macaulay Library by:

  1. Taking species lists as input (manual list or from eBird API)
  2. Looking up species in the eBird taxonomy
  3. Searching the Macaulay Library using customizable filters
  4. Extracting catalog IDs and metadata from multiple pages of results (pagination support)
  5. Exporting results to CSV format

🔄 Pagination Support

The tool automatically handles pagination to retrieve all available results beyond the first page:

  • Fetches multiple pages of results until the requested max_results is reached
  • Intelligently detects when no more results are available
  • Includes rate limiting between page requests to be respectful to servers
  • Supports multiple pagination parameter patterns used by the Macaulay Library
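The pagination behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: `fetch_all_pages`, `fetch_page`, and `delay_seconds` are hypothetical names, and the fake page function stands in for a real request to the Macaulay Library.

```python
import time
from typing import Callable, Dict, List


def fetch_all_pages(
    fetch_page: Callable[[int], List[Dict]],
    max_results: int,
    delay_seconds: float = 1.0,
) -> List[Dict]:
    """Collect results page by page until max_results is reached
    or a page comes back empty (hypothetical helper)."""
    results: List[Dict] = []
    page = 1
    while len(results) < max_results:
        batch = fetch_page(page)
        if not batch:
            # An empty page is treated as "no more results available".
            break
        results.extend(batch)
        page += 1
        if len(results) < max_results:
            # Pause between page requests to avoid hammering the server.
            time.sleep(delay_seconds)
    return results[:max_results]


def fake_fetch_page(page: int) -> List[Dict]:
    """Stand-in for a real page request: five records, two per page."""
    data = [{"catalog_id": str(100 + i)} for i in range(5)]
    start = (page - 1) * 2
    return data[start:start + 2]


records = fetch_all_pages(fake_fetch_page, max_results=10, delay_seconds=0)
print(len(records))
```

Passing the page-fetching function as an argument keeps the loop independent of whichever pagination parameter pattern the remote site happens to use.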

🔧 Installation

Prerequisites

  • Python 3.8 or higher
  • Internet connection (for API access)

Install from PyPI (once published)

pip install macaulay-library-lookup

Install from Source

git clone https://github.com/weecology/MacaulayLibraryLookup.git
cd MacaulayLibraryLookup
pip install -e .

🚀 Quick Start

Command Line Interface

# Search for American Robin recordings in New York during May
macaulay-lookup --species "American Robin" --region "US-NY" --month 5 --media-type audio --tag song

# Use a species list file
macaulay-lookup --species-file species_list.txt --region "US-CA" --output results.csv

# Get species from eBird hotspot
macaulay-lookup --ebird-hotspot "L12345" --month 4,5,6 --media-type photo
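The layout of the file passed to `--species-file` is not documented here. As a sketch only, assuming one common name per line with blank lines and `#` comments ignored, a species list could be parsed like this (`parse_species_list` is a hypothetical helper, not part of the package):

```python
from typing import List


def parse_species_list(text: str) -> List[str]:
    """Return species names from text with one name per line.

    Assumed format: blank lines are skipped and anything after '#'
    is treated as a comment.
    """
    names: List[str] = []
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            names.append(name)
    return names


example = """
# Spring targets
American Robin
Blue Jay   # common at feeders

Wood Thrush
"""
species = parse_species_list(example)
print(species)
```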

Python API

from macaulay_library_lookup import MacaulayLookup

# Initialize the lookup tool
ml = MacaulayLookup()

# Search for a single species
results = ml.search_species(
    common_name="American Robin",
    region="US-NY",
    month=5,
    media_type="audio",
    tag="song"
)

# Search multiple species
species_list = ["American Robin", "Blue Jay", "Northern Cardinal"]  # use full eBird common names
results = ml.search_multiple_species(
    species_list,
    region="US-FL",
    begin_month=3,
    end_month=5
)

# Get species from eBird
results = ml.search_from_ebird_hotspot(
    hotspot_id="L12345",
    days_back=30,
    region="US-CA"
)

# Export to CSV
ml.export_to_csv(results, "macaulay_results.csv")

📊 Output Format

The tool generates CSV files with the following columns:

  • catalog_id: Macaulay Library catalog ID
  • species_code: eBird species code
  • common_name: Species common name
  • scientific_name: Species scientific name
  • media_type: Type of media (audio, photo, video)
  • region: Geographic region code
  • location: Recording location
  • date: Recording date
  • recordist: Name of recordist
  • url: Direct URL to the media
  • search_month: Month(s) used in search
  • search_tag: Tag used in search (if any)
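A CSV with the columns listed above can be read back with the standard library. The sketch below uses two made-up rows (the IDs, locations, URLs, and date format are illustrative assumptions, not real Macaulay Library records) and counts records per media type:

```python
import csv
import io
from collections import Counter

# Illustrative data only: column names follow the list above, values are invented.
SAMPLE_CSV = """catalog_id,species_code,common_name,scientific_name,media_type,region,location,date,recordist,url,search_month,search_tag
0000001,amerob,American Robin,Turdus migratorius,audio,US-NY,Example Park,2020-05-01,Example Recordist,https://example.org/1,5,song
0000002,amerob,American Robin,Turdus migratorius,photo,US-NY,Example Park,2020-05-02,Example Recordist,https://example.org/2,5,
"""


def count_by_media_type(csv_file) -> Counter:
    """Count rows per media_type in an exported results file."""
    reader = csv.DictReader(csv_file)
    return Counter(row["media_type"] for row in reader)


counts = count_by_media_type(io.StringIO(SAMPLE_CSV))
print(dict(counts))
```

For a file on disk, pass `open("macaulay_results.csv", newline="", encoding="utf-8")` in place of the `io.StringIO` object.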

🛠️ Advanced Usage

Filtering Options

# Advanced filtering
results = ml.search_species(
    common_name="Wood Thrush",
    region="US-NY",
    begin_month=4,
    end_month=8,
    media_type="audio",
    tag="song",
    quality="A",  # High quality recordings only
    background="0",  # No background noise
    recordist="Jane Doe"  # Specific recordist
)

Batch Processing

# Process multiple regions
regions = ["US-NY", "US-CT", "US-MA"]
species = ["Wood Thrush", "Hermit Thrush"]

all_results = []
for region in regions:
    results = ml.search_multiple_species(
        species,
        region=region,
        begin_month=5,
        end_month=7
    )
    all_results.extend(results)

ml.export_to_csv(all_results, "northeast_thrushes.csv")
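When results from several regions are combined, it can be useful to drop repeated records before exporting. The sketch below assumes each result is a dict with a `catalog_id` key, mirroring the CSV columns; the actual return type of `search_multiple_species` is not documented here, and `dedupe_by_catalog_id` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def dedupe_by_catalog_id(records: Iterable[Dict]) -> List[Dict]:
    """Keep the first record seen for each catalog_id, preserving order."""
    seen = set()
    unique: List[Dict] = []
    for record in records:
        catalog_id = record.get("catalog_id")
        if catalog_id in seen:
            continue
        seen.add(catalog_id)
        unique.append(record)
    return unique


# Invented records for illustration.
combined = [
    {"catalog_id": "1", "region": "US-NY"},
    {"catalog_id": "2", "region": "US-CT"},
    {"catalog_id": "1", "region": "US-MA"},
]
unique_records = dedupe_by_catalog_id(combined)
print(len(unique_records))
```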

🔑 API Keys

Some features require eBird API access:

  1. Get an API key from eBird API
  2. Set it as an environment variable:
    export EBIRD_API_KEY="your_api_key_here"
  3. Or pass it directly:
    ml = MacaulayLookup(ebird_api_key="your_api_key")
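The two options above suggest a simple precedence: an explicitly passed key first, then the `EBIRD_API_KEY` environment variable. How `MacaulayLookup` resolves the key internally is not shown here, so the following is only a sketch of that precedence using a hypothetical helper:

```python
import os
from typing import Optional


def resolve_ebird_api_key(explicit_key: Optional[str] = None) -> str:
    """Return the explicit key if given, else the EBIRD_API_KEY variable.

    Raises RuntimeError when neither source provides a key.
    """
    key = explicit_key or os.environ.get("EBIRD_API_KEY")
    if not key:
        raise RuntimeError(
            "No eBird API key found: pass one explicitly or set EBIRD_API_KEY."
        )
    return key


os.environ["EBIRD_API_KEY"] = "key-from-environment"
print(resolve_ebird_api_key())
print(resolve_ebird_api_key("key-passed-directly"))
```

Failing early with a clear message is usually easier to debug than letting a request fail later with an authentication error.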

📁 Project Structure

MacaulayLibraryLookup/
├── macaulay_library_lookup/
│   ├── __init__.py
│   ├── core.py              # Main lookup functionality
│   ├── ebird_api.py         # eBird API integration
│   ├── taxonomy.py          # eBird taxonomy handling
│   ├── parsers.py           # HTML parsing utilities
│   └── cli.py               # Command line interface
├── tests/
│   ├── test_core.py
│   ├── test_ebird_api.py
│   └── test_taxonomy.py
├── examples/
│   ├── basic_usage.py
│   ├── batch_processing.py
│   └── species_lists/
├── docs/
├── .github/
│   └── workflows/
│       ├── tests.yml
│       └── publish.yml
├── setup.py
├── requirements.txt
├── README.md
└── LICENSE

🧪 Testing

Run the test suite:

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run with coverage
pytest tests/ --cov=macaulay_library_lookup

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Cornell Lab of Ornithology for the Macaulay Library
  • eBird for taxonomy and species data
  • The open source community

🐛 Issues

If you encounter any issues or have feature requests, please open an issue on GitHub.

📈 Changelog

See CHANGELOG.md for version history and updates.

Advanced eBird API Example

The examples/advanced_ebird_example.py script demonstrates how to:

  1. Query the eBird API for species at a specific hotspot (Cajas National Park)
  2. Get media for each species using the Macaulay Library API
  3. Save detailed results to CSV including:
    • Species information (code, common name, scientific name)
    • Observation details (location, date, coordinates)
    • Media catalog IDs

Example output showing species observations and media records:

[Image: Example CSV Output]

The script retrieves media for many species:

[Image: Media Results]

To run the example:

export EBIRD_API_KEY='your-api-key-here'
python examples/advanced_ebird_example.py
