A comprehensive ETL pipeline for Brazilian public data analysis and business insights
- ETL Pipeline: Complete Extract, Transform, Load workflow
- 🇧🇷 Brazilian Data: Specialized for Brazilian public datasets
- IBGE Integration: Direct integration with Brazilian census data
- SICONV Support: Government funding and transfer data
- Async Processing: High-performance data processing
- Well Tested: Comprehensive test suite with pytest
- Business Intelligence: Ready-to-use insights and analytics
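As a rough illustration of the async processing feature, the sketch below fetches several datasets concurrently with `asyncio.gather`. It assumes extraction is I/O bound and that datasets can be fetched independently. `fetch_dataset` and `fetch_all` are hypothetical names that are not part of the package, and the network call is simulated with a short sleep.

```python
import asyncio


async def fetch_dataset(name: str, delay: float = 0.01) -> dict:
    """Hypothetical stand-in for one network-bound extraction call."""
    await asyncio.sleep(delay)  # simulates waiting on a remote API
    return {"dataset": name, "records": []}


async def fetch_all(names: list[str]) -> list[dict]:
    """Run all fetches concurrently; results keep the order of `names`."""
    return await asyncio.gather(*(fetch_dataset(n) for n in names))


def run_demo() -> list[str]:
    """Fetch three illustrative datasets and return their names."""
    results = asyncio.run(fetch_all(["population", "gdp", "transfers"]))
    return [r["dataset"] for r in results]
```

Because the coroutines wait at the same time, the total wall-clock time is close to the slowest single fetch rather than the sum of all of them.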
src/public_data_pipeline/
├── extractors/              # Data extraction modules
│   ├── ibge_extractor.py    # IBGE API integration
│   └── siconv_extractor.py  # SICONV data extraction
├── transformers/            # Data transformation
│   ├── cleaner.py           # Data cleaning utilities
│   └── normalizer.py        # Data normalization
└── loaders/                 # Data loading and export
    ├── csv_loader.py        # CSV export functionality
    └── database_loader.py   # Database integration
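The three packages map onto the extract, transform, and load stages. The sketch below shows one way such stages could compose, under the assumption that records move between stages as plain dictionaries. `clean`, `normalize`, and `run_pipeline` are hypothetical illustrations, not the functions shipped in `cleaner.py` or `normalizer.py`, and the sample rows are made up.

```python
from typing import Callable, Iterable

Record = dict[str, object]
Transform = Callable[[list[Record]], list[Record]]


def clean(records: list[Record]) -> list[Record]:
    """Drop records that have no municipality name (illustrative rule)."""
    return [r for r in records if r.get("municipality")]


def normalize(records: list[Record]) -> list[Record]:
    """Trim whitespace and title-case municipality names."""
    return [
        {**r, "municipality": str(r["municipality"]).strip().title()}
        for r in records
    ]


def run_pipeline(
    extracted: Iterable[Record],
    transforms: list[Transform],
    load: Callable[[list[Record]], None],
) -> None:
    """Apply each transform in order, then hand the result to the loader."""
    records = list(extracted)
    for transform in transforms:
        records = transform(records)
    load(records)


# Made-up sample rows standing in for extractor output.
sample = [
    {"municipality": "  são paulo ", "population": 100},
    {"municipality": "", "population": 5},
]
loaded: list[Record] = []
run_pipeline(sample, [clean, normalize], loaded.extend)
```

Keeping each stage a plain callable means a loader can be swapped (CSV, database, or an in-memory list as here) without touching the transforms.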
- Population Census: Demographic data by municipality
- Economic Surveys: GDP, employment, income statistics
- Geographic Data: Administrative boundaries and territories
- Federal Transfers: Government funding data
- Municipal Projects: Public investment tracking
- Budget Analysis: Government spending insights
Create a .env file for configuration:
# API Configuration
IBGE_API_BASE_URL=https://servicodados.ibge.gov.br/api/v1
SICONV_API_BASE_URL=https://api.siconv.gov.br
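A minimal sketch of how these two settings might be read at runtime follows. It assumes the variables are already present in the process environment (for example exported by the shell, or loaded from the .env file by a tool such as python-dotenv). `ApiSettings` and `load_settings` are hypothetical names, and falling back to the URLs shown above is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Fallbacks mirror the example values from the .env file.
DEFAULT_IBGE_URL = "https://servicodados.ibge.gov.br/api/v1"
DEFAULT_SICONV_URL = "https://api.siconv.gov.br"


@dataclass(frozen=True)
class ApiSettings:
    """Immutable container for the two API base URLs."""

    ibge_base_url: str
    siconv_base_url: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> ApiSettings:
    """Read base URLs from `env` (defaults to os.environ), stripping trailing slashes."""
    env = os.environ if env is None else env
    return ApiSettings(
        ibge_base_url=env.get("IBGE_API_BASE_URL", DEFAULT_IBGE_URL).rstrip("/"),
        siconv_base_url=env.get("SICONV_API_BASE_URL", DEFAULT_SICONV_URL).rstrip("/"),
    )
```

Passing a mapping explicitly, as in `load_settings({"IBGE_API_BASE_URL": "..."})`, makes the function easy to test without modifying the real environment.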
from public_data_pipeline.extractors import IBGEExtractor

# Initialize extractor
extractor = IBGEExtractor()

# Extract population data
population_data = extractor.get_population_data(year=2020)
print(f"Extracted {len(population_data)} records")
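A natural next step after extraction is exporting the records. The sketch below serializes records to CSV text with the standard library. It assumes the extractor returns a list of flat dictionaries, which the usage snippet does not confirm. `records_to_csv` is a hypothetical helper, not the API of `csv_loader.py`, and the rows are made up.

```python
import csv
import io


def records_to_csv(records: list[dict]) -> str:
    """Serialize flat dicts to CSV text, taking the header from the first record."""
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up rows; real extractor output may have a different shape.
rows = [
    {"municipality": "A", "year": 2020, "population": 10},
    {"municipality": "B", "year": 2020, "population": 20},
]
csv_text = records_to_csv(rows)
```

Writing to an in-memory buffer keeps the helper free of file-system side effects; the returned text can then be written to disk by the caller.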
# Clone repository
git clone https://github.com/bellDataSc/Public-Data-Pipeline-for-Business-Insights.git
cd Public-Data-Pipeline-for-Business-Insights

# Create virtual environment
python -m venv venv
venv\Scripts\activate     # Windows
source venv/bin/activate  # macOS/Linux

# Install for development
pip install -e .
pip install -r requirements-dev.txt

# Run tests
pytest -v
We welcome contributions! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Add tests for your changes
- Ensure tests pass (pytest -v)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Follow PEP 8 style guidelines
- Add type hints to all functions
- Write comprehensive docstrings
- Maintain >90% test coverage
- Use conventional commit messages
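As an illustration of these guidelines, here is a small sketch of a function with type hints, a docstring, and an accompanying pytest-style test. `population_density` is a hypothetical example chosen to fit the domain and does not exist in the package.

```python
def population_density(population: int, area_km2: float) -> float:
    """Return inhabitants per square kilometre.

    Args:
        population: Number of residents; must be non-negative.
        area_km2: Land area in square kilometres; must be positive.

    Raises:
        ValueError: If either argument is out of range.
    """
    if population < 0:
        raise ValueError("population must be non-negative")
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return population / area_km2


def test_population_density() -> None:
    """pytest collects functions whose names start with `test_`."""
    assert population_density(1000, 50.0) == 20.0
```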
This project is licensed under the MIT License - see the LICENSE file for details.
- IBGE for providing comprehensive Brazilian statistical data
- Brazilian Government for open data initiatives
- Python Community for excellent data science tools
Bel - Data Engineer & Analyst
- GitHub: @bellDataSc
- LinkedIn: Connect with Bel
- Email: [email protected]