Professional Python project implementing Data Analysis Practice
Data Analysis Practice is a production-grade Python application, complemented by an R script, that showcases modern software engineering practices, including clean architecture, comprehensive testing, containerized deployment, and CI/CD readiness.
The codebase comprises 375 lines of source code organized across 3 modules, following industry best practices for maintainability, scalability, and code quality.
- 🔄 Data Pipeline: Scalable ETL with parallel processing
- ✅ Data Validation: Schema validation and quality checks
- 📊 Monitoring: Pipeline health metrics and alerting
- 🔧 Configurability: YAML/JSON-based pipeline configuration
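As a rough illustration of how the four features above could fit together, here is a minimal sketch using only the standard library. It is not the project's actual code: the config keys, the `validate`, `transform`, and `run_pipeline` helpers, and the sample records are all hypothetical and exist only for this example.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical JSON pipeline configuration; the keys are illustrative only.
CONFIG_JSON = """
{
  "workers": 2,
  "chunk_size": 2,
  "schema": {"major": "str", "median_salary": "float"}
}
"""

# Maps the type names used in the config to Python types.
TYPE_NAMES = {"str": str, "float": (int, float), "int": int}


def validate(record, schema):
    """Return a list of problems found in one record (empty if it is valid)."""
    problems = []
    for field, type_name in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], TYPE_NAMES[type_name]):
            problems.append(f"wrong type for {field}")
    return problems


def transform(chunk):
    """Toy transform step: normalize major names to title case."""
    return [{**row, "major": row["major"].title()} for row in chunk]


def run_pipeline(records, config):
    """Validate records, then transform the valid ones chunk by chunk in parallel."""
    schema = config["schema"]
    valid = [r for r in records if not validate(r, schema)]
    rejected = len(records) - len(valid)

    size = config["chunk_size"]
    chunks = [valid[i:i + size] for i in range(0, len(valid), size)]

    # executor.map preserves the input order of the chunks.
    with ThreadPoolExecutor(max_workers=config["workers"]) as executor:
        results = list(executor.map(transform, chunks))

    rows = [row for chunk in results for row in chunk]
    # Simple health metrics, in the spirit of the monitoring bullet.
    metrics = {"received": len(records), "loaded": len(rows), "rejected": rejected}
    return rows, metrics


if __name__ == "__main__":
    config = json.loads(CONFIG_JSON)
    # Made-up sample records, not taken from the project's dataset.
    sample = [
        {"major": "petroleum engineering", "median_salary": 1.0},
        {"major": "statistics", "median_salary": 2},
        {"major": "zoology"},  # missing median_salary, so it is rejected
    ]
    rows, metrics = run_pipeline(sample, config)
    print(rows)
    print(metrics)
```

Threads are used here only to keep the sketch short; CPU-bound transforms would more likely call for `ProcessPoolExecutor`, which exposes the same `map` interface.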
graph TB
subgraph Core["🏗️ Core"]
A[Main Module]
B[Business Logic]
C[Data Processing]
end
subgraph Support["🔧 Support"]
D[Configuration]
E[Utilities]
F[Tests]
end
A --> B --> C
D --> A
E --> B
F -.-> B
style Core fill:#e1f5fe
style Support fill:#f3e5f5
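The dependency flow in the diagram above can be read as a chain of plain function calls. The sketch below is one hypothetical way to lay that out in Python; the names `Config`, `clean`, `summarize`, `analyze`, and `main` are assumptions made for this example and do not come from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """'Configuration' box: settings handed to the main module."""
    min_rows: int = 1


def clean(values):
    """'Utilities' box: helper used by the business logic."""
    return [v for v in values if v is not None]


def summarize(values):
    """'Data Processing' box: last step of the A --> B --> C chain."""
    return {"count": len(values), "mean": sum(values) / len(values)}


def analyze(values, config):
    """'Business Logic' box: applies the rules, then delegates to processing."""
    cleaned = clean(values)
    if len(cleaned) < config.min_rows:
        raise ValueError("not enough rows to analyze")
    return summarize(cleaned)


def main(values, config=Config()):
    """'Main Module' box: entry point that receives the configuration."""
    return analyze(values, config)


def test_analyze_skips_missing_values():
    """'Tests' box: exercises the business logic directly (F -.-> B)."""
    assert analyze([1.0, None, 3.0], Config()) == {"count": 2, "mean": 2.0}
```

Keeping each box as a small function with explicit arguments means the test can call `analyze` without going through `main`, which matches the dotted edge from Tests to Business Logic.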
- Python 3.12+
- pip (Python package manager)
# Clone the repository
git clone https://github.com/galafis/Data-analysis-practice.git
cd Data-analysis-practice
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Run the application
python src/main.py
Data-analysis-practice/
├── tests/ # Test suite
│ ├── __init__.py
│ └── test_main.py
├── LICENSE
├── README.md
├── analyze_association_fixed.py
├── college_major_analysis.R
└── create_college_dataset.py
| Technology | Description | Role |
|---|---|---|
| Python | Core Language | Primary |
| R | 1 file | Supporting |
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the project
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Gabriel Demetrios Lafis
- GitHub: @galafis
- LinkedIn: Gabriel Demetrios Lafis