Welcome! I'm a Cloud Data Engineer specializing in Python automation, web scraping, and high-volume data extraction. I build production-ready pipelines, interactive bots, and user interfaces to streamline data workflows.
- Languages & Frameworks: Python, SQL, FastAPI, Streamlit
- Data Engineering: ETL/ELT workflows, pandas
- Web Scraping & Automation: Playwright, Selenium, Scrapy, BeautifulSoup, Requests
- Anti-Bot & CAPTCHA Handling: 2Captcha, custom scripts, dynamic challenge bypass
- Bot Development & GUI Applications: Interactive bots and lightweight UIs using Streamlit and Flask
- API Integration: REST APIs, OpenAI, DeepSeek, DeepInfra, OAuth2
- PDF & Document Parsing: PyMuPDF, pdfplumber, OpenAI Vision
- Orchestration & Scheduling: Apache Airflow, GitHub Actions, cron, bash scripts
- Cloud Platforms:
  - AWS: S3, EC2
  - Azure: Blob Storage
  - GCP: BigQuery
- Output Formats: Excel, CSV, Parquet, JSON, Google Sheets
- ETL/ELT Pipeline Development: End-to-end data workflows, scheduling & monitoring
- Bot Development & GUI Applications: Build interactive bots and user-friendly interfaces for data extraction with Streamlit or Flask
- Web Scraping & Data Extraction: Large-scale, reliable extraction from websites, APIs & documents
- Advanced CAPTCHA & Anti-Bot Solutions: Bypass dynamic CAPTCHA and anti-bot measures for consistent data access
- API Integration: Connect & automate data from any REST endpoint
- Custom API Development: Deploy robust data services with Flask
- PDF & Text Extraction: Structured parsing from complex or multilingual files
- Cloud Automation: Orchestrate workflows with Airflow, Data Factory or GitHub Actions
- U.S. Legislative Data Pipeline (2024–2025)
  Scraped 30+ state legislature sites (web & PDF), scheduled via Airflow, outputs to Excel.
- AI-Powered Bots & Interfaces
  Developed custom automation bots and Streamlit dashboards for end-user data insights.
- Real-Time Data Connectors
  Built scheduled connectors for product, real estate & news APIs, transforming raw feeds for ML.
- E-commerce Scrapers
  Extracted Amazon, Walmart & vendor data at scale, delivering clean datasets for BI.
Let's connect if you need robust, production-ready data engineering, automation, and interactive solutions for your AI, analytics, or business projects!