rohit-ashva900/README.md

Rohit Ashva

🚀 Data Engineer | ETL/ELT & Data Extraction | Automation & Bot Development | FastAPI

Welcome! I’m a Cloud Data Engineer specializing in Python automation, web scraping, and high-volume data extraction. I build production-ready pipelines, interactive bots, and user interfaces to streamline data workflows.


🛠️ Skills & Technologies

  • Languages & Frameworks: Python 🐍, SQL 🗄️, FastAPI 🌐, Streamlit ⚡
  • Data Engineering: ETL/ELT workflows, pandas 🐼
  • Web Scraping & Automation: Playwright 🎭, Selenium 🕷️, Scrapy 🧱, BeautifulSoup 🍜, Requests 🔗
  • Anti-Bot & CAPTCHA Handling: 2Captcha, custom scripts, dynamic challenge bypass
  • Bot Development & GUI Applications: Interactive bots and lightweight UIs using Streamlit and Flask
  • API Integration: REST APIs, OpenAI, DeepSeek, DeepInfra, OAuth2
  • PDF & Document Parsing: PyMuPDF 📄, pdfplumber, OpenAI Vision
  • Orchestration & Scheduling: Apache Airflow 🌬️, GitHub Actions ⏱️, cron 🕑, bash scripts
  • Cloud Platforms:
    • AWS: S3 🪣, EC2 🖥️
    • Azure: Blob Storage ☁️
    • GCP: BigQuery
  • Output Formats: Excel, CSV, Parquet, JSON, Google Sheets

💼 Services I Offer

  • 🔄 ETL/ELT Pipeline Development – End-to-end data workflows, scheduling & monitoring
  • 🤖 Bot Development & GUI Applications – Build interactive bots and user-friendly interfaces for data extraction with Streamlit or Flask
  • 🔍 Web Scraping & Data Extraction – Large-scale, reliable extraction from websites, APIs & documents
  • 🛡️ Advanced CAPTCHA & Anti-Bot Solutions – Bypass dynamic CAPTCHA and anti-bot measures for consistent data access
  • 🔗 API Integration – Connect & automate data from any REST endpoint
  • ⚙️ Custom API Development – Deploy robust data services with Flask
  • 📄 PDF & Text Extraction – Structured parsing from complex or multilingual files
  • ☁️ Cloud Automation – Orchestrate workflows with Airflow, Data Factory or GitHub Actions

📂 Featured Projects

  • 📄 U.S. Legislative Data Pipeline (2024–2025)
    Scraped 30+ state legislature sites (web & PDF), with runs scheduled via Airflow and output delivered as Excel.

  • 🤖 AI-Powered Bots & Interfaces
    Developed custom automation bots and Streamlit dashboards for end-user data insights.

  • 📊 Real-Time Data Connectors
    Built scheduled connectors for product, real estate & news APIs, transforming raw feeds for ML.

  • 🛒 E-commerce Scrapers
    Extracted Amazon, Walmart & vendor data at scale, delivering clean datasets for BI.
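
As a rough illustration of the extract-transform-load pattern behind projects like the connectors above, here is a minimal sketch using only the Python standard library. It is not code from any of the listed projects: the feed shape, the field names (`id`, `title`, `price`), and the `extract`, `transform`, and `load_csv` helpers are all hypothetical.

```python
import csv
import io
import json

# Hypothetical raw feed; the field names are illustrative assumptions,
# not the schema of any real API mentioned in this README.
RAW_FEED = json.dumps([
    {"id": 1, "title": " Widget ", "price": "19.99"},
    {"id": 2, "title": "Gadget", "price": None},
    {"id": 1, "title": " Widget ", "price": "19.99"},
])


def extract(raw: str) -> list[dict]:
    """Parse the raw JSON payload into a list of records."""
    return json.loads(raw)


def transform(records: list[dict]) -> list[dict]:
    """Drop duplicate ids and rows without a price, then normalize types."""
    seen = set()
    clean = []
    for rec in records:
        if rec["id"] in seen or rec["price"] is None:
            continue
        seen.add(rec["id"])
        clean.append({
            "id": rec["id"],
            "title": rec["title"].strip(),
            "price": float(rec["price"]),
        })
    return clean


def load_csv(rows: list[dict]) -> str:
    """Serialize cleaned rows to CSV text (a stand-in for a file or bucket write)."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["id", "title", "price"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(load_csv(transform(extract(RAW_FEED))))
```

In a scheduled setup, each of the three functions would typically become its own task so that a failed load can be retried without re-fetching the feed.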


Let’s connect if you need robust, production-ready data engineering, automation, and interactive solutions for your AI, analytics, or business projects!

Pinned

  1. Dynamic_ETL_Pipeline_Project_with_Azure (Public, Python)
  2. etl_titanic (Public, Jupyter Notebook)
  3. ZomatoBangaloreRestaurants (Public, Jupyter Notebook)
  4. weather_data_airflow_etl (Public, Python)
  5. analyse-e-commerce-data-with-power-BI (Public)
  6. tokyo-olympic-azure-data-engineering-project (Public, Jupyter Notebook; forked from darshilparmar/tokyo-olympic-azure-data-engineering-project)