zakir0101/README.md

Zakir - Full Stack Developer & AI/OCR Specialist

👋 About Me

I'm Zakir, a full stack developer specializing in document intelligence, OCR (Optical Character Recognition), PDF processing, and AI integration. I build production-ready systems for educational technology and document automation, with expertise in synthetic data generation, multi-backend OCR pipelines, and custom PDF parsing engines.

Core Expertise:

  • Document Intelligence: OCR systems, PDF processing, synthetic data generation
  • AI Integration: Model fine-tuning (LoRA), model serving (vLLM), Gemini API, DeepSeek OCR
  • Full Stack Development: Python/Flask backends, React frontends, Android apps
  • Production Systems: GPU orchestration, Docker, scalable architecture

🚀 Featured Projects

1. Synthetic Exam Generator

Technologies: Python, React, DeepSeek OCR, Mathpix, Gemini API, Hugging Face Hub

CLI system that parses OCR results from exam papers and generates synthetic training data with ground-truth bounding boxes. Built on a multi-backend OCR pipeline with intelligent routing.
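As a rough illustration of why synthetic pages come with ground truth for free: when the generator places the text itself, it already knows every bounding box. The sketch below is not the project's actual code. The `Box` class, the `layout_lines` function, and the fixed-width font metrics (`char_width`, `line_height`) are all hypothetical simplifications.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Box:
    """Hypothetical ground-truth record: the text and where it was placed."""
    text: str
    x: int
    y: int
    width: int
    height: int


def layout_lines(lines, margin=50, char_width=10, line_height=24, line_gap=8):
    """Place each line below the previous one and record its bounding box.

    Assumes a fixed-width font, so a line's width is len(text) * char_width.
    """
    boxes = []
    y = margin
    for text in lines:
        boxes.append(Box(text, margin, y, len(text) * char_width, line_height))
        y += line_height + line_gap
    return boxes


boxes = layout_lines(["1. Solve 2x + 3 = 7", "2. Define entropy"])
# Serialize the boxes so they can be stored next to the rendered page image.
annotations = json.dumps([asdict(b) for b in boxes], indent=2)
```

A real generator would measure text with the actual font and render the page to an image, but the principle is the same: the annotation is a by-product of the layout step rather than something labeled afterwards.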

2. Multi-Backend OCR System

Technologies: Python, Flask, React, Shell Scripting, GPU Orchestration

Production-ready OCR system running DeepSeek-OCR and MinerU simultaneously on dedicated GPUs. Features intelligent backend selection, parallel processing, and real-time performance monitoring.
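One simple way to picture backend selection is a least-loaded policy: send each page to whichever backend has the shortest queue. This is a minimal sketch under that assumption, not the system's actual routing logic. The `Backend` class, its fields, and the `choose_backend` and `dispatch` functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical record describing one OCR backend pinned to one GPU."""
    name: str
    gpu_id: int
    pending_jobs: int = 0


def choose_backend(backends: list[Backend]) -> Backend:
    """Return the backend with the fewest queued jobs (ties go to the first listed)."""
    if not backends:
        raise ValueError("no OCR backends configured")
    return min(backends, key=lambda b: b.pending_jobs)


def dispatch(backends: list[Backend], page_ids: list[str]) -> dict[str, str]:
    """Assign each page to a backend, updating queue lengths as we go."""
    assignment = {}
    for page_id in page_ids:
        backend = choose_backend(backends)
        backend.pending_jobs += 1
        assignment[page_id] = backend.name
    return assignment


backends = [
    Backend("deepseek-ocr", gpu_id=0),
    Backend("mineru", gpu_id=1, pending_jobs=1),
]
plan = dispatch(backends, ["p1", "p2", "p3"])
```

A production router would likely also weigh page content and measured latency. Queue length alone is only the simplest signal to balance on.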

3. Custom PDF Parsing Engine

Technologies: Python, Cairo Graphics, FreeType, Tkinter, Regex Tokenization

PDF processing engine built from scratch with a custom regex tokenizer and a graphics state machine. Supports Type1/TrueType/Type0 fonts, hierarchical document structure analysis, and question detection.
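To give a feel for regex tokenization of a PDF content stream, here is a minimal sketch. It is not the engine's actual tokenizer: the `Token` type, the `TOKEN_RE` pattern, and the `tokenize` function are illustrative names, and the grammar is a deliberately small subset (no nested parentheses in strings, no `<< >>` dictionaries, no inline images).

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Simplified token patterns for a PDF content stream (illustrative subset only).
TOKEN_RE = re.compile(
    r"""
    (?P<comment>%[^\r\n]*)
  | (?P<string>\((?:\\.|[^\\()])*\))        # literal string, no nested parens
  | (?P<hexstring><[0-9A-Fa-f\s]*>)
  | (?P<name>/[^\s/\[\]()<>{}%]*)
  | (?P<number>[+-]?(?:\d+\.\d*|\.\d+|\d+))
  | (?P<array_open>\[)
  | (?P<array_close>\])
  | (?P<operator>[A-Za-z'"*][A-Za-z0-9*]*)
  | (?P<space>\s+)
    """,
    re.VERBOSE,
)


def tokenize(stream: str) -> Iterator[Token]:
    """Yield tokens in order, skipping whitespace and comments."""
    pos = 0
    while pos < len(stream):
        match = TOKEN_RE.match(stream, pos)
        if match is None:
            raise ValueError(f"unexpected character {stream[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup not in ("space", "comment"):
            yield Token(match.lastgroup, match.group())


tokens = list(tokenize("BT /F1 12 Tf 72 720 Td (Hello) Tj ET"))
```

In a content stream, operands precede their operator, so a graphics state machine can push operand tokens onto a stack and act whenever an `operator` token arrives (for example, `Tf` consumes the font name and size pushed just before it).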

4. AI-Powered Exam Generator

Technologies: React, TypeScript, Tailwind CSS, Gemini API, PDF.js, JSZip

Client-side React application that generates exam questions from study materials using the Gemini API. Features PDF upload, intelligent question generation, and export functionality.

5. Fine-Tuning DeepSeek OCR

Technologies: Python, LoRA, vLLM, Hugging Face Transformers, Shell Scripting

Fine-tuning DeepSeek's OCR models using LoRA for parameter-efficient adaptation. Includes training pipelines, evaluation scripts, and model serving with vLLM.

6. Task & Consequence Android App

Technologies: Java, Android SDK, SQLite, MVVM, Material Design 3

Android productivity app for managing tasks within structured programs using a punishment system. Features goal tracking, habit formation, and progress analytics.

🔧 Technical Skills

Frontend Development

  • JavaScript (90%), TypeScript (85%), Vue.js (80%), HTML5 (95%), CSS3 (92%), React (75%)

Backend Development

  • Python (95%), Flask (82%), Java (85%), Node.js (78%), REST APIs (90%), SQL (85%)

Mobile Development

  • Android Studio (80%), Java (85%), Kotlin (75%), React Native (70%)

AI & OCR Technologies

  • DeepSeek OCR (85%), Mathpix, Gemini API, LoRA fine-tuning, vLLM serving
  • Synthetic data generation, ground-truth bounding boxes, data augmentation
  • Hugging Face Hub integration, model deployment, GPU acceleration

Tools & DevOps

  • Git (92%), Neovim (85%), VS Code (90%), Docker (75%), Shell scripting
  • GPU orchestration, CI/CD pipelines, performance optimization

📊 Document Intelligence Capabilities

OCR & Text Recognition

  • Advanced optical character recognition with multiple backends (DeepSeek OCR, Mathpix, Gemini)
  • GPU acceleration and intelligent routing
  • Handwritten text recognition and mathematical equation parsing
  • DeepSeek OCR fine-tuning and multi-backend orchestration

PDF Processing Engine

  • Custom PDF parsing and rendering built from scratch
  • Cairo graphics and FreeType font rendering (Type1/TrueType/Type0)
  • Custom regex tokenization and document structure analysis
  • Question detection systems and hierarchical parsing

Synthetic Data Generation

  • Generating synthetic training data for OCR models
  • Ground-truth bounding boxes mimicking real-world exam papers
  • Data augmentation pipelines and Hugging Face Hub integration
  • Synthetic exam generation for model training

AI Integration & Model Fine-tuning

  • Fine-tuning vision-language models using LoRA for parameter-efficient adaptation
  • vLLM serving and Gemini API integration
  • GPU acceleration and real-time processing
  • Custom model development and deployment

🎯 Real-World Applications

Educational Technology

  • Exam paper processing and automated question generation
  • Tutoring platforms and learning management systems
  • Student assessment and progress tracking

Document Automation

  • Intelligent document processing and form extraction
  • Contract analysis and archival digitization
  • Automated data entry and validation

Research & Development

  • OCR model development and synthetic data research
  • Document intelligence algorithms and open-source tools
  • AI integration research and implementation

🌐 Portfolio Website

Live Demonstration

This portfolio website is deployed on GitHub Pages and showcases my projects, skills, and expertise:

🔗 Live Site: https://zakir0101.github.io/zakir0101/

Website Features

  • Document Intelligence Expertise Section: Dedicated showcase of OCR, PDF processing, and AI integration capabilities
  • Interactive Project Cards: Real GitHub API integration showing live repository data
  • AI Chat Assistant: Voice-enabled chatbot with pre-programmed responses about my projects
  • Skills Visualization: Animated progress bars for technical proficiencies
  • Responsive Design: Mobile-first approach with dark/light theme toggle

Technical Implementation

  • Frontend: HTML5, CSS3 with CSS variables, JavaScript ES6+
  • Interactivity: GitHub REST API integration, Web Speech API for voice recognition
  • Performance: Optimized animations, lazy loading, efficient rendering
  • Theming: CSS variables for dark/light modes with system preference detection

📫 Connect With Me

📈 Continuous Learning

I'm continuously advancing my skills in:

  • Advanced OCR model fine-tuning and optimization
  • Scalable document processing pipelines
  • AI integration for educational technology
  • Production system architecture and DevOps

"Building intelligent document processing systems and AI-powered applications for the future of education and automation."

Last Updated: December 2025

Pinned

  1. python-pdf-parser-renderer (Public)

    Python 1

  2. synthetic-exam-gerator (Public)

    Python

  3. zakir0101 (Public)

    Config files for my GitHub profile.

    JavaScript