    Repositories list

    • MinerU

      Public
      Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
      Python
      GNU Affero General Public License v3.0
      5k60k1595 Updated Apr 16, 2026
    • labelU

      Public
      Data annotation toolbox that supports image, audio, and video data.
      Python
      Apache License 2.0
      1711.5k410 Updated Apr 16, 2026
    • Python
      Apache License 2.0
      35310 Updated Apr 16, 2026
    • Agent-native knowledge engine with MCP tools for document indexing, wiki organization, fast retrieval and deep reading across PDF/DOCX/PPTX/Markdown
      TypeScript
      MIT License
      3332620 Updated Apr 16, 2026
    • mineru-vl-utils

      Public
      A Python package for interacting with the MinerU Vision-Language Model.
      Python
      Apache License 2.0
      3111300 Updated Apr 15, 2026
    • .github

      Public
      2100 Updated Apr 14, 2026
    • datasets resource
      1513640 Updated Apr 14, 2026
    • Vis3

      Public
      Data browser based on S3: a visualization tool for data stored on S3 (json / jsonl / parquet / html / md, etc.). 👇 Try online.
      TypeScript
      Apache License 2.0
      138400 Updated Apr 14, 2026
    • [CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
      Python
      Apache License 2.0
      1701.7k1207 Updated Apr 10, 2026
    • WebMainBench is a high-precision benchmark for evaluating web main content extraction.
      Python
      Apache License 2.0
      101510 Updated Apr 3, 2026
    • Earth-Agent

      Public
      [ICLR 2026] The official implementation of the paper “Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents”
      Python
      MIT License
      16125100 Updated Apr 2, 2026
    • A diffusion-based framework for document OCR that replaces autoregressive decoding with block-level parallel diffusion decoding.
      Python
      MIT License
      3455241 Updated Mar 31, 2026
    • MinerU-HTML: An SLM-powered HTML main content extractor that outputs clean HTML bodies. Perfect for Deep Research Agents, RAG applications, and training data ge…
      Python
      Apache License 2.0
      2423410 Updated Mar 27, 2026
    • HTML
      Apache License 2.0
      1500 Updated Mar 25, 2026
    • VHM

      Public
      VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
      Python
      Apache License 2.0
      811400 Updated Mar 25, 2026
    • Python
      Other
      12510 Updated Mar 24, 2026
    • Data annotation component library, provided as NPM packages
      TypeScript
      Apache License 2.0
      4714721 Updated Mar 18, 2026
    • LOKI

      Public
      [ICLR 2025 Spotlight] The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models”
      Python
      417730 Updated Feb 7, 2026
    • TRivia

      Public
      TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition
      Python
      Apache License 2.0
      43030 Updated Feb 5, 2026
    • HTML
      0000 Updated Feb 2, 2026
    • Python
      912030 Updated Jan 15, 2026
    • rdkit

      Public
      A forked repo of the official RDKit library
      HTML
      BSD 3-Clause "New" or "Revised" License
      1k000 Updated Jan 7, 2026
    • OHR-Bench

      Public
      (ICCV 2025) OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
      Python
      139600 Updated Dec 3, 2025
    • 🕶️ A curated list of awesome things related to MinerU
      Python
      MIT License
      1710 Updated Nov 14, 2025
    • LEGION

      Public
      [ICCV25 Highlight] The official implementation of the paper "LEGION: Learning to Ground and Explain for Synthetic Image Detection"
      Python
      67790 Updated Oct 22, 2025
    • [ICCV 2025] The official implementation of the paper “Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm”
      Python
      Apache License 2.0
      68390 Updated Oct 17, 2025
    • UniMERNet

      Public
      UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
      Python
      Apache License 2.0
      40466362 Updated Sep 28, 2025
    • FakeVLM

      Public
      [NeurIPS 2025 🔥] FakeVLM: Advancing Synthetic Image Detection through Explainable Multimodal Models and Fine-Grained Artifact Analysis
      Python
      912970 Updated Sep 24, 2025
    • [ACL 2025 Best Theme Paper] This is the official implementation for the paper: "Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language …
      Python
      1419200 Updated Aug 29, 2025
    • Python
      Apache License 2.0
      01110 Updated Aug 20, 2025