    Repositories list

    • This repository contains the implementation of AutoSchemaKG, a novel framework for automatic knowledge graph construction that combines schema generation via conceptualization.
      Python
      8365540 Updated Jan 5, 2026
    • NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
      Python
      1913210 Updated Dec 15, 2025
    • AtlasKV

      Public
      AtlasKV: A scalable, effective, and general way to augment LLMs with billion-scale knowledge graphs using very little GPU memory cost.
      Python
      31210 Updated Dec 1, 2025
    • MarConf

      Public
      [ACL 2025] Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?
      Python
      1801 Updated Nov 25, 2025
    • Python
      12300 Updated Nov 17, 2025
    • privacy

      Public
      HTML
      0200 Updated Nov 17, 2025
    • CritiCal

      Public
      Code for CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?
      Python
      0610 Updated Nov 15, 2025
    • MARS

      Public
      Code and dataset for the paper: MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset (https://arxiv.org/pdf/2406.02106).
      Python
      0600 Updated Nov 10, 2025
    • [EMNLP 2025] From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
      3628600 Updated Nov 5, 2025
    • DARK

      Public
      Code for DARK: Unifying Deductive and Abductive Reasoning in Knowledge Graphs with Masked Diffusion Model
      Python
      0310 Updated Oct 13, 2025
    • [ACL 2024] Implementation for Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation
      Python
      11500 Updated Oct 9, 2025
    • [EMNLP 2025 Wordplay] LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game
      Python
      0200 Updated Oct 4, 2025
    • Official Repository for MASLegalBench.
      Python
      0000 Updated Sep 30, 2025
    • MCIP

      Public
      Python
      21010 Updated Sep 29, 2025
    • InteGround

      Public
      Python
      0000 Updated Sep 20, 2025
    • Official Repository for Context Reasoner.
      Python
      0700 Updated Sep 1, 2025
    • MarPT

      Public
      Code for Prospect Theory Fails for LLMs: Instability of Decision-Making under Epistemic Uncertainty
      Python
      0110 Updated Aug 11, 2025
    • CEQA

      Public
      Official Implementation of paper: Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints
      Python
      11120 Updated Jul 15, 2025
    • FedNGDB

      Public
      Python
      1000 Updated Jul 6, 2025
    • TEGA

      Public
      [ACL 2025] Enhancing Transformers for Generalizable First-Order Logical Entailment
      Python
      0100 Updated May 29, 2025
    • ConKE

      Public
      0110 Updated May 28, 2025
    • Python
      0200 Updated May 28, 2025
    • Source code and data for paper "Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations".
      Python
      0100 Updated May 27, 2025
    • CtrlHGen

      Public
      Python
      0200 Updated May 27, 2025
    • [ACL 2025] KnowShiftQA: How Robust are RAG Systems when Textbook Knowledge Shifts in K-12 Education?
      Jupyter Notebook
      0100 Updated May 25, 2025
    • Python
      0000 Updated May 25, 2025
    • The official repo of the paper "MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly"
      Python
      12000 Updated May 16, 2025
    • Jupyter Notebook
      42000 Updated Apr 23, 2025
    • [TMLR'25] The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning
      Python
      95300 Updated Apr 14, 2025
    • PipeNet

      Public
      Code for PipeNet: Question Answering with Semantic Pruning over Knowledge Graphs
      Python
      0100 Updated Mar 27, 2025