    Repositories list

    • MarPT

      Public
      Code for Prospect Theory Fails for LLMs: Instability of Decision-Making under Epistemic Uncertainty
      Python
      0410 Updated Mar 24, 2026
    • Source code and data for paper "Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations".
      Python
      Apache License 2.0
      0100 Updated Mar 19, 2026
    • The official repository of the paper "Do Reasoning Models Enhance Embedding Models?"
      Python
      MIT License
      32800 Updated Mar 5, 2026
    • NAACL

      Public
      The official codebase for our paper "NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems"
      Python
      MIT License
      12310 Updated Feb 28, 2026
    • [ICLR2026] NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents
      Python
      MIT License
      2014510 Updated Feb 27, 2026
    • NGDBench

      Public
      Python
      0300 Updated Feb 25, 2026
    • DARK

      Public
      Code for DARK: Unifying Deductive and Abductive Reasoning in Knowledge Graphs with Masked Diffusion Model
      Python
      MIT License
      0400 Updated Feb 11, 2026
    • CtrlHGen

      Public
      Python
      0410 Updated Feb 11, 2026
    • AtlasKV

      Public
      [ICLR'26] AtlasKV: A scalable, effective, and general way to augment LLMs with billion-scale knowledge graphs using very little GPU memory cost.
      Python
      MIT License
      32010 Updated Jan 27, 2026
    • Python
      0200 Updated Jan 18, 2026
    • This repository contains the implementation of AutoSchemaKG, a novel framework for automatic knowledge graph construction that combines schema generation via co…
      Python
      MIT License
      9671980 Updated Jan 14, 2026
    • MarConf

      Public
      [ACL 2025] Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?
      Python
      1801 Updated Nov 25, 2025
    • Python
      MIT License
      32910 Updated Nov 17, 2025
    • privacy

      Public
      HTML
      0200 Updated Nov 17, 2025
    • CritiCal

      Public
      Code for CritiCal: Can Critique Help LLM Uncertainty or Confidence Calibration?
      Python
      0510 Updated Nov 15, 2025
    • MARS

      Public
      Code and dataset for the paper: MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset (https://arxiv.o…
      Python
      MIT License
      0600 Updated Nov 10, 2025
    • [EMNLP2025] From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery
      MIT License
      4032504 Updated Nov 5, 2025
    • [ACL 2024] Implementation for Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation
      Python
      11500 Updated Oct 9, 2025
    • [EMNLP 2025 Wordplay] LLM-Hanabi: Evaluating Multi-Agent Gameplays with Theory-of-Mind and Rationale Inference in Imperfect Information Collaboration Game
      Python
      0200 Updated Oct 4, 2025
    • Official Repository for MASLegalBench.
      Python
      MIT License
      0000 Updated Sep 30, 2025
    • MCIP

      Public
      Python
      MIT License
      21210 Updated Sep 29, 2025
    • Python
      0000 Updated Sep 20, 2025
    • Official Repository for Context Reasoner.
      Python
      MIT License
      0900 Updated Sep 1, 2025
    • CEQA

      Public
      Official Implementation of paper: Complex Query Answering on Eventuality Knowledge Graph with Implicit Logical Constraints
      Python
      MIT License
      11220 Updated Jul 15, 2025
    • FedNGDB

      Public
      Python
      1000 Updated Jul 6, 2025
    • TEGA

      Public
      [ACL 2025] Enhancing Transformers for Generalizable First-Order Logical Entailment
      Python
      MIT License
      0200 Updated May 29, 2025
    • ConKE

      Public
      0110 Updated May 28, 2025
    • Python
      0200 Updated May 28, 2025
    • [ACL 2025] KnowShiftQA: How Robust are RAG Systems when Textbook Knowledge Shifts in K-12 Education?
      Jupyter Notebook
      MIT License
      0200 Updated May 25, 2025
    • Python
      0000 Updated May 25, 2025