Seeing and Knowing in the Wild: Open-domain Visual Entity Recognition with Large-scale Knowledge Graphs via Contrastive Learning

Purpose of the project

This software is a research prototype, developed solely for and published as part of the paper "Seeing and Knowing in the Wild: Open-domain Visual Entity Recognition with Large-scale Knowledge Graphs via Contrastive Learning", which has been accepted at AAAI 2026. This repository provides the official implementation of KnowCoL, the Knowledge-guided Contrastive Learning framework presented in the paper.

Introduction

KnowCoL integrates visual features, text queries, and structured knowledge (e.g., Wikidata relations and Wikipedia descriptions) into a shared semantic space, enabling strong zero-shot generalization and robust disambiguation. The approach significantly improves recognition accuracy for rare and unseen entities while remaining lightweight compared to generative baselines.
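
As a rough illustration of what aligning inputs in a shared semantic space involves, the sketch below computes a generic InfoNCE-style contrastive loss in plain Python. It is not the KnowCoL objective (see the paper and the knowcol/ package for that); the function names, the temperature value, and the toy vectors are assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(queries, entities, temperature=0.07):
    """Generic InfoNCE loss: query i should match entity i among all entities.

    `queries` and `entities` are equally long lists of embedding vectors.
    The temperature is an illustrative default, not a value from the paper.
    """
    q = [l2_normalize(v) for v in queries]
    e = [l2_normalize(v) for v in entities]
    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i to every entity, scaled by temperature.
        logits = [sum(a * b for a, b in zip(qi, ej)) / temperature for ej in e]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching entity as the target.
        total += log_denom - logits[i]
    return total / len(q)


# Toy embeddings: aligned pairs give a low loss, swapped pairs a high one.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is small when each query embedding lies closest to its own entity embedding and grows when another entity is closer, which is the behaviour a shared embedding space is trained to exploit.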

Requirements

Create and activate your environment

conda env create -f environment.yaml
conda activate my-env

Dataset

Download the OVEN dataset from HuggingFace here.
Download the Wikidata subgraph for the OVEN benchmark here.
Download the Wikipedia knowledge base for the OVEN benchmark here.

Place the downloaded data under the directory expected by the datamodule, e.g.:

dataset/
├── oven_data/               # Processed OVEN annotations
├── oven_images/             # Image files associated with OVEN
├── test_data/               # Test split for evaluation
├── wikidata_subgraph_v1/    # Extracted Wikidata subgraph
│   ├── entity.txt           # List of entity IDs and labels
│   ├── relation.txt         # List of relation IDs and names
│   ├── triplet_h.jsonl      # Head-anchored knowledge graph triples
│   └── triplet_t.jsonl      # Tail-anchored knowledge graph triples
└── knowledge_base/
    ├── wikipedia_images_full/       # Lead images from Wikipedia
    ├── Wiki6M_ver_1_1.jsonl         # Image paths of the entities
    └── wikidata_relation_1_1.jsonl  # Text descriptions of the entities
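
Before starting a training run, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_paths` and the default root `dataset` are assumptions, and the path list simply mirrors the example tree.

```python
from pathlib import Path

# Relative paths copied from the example directory tree above.
EXPECTED_PATHS = [
    "oven_data",
    "oven_images",
    "test_data",
    "wikidata_subgraph_v1/entity.txt",
    "wikidata_subgraph_v1/relation.txt",
    "wikidata_subgraph_v1/triplet_h.jsonl",
    "wikidata_subgraph_v1/triplet_t.jsonl",
    "knowledge_base/wikipedia_images_full",
    "knowledge_base/Wiki6M_ver_1_1.jsonl",
    "knowledge_base/wikidata_relation_1_1.jsonl",
]


def missing_dataset_paths(root="dataset"):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling `missing_dataset_paths("dataset")` returns an empty list when every expected file and folder is present, and otherwise lists what still needs to be downloaded or moved.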

Training

python3 knowcol/training.py

Config options:

model.beta1: hyperparameter beta1
model.beta2: hyperparameter beta2
datamodule.batch_size: batch size for training
trainer.max_epochs: number of epochs to train
...
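
The dotted option names above suggest that values can be overridden on the command line as `key=value` pairs, as Hydra-style configuration allows. This README does not confirm that, so treat it as an assumption. Under that assumption, the sketch below assembles such a command; the helper `build_train_command` and all values shown are placeholders, not recommended settings.

```python
import shlex


def build_train_command(overrides, script="knowcol/training.py"):
    """Build an argument list with one `key=value` override per option.

    Assumes the training script accepts Hydra-style overrides; check the
    files under conf/ before relying on this.
    """
    args = ["python3", script]
    args += [f"{key}={value}" for key, value in overrides.items()]
    return args


# Placeholder values for illustration only.
cmd = build_train_command({
    "model.beta1": 0.5,
    "model.beta2": 0.5,
    "datamodule.batch_size": 64,
    "trainer.max_epochs": 10,
})
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when a value contains spaces.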

Testing

python3 knowcol/evaluations/oven_eval.py

Specify the checkpoint and model in knowcol/evaluations/oven_eval.py before running the evaluation.

Reference

If you find this work interesting, please consider citing:

@article{Zhou_Halilaj_Monka_Schmid_Zhu_Wu_Nazer_Staab_2026,
    title={Seeing and Knowing in the Wild: Open-domain Visual Entity Recognition with Large-scale Knowledge Graphs via Contrastive Learning},
    volume={40},
    url={https://ojs.aaai.org/index.php/AAAI/article/view/38370},
    DOI={10.1609/aaai.v40i16.38370},
    number={16},
    journal={Proceedings of the AAAI Conference on Artificial Intelligence},
    author={Zhou, Hongkuan and Halilaj, Lavdim and Monka, Sebastian and Schmid, Stefan and Zhu, Yuqicheng and Wu, Jingcheng and Nazer, Nadeem and Staab, Steffen},
    year={2026},
    month={Mar.},
    pages={13638-13646}
}
