Entity Resolution Specifications

Formal software contract, shared data models, sample messages, and compliance tests required for integrating new Entity Resolution Engines (EREs) into the system.

Note: Active development continues in the OP-TED repository: https://github.com/OP-TED/entity-resolution-spec

Requirements

  • UNIX-compatible environment (Linux/macOS/WSL2)
  • Make
  • Python 3.12+
  • Poetry (for dependency management)
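A quick way to confirm these prerequisites before running `make install` is a small preflight script. The sketch below is not part of the repository; the function name `check_requirements` and the report keys are illustrative assumptions, and it only checks that the tools are on `PATH`, not their versions.

```python
import shutil
import sys

# Minimum interpreter version listed in the requirements above.
MIN_PYTHON = (3, 12)
# Command-line tools the Makefile workflow relies on.
REQUIRED_TOOLS = ("make", "poetry")


def check_requirements() -> dict[str, bool]:
    """Return a report mapping each requirement to whether it is satisfied."""
    report = {"python>=3.12": sys.version_info >= MIN_PYTHON}
    for tool in REQUIRED_TOOLS:
        # shutil.which returns None when the executable is not found on PATH.
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        status = "ok" if ok else "MISSING"
        print(f"{status:8} {name}")
```

The function reports rather than raises, so it can be run on a partially configured machine to see everything that is missing at once.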

Quick Start

make install      # install dependencies via Poetry
make all          # generate all models, schemas, and documentation

Make targets overview

  • install: install dependencies via Poetry
  • all: generate all models, schemas, and documentation
  • generate-models: regenerate Pydantic models and JSON Schema from LinkML
  • generate-docs: regenerate documentation
  • lint: run ruff linter on source code
  • lint-schema: run LinkML linter on YAML schemas
  • clean: remove all generated artifacts

Installation

To get started, you need a UNIX-compatible environment (Linux/macOS/WSL2) with Make, Python 3.12+, and Poetry. Then set up your environment with:

make install

This will install the necessary user dependencies in a Poetry-managed virtual environment.

Development

This project uses principles of model-driven development (MDD) and domain-driven design (DDD). The core models are defined in the resources/schemas directory using LinkML, and the Python (Pydantic) models are generated from these specifications.

Generated Python models are in src/erspec/models. Regenerate them with:

make all

This regenerates both the LinkML-based models (Pydantic models and JSON Schema) and the navigable documentation. See the Makefile for more granular targets, such as generate-models and generate-docs.

Running and Testing

TODO: this section will be added in the future. For now, this repository contains specifications only and has no runnable unit tests.

Test data

Deduplicated notices

This repository contains manually deduplicated organizations and procedures taken from RDF tender notices. The deduplication was done using fuzzy string matching, followed by manual checking of the results.
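To illustrate the general idea of fuzzy string matching followed by manual review, here is a minimal sketch. It is not the procedure used to produce this test data: the choice of Python's standard-library `difflib`, the `0.85` threshold, the helper names, and the sample organization names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not affect the score."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_duplicates(
    names: list[str], threshold: float = 0.85
) -> list[tuple[str, str, float]]:
    """Return name pairs scoring at or above the threshold, best match first."""
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    # Made-up names, not taken from the repository's test data.
    sample = [
        "Ministry of Finance",
        "ministry of  finance",
        "Ministry of Finances",
        "City of Ghent",
    ]
    for a, b, score in candidate_duplicates(sample):
        print(f"{score:.3f}  {a!r} <-> {b!r}")
```

The output is a list of candidate pairs only; a person would still confirm or reject each pair, which corresponds to the manual checking step described above.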

Details here

Documentation Overview

Documentation resources for understanding the model, architecture, and interfaces:

Model Schema Docs

See docs/schema/README.md — canonical data model and service schema documentation generated from the ERS–ERE definitions.

Architectural Diagrams

See docs/architecture/diagrams/README.md — prescribed architectural diagrams illustrating system structure and components.

Sequence Diagrams (Mermaid)

See docs/architecture/sequence_diagrams/README.md — Mermaid-format sequence diagrams describing key system interactions.

Informative Interface Sequence

See docs/ere-interface-seq-diag.md — informative sequence overview for ERS–ERE interactions. Note: the ERS–ERE contract is the normative specification; this file is provided for additional context.
