Motiva

From the Greek μοτίβα, meaning patterns, or the recognition of similar features between objects.

This is a scoped-down reimplementation of Yente and nomenklatura, used to match entities against sanctions lists.

Most of the algorithms are taken directly from those repositories and simply reimplemented, so the credit should go to the Open Sanctions team.

Note that this piece of software requires Yente to run beside it, including Elasticsearch and a valid, licensed collection of datasets obtained from Open Sanctions.

Work in progress

Scope and goals

Not all of Yente is going to be implemented here. Notably, none of the index update features are going to make their way into this repository. We will focus on the request part (search and matching).

Even though we will strive to produce matching scores in the vicinity of those of Yente, exact scores are not a goal. In particular, the Rust implementations of some algorithms will produce slightly different results, resulting in different overall scores. This is, for example, the case of the algorithm transliterating scripts into Latin, which does not use libicu by default, and might therefore produce slightly different results ^[1].

All implemented algorithms will feature an integration test that compares Motiva's score with Yente's and checks that they are within a reasonable epsilon of each other.
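As a rough illustration of the kind of check such a test performs, the sketch below compares two scores within a tolerance. The helper name `within_epsilon`, the sample scores and the `0.01` tolerance are assumptions chosen for illustration; they are not taken from Motiva's test suite.

```rust
/// Returns true when two scores differ by at most `epsilon`.
/// Hypothetical helper, not part of Motiva's actual test suite.
fn within_epsilon(motiva: f64, yente: f64, epsilon: f64) -> bool {
    (motiva - yente).abs() <= epsilon
}

fn main() {
    // Made-up scores and tolerance, for illustration only.
    let (motiva_score, yente_score) = (0.912, 0.915);
    let epsilon = 0.01;

    assert!(within_epsilon(motiva_score, yente_score, epsilon));
    println!("scores agree within {epsilon}");
}
```

Comparing with an absolute tolerance, rather than testing for equality, is what lets small differences between implementations (such as transliteration) pass while still catching real regressions.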

If at all possible, this project will try to use only Rust-native dependencies, and stay clear of integrating with C libraries through FFI ^[2].

Some liberty was taken to adapt some logic and algorithms from Yente, so do not expect fully-compliant API or behavior.

^[1]: Motiva can be compiled with the icu feature to use the same transliteration library as Yente. This requires the libicu development headers and shared libraries.

^[2]: With the default features configuration.

Implementation matrix

^[1]: Features that are disabled by default were omitted for now.

Yente version compatibility

Before v0.5.0, Motiva is only compatible with data indexed with Yente v4.x. Starting with v0.5.0, it will try to determine, at startup, which version of Yente was used to index the data (v4.x or v5.x), and adapt its queries accordingly.

Configuration

Motiva is configured via environment variables. The following variables are supported:

| Variable | Description | Default / Example |
| --- | --- | --- |
| `ENV` | Environment (`dev` or `production`) | `dev` |
| `LISTEN_ADDR` | Address to bind the API server | `0.0.0.0:8000` |
| `API_KEY` | Bearer token used to authenticate requests | (none) |
| `INDEX_URL` | Elasticsearch URL | `http://localhost:9200` |
| `INDEX_AUTH_METHOD` | Elasticsearch authentication (`none`, `basic`, `bearer`, `api_key`, `encoded_api_key`) | `none` |
| `INDEX_CLIENT_ID` | Elasticsearch client ID (required for `basic` or `api_key`) | (none) |
| `INDEX_CLIENT_SECRET` | Elasticsearch client secret (required for `basic`, `api_key` or `encoded_api_key`) | (none) |
| `INDEX_TLS_CA_CERT` | Path to a PEM-encoded certificate chain to use for TLS validation | (none) |
| `INDEX_TLS_SKIP_VERIFY` | If `1`, do not validate the TLS certificate served by the Elasticsearch cluster | `0` |
| `INDEX_NAME` | Index prefix under which data was indexed (suffixed by `-entities`) | `yente` |
| `MANIFEST_URL` | Optional URL to a custom manifest JSON file | (none) |
| `CATALOG_REFRESH_INTERVAL` | Interval at which to pull the manifest and catalogs | `1h` |
| `MATCH_CANDIDATES` | Number of candidates to consider for matching | `10` |
| `ENABLE_PROMETHEUS` | Enable Prometheus metrics collection and the `/metrics` endpoint | `0` |
| `ENABLE_TRACING` | Set to `1` to enable tracing | (none) |
| `TRACING_EXPORTER` | Tracing exporter kind (`otlp`, or `gcp` if compiled with the `gcp` feature) | `otlp` |
| `REQUEST_TIMEOUT` | Maximum duration for a match request | `10s` |
| `SCOPED_INDEX_QUERY` | Query used to scope down the index used for match queries | see the Scoped index section |

Setting MANIFEST_URL is required if you use a customized dataset list and would like your own manifest to be used for catalog generation. If omitted, the default manifest provided by Yente will be used. The value must be either an HTTP URL or a local file path ending in .json, .yml or .yaml.
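To make the "environment variable with a default" pattern concrete, here is a minimal sketch that reads a handful of the variables listed above and falls back to their documented defaults. The `Settings` struct and `load_settings` function are hypothetical names introduced for this example, and the choice to fall back silently on an unparsable number is an assumption; Motiva's own configuration code may be organized and behave differently.

```rust
use std::env;

/// A small subset of the documented settings.
/// Hypothetical struct: Motiva's real configuration type may look different.
#[derive(Debug, PartialEq)]
struct Settings {
    listen_addr: String,
    index_url: String,
    index_name: String,
    match_candidates: usize,
    api_key: Option<String>,
}

/// Builds settings from a lookup function, falling back to the documented defaults.
/// Taking a closure instead of reading the environment directly keeps this testable.
fn load_settings(get: impl Fn(&str) -> Option<String>) -> Settings {
    Settings {
        listen_addr: get("LISTEN_ADDR").unwrap_or_else(|| "0.0.0.0:8000".to_string()),
        index_url: get("INDEX_URL").unwrap_or_else(|| "http://localhost:9200".to_string()),
        index_name: get("INDEX_NAME").unwrap_or_else(|| "yente".to_string()),
        // Assumption: an unparsable value falls back to the default here;
        // a real server might prefer to reject it at startup.
        match_candidates: get("MATCH_CANDIDATES")
            .and_then(|v| v.parse().ok())
            .unwrap_or(10),
        // No default: the variable is simply absent when unset.
        api_key: get("API_KEY"),
    }
}

fn main() {
    // Read from the process environment.
    let settings = load_settings(|name| env::var(name).ok());
    println!("{settings:?}");
}
```

Running it with, say, `MATCH_CANDIDATES=25` set in the environment changes only that field and leaves the others at their defaults.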

Motiva-specific features

Query options passed in body

Some query parameters that are unbounded in size can be passed in the request body instead of through the URL query string. This prevents parameters that take unbounded lists from overflowing the maximum length of URLs. Namely, you can now pass the following parameters in the body:

include_dataset
exclude_dataset
exclude_entity_ids

The match endpoint body now takes a params object at its root:

{
  "queries": [...],
  "params": {
    "include_datasets": [...],
    "exclude_datasets": [...],
    "exclude_entity_ids": [...]
  }
}
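The sketch below shows one way a client could assemble such a body using only the standard library. The key names follow the JSON sample above; the helper names (`json_string`, `json_array`, `match_body`), the dataset name and the entity IDs are placeholders invented for this example. In a real client, a JSON library such as serde_json would likely be a more robust choice than manual string building.

```rust
/// Renders a string as a JSON string literal, escaping quotes,
/// backslashes and control characters.
fn json_string(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for c in value.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Renders a slice of strings as a JSON array.
fn json_array(values: &[&str]) -> String {
    let items: Vec<String> = values.iter().map(|v| json_string(v)).collect();
    format!("[{}]", items.join(","))
}

/// Assembles a match request body. `queries_json` is assumed to already be
/// valid JSON; it is inserted as-is.
fn match_body(queries_json: &str, include: &[&str], exclude: &[&str], exclude_ids: &[&str]) -> String {
    format!(
        r#"{{"queries":{},"params":{{"include_datasets":{},"exclude_datasets":{},"exclude_entity_ids":{}}}}}"#,
        queries_json,
        json_array(include),
        json_array(exclude),
        json_array(exclude_ids)
    )
}

fn main() {
    let queries = r#"{"test":{"schema":"Person","properties":{"name":["Vladimir Putin"]}}}"#;
    // The dataset name and entity IDs below are placeholders.
    let body = match_body(queries, &["sanctions"], &[], &["example-id-1", "example-id-2"]);
    println!("{body}");
}
```

Because the lists travel in the body, their length is no longer constrained by URL size limits, which is the point of the feature described above.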

Scoped index

Motiva supports generating and using a trimmed-down index for match queries, while keeping the full index for entity relation queries. This can improve the performance of match queries if you are only interested in a subset of the data, while keeping the full datasets for queries that are less time-sensitive.

For example, you could have a search index that only contains Person entities that have sanction in their topics, while keeping the full index to retrieve details of an entity, enriched with all its relations. Depending on the query you use for the scoped index, you could see a significant reduction in latency and resource consumption.

Motiva can be run with the create-scoped-index subcommand, which will take care of creating the scoped index and its aliases. Once it is done, restarting motiva will make it effective.

$ motiva create-scoped-index
2026-03-05T16:56:14.439865Z  INFO libmotiva::index::elastic::scoped: found previous scoped index index="motiva-w4xgo6jh"
2026-03-05T16:56:14.546981Z  INFO libmotiva::index::elastic::scoped: created new index, starting reindexing data index="motiva-9xtyeclx"
2026-03-05T16:56:24.030717Z  INFO libmotiva::index::elastic::scoped: reindexed data index="motiva-9xtyeclx"
2026-03-05T16:56:24.041981Z  INFO libmotiva::index::elastic::scoped: atomically swapped index from="motiva-w4xgo6jh" to="motiva-9xtyeclx"
2026-03-05T16:56:24.071765Z  INFO libmotiva::index::elastic::scoped: deleted old index index="motiva-w4xgo6jh"

The default scoped query is listed below, but can be customized through SCOPED_INDEX_QUERY.

{
  "bool": {
    "must": [
      { "terms": { "schema": [ "Person", "LegalEntity", "Organization", "Company", "Airplane", "Vessel" ] } },
      { "term": { "topics": "sanction" } }
    ]
  }
}

The scoped index is not automatically kept in sync with the full index; you need to run motiva create-scoped-index again whenever you want to update it. We suggest running it after your regular indexing operations.

Once your scoped index is created, you can perform a /match request with the Motiva-specific ?index_type=scoped parameter for the new index to be used.
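As a small illustration, the following sketch builds the URL for a match request with or without the scoped index. The `match_url` helper is hypothetical, and the base address assumes the port from the default `LISTEN_ADDR`; adjust both to your deployment.

```rust
/// Builds the URL of a match request against `dataset`,
/// optionally targeting the scoped index.
/// Hypothetical helper written for this example.
fn match_url(base: &str, dataset: &str, scoped: bool) -> String {
    // Avoid a double slash if the base URL ends with one.
    let mut url = format!("{}/match/{}", base.trim_end_matches('/'), dataset);
    if scoped {
        url.push_str("?index_type=scoped");
    }
    url
}

fn main() {
    // Assumed local address; "sanctions" is an example dataset name.
    let base = "http://127.0.0.1:8000";
    println!("{}", match_url(base, "sanctions", false));
    println!("{}", match_url(base, "sanctions", true));
}
```

Leaving the parameter off keeps the request on the full index, so the same client can choose per request which index to query.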

Run

$ cargo run --release
$ echo '{"queries":{"test":{"schema":"Person","properties":{"name":["Vladimir Putin"]}}}}' | curl -XPOST 127.0.0.1:8000/match/sanctions -H content-type:application/json -d @-

Development

Cloning

$ git clone --recurse-submodules git@github.com:apognu/motiva.git
$ cd motiva

Building

# Standard build
$ cargo build
# Build with libicu support (requires libicu-dev)
$ cargo build --release --features icu
# Build with GCP tracing support
$ cargo build --release --features gcp

Docker

Pre-built images are available in this repository's packages section, at ghcr.io/apognu/motiva, for each combination of features. Alternatively, you can build the image yourself:

# Build without libicu
$ docker build -t motiva .
# Build with GCP tracing support (no libicu)
$ docker build --build-arg CARGO_ARGS="--features gcp" -t motiva:gcp .
# Build with libicu support
$ docker build --build-arg BASE=icu --build-arg CARGO_ARGS="--features icu" -t motiva:icu .

Test suite

To run the tests, a Python 3.13+ environment must be set up with the required dependencies (this includes libicu). You can install them in a virtualenv by running uv from the root of this repository:

$ uv sync
$ cargo test

One quite lengthy test is ignored by default: it scores the Cartesian product of 50x50 entities against each other and compares the results against nomenklatura. You can still run it with cargo test -- --include-ignored.

Contributing

Motiva is a work in progress.

Contributions and feedback are welcome! Please familiarize yourself with the CONTRIBUTING.md guidelines beforehand.
