Embeddings-Similarity Rating (ESR)

A Python package implementing the Embeddings-Similarity Rating methodology for converting LLM textual responses to Likert scale probability distributions using semantic similarity against reference statements.

Overview

The ESR methodology addresses the challenge of mapping rich textual responses from Large Language Models (LLMs) to structured Likert scale ratings. Instead of forcing a single numerical rating, ESR preserves the inherent uncertainty and nuance in textual responses by generating probability distributions over all possible Likert scale points.

This package provides a distilled, reusable implementation of the ESR methodology described in the paper "Measuring Synthetic Consumer Purchase Intent Using Embeddings-Similarity Ratings" by Maier & Aslak (2025).

Installation

Local Development

To install this package locally for development, run:

pip install -e .

From GitHub Repository

To install this package into your own project from GitHub, run:

pip install git+https://github.com/pymc-labs/embeddings-similarity-rating.git

Quick Start

import polars as po
import numpy as np
from embeddings_similarity_rating import ResponseRater

# Create example reference sentences dataframe
reference_set_1 = [
    "Strongly disagree",
    "Disagree",
    "Neutral",
    "Agree",
    "Strongly agree",
]
reference_set_2 = [
    "Disagree a lot",
    "Kinda disagree",
    "Don't know",
    "Kinda agree",
    "Agree a lot",
]
df = po.DataFrame(
    {
        "id": ["set1"] * 5 + ["set2"] * 5,
        "int_response": [1, 2, 3, 4, 5] * 2,
        "sentence": reference_set_1 + reference_set_2,
    }
)

# Initialize rater
rater = ResponseRater(df)

# Create some example synthetic consumer responses
llm_responses = ["I totally agree", "Not sure about this", "Completely disagree"]

# Get PMFs for synthetic consumer responses
pmfs = rater.get_response_pmfs(
    reference_set_id="set1",      # Reference set to score against, or "mean"
    llm_responses=llm_responses,  # List of LLM responses to score
    temperature=1.0,              # Temperature for scaling the PMF
    epsilon=0.0,                  # Small regularization parameter to prevent division by zero and add smoothing
)

# Get survey response PMF
survey_pmf = rater.get_survey_response_pmf(pmfs)

print(survey_pmf)
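
A PMF over Likert points is often summarized as an expected rating. The sketch below assumes the PMF is a one-dimensional sequence of probabilities, one per scale point, ordered from 1 upward; the return type of get_survey_response_pmf() is not documented here, so treat that as an assumption. The helper expected_rating and the values in example_pmf are hypothetical, made up for illustration and not produced by the package.

```python
import numpy as np


def expected_rating(pmf, scale_points=None):
    """Mean Likert rating implied by a PMF (hypothetical helper, not package API)."""
    pmf = np.asarray(pmf, dtype=float)
    if scale_points is None:
        # Assume the scale points are 1, 2, ..., K in the same order as the PMF.
        scale_points = np.arange(1, pmf.size + 1)
    return float(np.dot(scale_points, pmf))


# Illustrative probabilities only; replace with the PMF returned by the rater.
example_pmf = [0.05, 0.10, 0.20, 0.40, 0.25]
mean_rating = expected_rating(example_pmf)
print(mean_rating)
```

If survey_pmf is indeed such an array, expected_rating(survey_pmf) would give the mean rating for the survey.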

Methodology

The ESR methodology works by:

1. Defining reference statements for each Likert scale point
2. Computing cosine similarities between LLM response embeddings and reference statement embeddings
3. Converting similarities to probability distributions using minimum similarity subtraction and normalization
4. Optionally applying temperature scaling for distribution control
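
The steps above can be sketched in NumPy as follows. This is a minimal illustration, not the package's implementation: the function names are hypothetical, epsilon is assumed to be added uniformly to every scale point, and temperature scaling is assumed to raise each probability to the power 1/T before renormalizing. The package may differ on both points.

```python
import numpy as np


def cosine_similarities(response_vec, reference_matrix):
    """Cosine similarity between one response vector and each reference row."""
    response_vec = np.asarray(response_vec, dtype=float)
    reference_matrix = np.asarray(reference_matrix, dtype=float)
    response_unit = response_vec / np.linalg.norm(response_vec)
    reference_unit = reference_matrix / np.linalg.norm(
        reference_matrix, axis=1, keepdims=True
    )
    return reference_unit @ response_unit


def similarities_to_pmf(similarities, temperature=1.0, epsilon=0.0):
    """Turn similarities into a PMF (hypothetical helper, not package API)."""
    sims = np.asarray(similarities, dtype=float)
    # Minimum similarity subtraction: the least similar point gets weight 0.
    shifted = sims - sims.min()
    # Assumed smoothing: add epsilon to every point.
    shifted = shifted + epsilon
    total = shifted.sum()
    if total == 0.0:
        # All similarities equal and epsilon is 0: fall back to uniform.
        return np.full(sims.shape, 1.0 / sims.size)
    pmf = shifted / total
    # Assumed temperature scaling: p ** (1 / T), then renormalize.
    scaled = pmf ** (1.0 / temperature)
    return scaled / scaled.sum()


# Toy 2-D "embeddings", chosen only to make the arithmetic easy to follow.
references = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
response = np.array([1.0, 0.0])
sims = cosine_similarities(response, references)
pmf = similarities_to_pmf(sims, temperature=1.0, epsilon=0.0)
print(pmf)
```

Under these assumptions, a temperature below 1 sharpens the distribution toward the most similar reference statement, while a temperature above 1 flattens it.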

Core Components

ResponseRater: Main class implementing the ESR methodology
get_response_pmfs(): Convert LLM responses to PMFs using the specified reference set

Citation

Maier, B. F., & Aslak, U. (2025). Measuring Synthetic Consumer Purchase Intent Using Embeddings-Similarity Ratings.

License

MIT License
