SCHOL-273: Run /chat API tests in local builds #1025

Open
jdohan wants to merge 46 commits into main from
SCHOL-273/assistant-api-tests-setup
jdohan commented Mar 25, 2026

Changes summary

Enables running tests for the /chat API endpoint against a fresh local dockerized build by introducing two scripts to generate and seed the local Postgres database with FRBR graph data associated with two known editions.
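
As a rough illustration of the seeding idea only (this is not the script in this PR), the sketch below inserts parent rows before child rows so foreign keys resolve, and makes re-running the seed a no-op. The table names, column names, and sample rows are all hypothetical, and an in-memory SQLite database stands in for the local Postgres instance so the example is self-contained.

```python
import sqlite3

# Hypothetical seed layout: parent tables come before their children so that
# foreign keys always resolve. Table and column names are illustrative only.
SEED_ORDER = ["works", "editions", "items"]

SCHEMA = """
CREATE TABLE works (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE editions (
    id INTEGER PRIMARY KEY,
    work_id INTEGER NOT NULL REFERENCES works(id)
);
CREATE TABLE items (
    id INTEGER PRIMARY KEY,
    edition_id INTEGER NOT NULL REFERENCES editions(id)
);
"""

# Made-up sample data: one work with two editions, one item per edition.
SAMPLE_SEED = {
    "works": [{"id": 1, "title": "Example Work"}],
    "editions": [{"id": 10, "work_id": 1}, {"id": 11, "work_id": 1}],
    "items": [{"id": 100, "edition_id": 10}, {"id": 101, "edition_id": 11}],
}


def make_connection():
    """Create an in-memory stand-in database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def seed(conn, seed_data):
    """Insert seed rows parent-first; rows that already exist are skipped."""
    for table in SEED_ORDER:
        for row in seed_data.get(table, []):
            columns = ", ".join(row)
            placeholders = ", ".join("?" for _ in row)
            # OR IGNORE is SQLite syntax; against Postgres the equivalent
            # would be INSERT ... ON CONFLICT DO NOTHING.
            conn.execute(
                f"INSERT OR IGNORE INTO {table} ({columns}) VALUES ({placeholders})",
                tuple(row.values()),
            )
    conn.commit()


def count_rows(conn, table):
    """Return the number of rows currently in the given table."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

Calling `seed(conn, SAMPLE_SEED)` twice leaves the same rows in place, which is the property that matters when the seed step may run against a database that is already populated. The table and column names come from trusted, committed seed data here; they should not be interpolated into SQL from untrusted input.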

Only two editions are seeded for now because they are always returned for the prompt currently used in the corresponding /chat smoke tests. Because /chat responses are non-deterministic, follow-up work that wires hybrid search to a dedicated testing namespace in the vector DB for CI builds (and local builds, if desired) should further reduce the potential for test flakiness.

Tests supported

The /chat API tests exist in this feature branch, but they were not added to it directly and they do not yet exist in main (see #947) because of setup blockers. Those blockers were resolved by NOREF/aws-auth-docker-compose, which pulled in the tests along with the crucial setup. This branch was merged with that one, so this PR includes the tests themselves and the ability to run them in CI.

Fetching data for the local Postgres DB

The script that generates the seed data has been committed to the dev-scripts folder along with these changes so that it can be reused to add more seed data as the test suite scales. If a different location for this script is desired (e.g. within the test suite), please indicate so.

Question:
Is there anything sensitive in the current seed data file? If so, it can instead be generated fresh each time in CI and cleaned up afterward. However, that introduces an external dependency during test setup (i.e. reading from the production Postgres DB) and increases run times.

Tests workflow

A new step has been added to the tests workflow to handle local seeding before the VRA integration tests are run.

Question:
Is there a different ordering desired for this step?

Re: ordering

The functional test run in CI was moved so that it runs after the integration tests. A failure in the functional tests had been ending the job before the impact of the changes in immediate scope could be observed in CI. That failure no longer occurs as of the latest commit, for reasons not yet identified.

Is there any objection to keeping the test run ordering this way?

  1. Unit tests
  2. Integration tests for DRB (some of which may no longer be relevant once VRA goes live)
  3. Integration tests for VRA
    • /chat API tests
  4. Functional tests

Functional tests are usually run after integration tests in a typical layered testing approach, but there is value in running them first to fail faster in CI if there is something awry.
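
The trade-off behind this ordering can be sketched with a small fail-fast runner. This is a generic illustration, not the CI workflow itself: the stage names mirror the list above, and the pass/fail callables are placeholders for the real test commands.

```python
from typing import Callable, List, Tuple

# A stage is a display name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> Tuple[List[str], bool]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that actually ran and whether all of
    them passed. Stages after a failure never run, so their results are
    not visible for that job.
    """
    executed: List[str] = []
    for name, run in stages:
        executed.append(name)
        if not run():
            return executed, False
    return executed, True


def example_order(functional_passes: bool) -> List[Stage]:
    """Build the proposed ordering with placeholder stage outcomes."""
    return [
        ("unit", lambda: True),
        ("integration-drb", lambda: True),
        ("integration-vra", lambda: True),
        ("functional", lambda: functional_passes),
    ]
```

With functional tests last, a functional failure still lets every integration stage report its result. Moving functional tests first would end the job sooner on such a failure, at the cost of hiding the integration results for that run.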

Documentation

The README has been updated to include instructions on using this new alternate method of seeding a local DB.

How to test

Start by running make down-clean from etl-pipeline/ to wipe any existing Docker containers, networks, volumes, etc., since the goal of these changes is to ensure VRA non-unit tests can run in CI against freshly-built local dockerized services.

Then run the commands given in the docstring at the top of seed_frbr_data.py, followed by make integration-vra (which defaults to local when ENVIRONMENT is not specified), while the local dockerized services are up.

Ensure all tests pass locally and in CI.

Perhaps more importantly, inspect the /chat response yielded from the test query to verify it is well-formed. Reviewers added to this PR are more familiar with that response than I am currently. Note that while logs may indicate hits on editions other than the two with associated FRBR data, only those two are returned as results.
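
A manual inspection along these lines could be backed by a small check like the one below. The response shape is an assumption made purely for illustration (a top-level `results` list whose entries carry an `edition_id`); the real /chat payload may be structured differently, and the edition IDs shown are placeholders.

```python
def check_chat_response(payload, seeded_edition_ids):
    """Return a list of problems found in a /chat-style payload.

    Assumes a hypothetical shape: {"results": [{"edition_id": ...}, ...]}.
    An empty list means the payload matched the seeded editions exactly.
    """
    results = payload.get("results") if isinstance(payload, dict) else None
    if not isinstance(results, list):
        return ["payload has no 'results' list"]

    problems = []
    returned = set()
    for index, result in enumerate(results):
        if not isinstance(result, dict) or "edition_id" not in result:
            problems.append(f"result {index} has no 'edition_id'")
            continue
        returned.add(result["edition_id"])

    expected = set(seeded_edition_ids)
    unexpected = returned - expected
    missing = expected - returned
    if unexpected:
        problems.append(f"unexpected editions: {sorted(unexpected)}")
    if missing:
        problems.append(f"missing editions: {sorted(missing)}")
    return problems
```

Comparing sets rather than ordered lists keeps the check insensitive to result ordering, which helps when the response itself is non-deterministic.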

jdohan and others added 30 commits February 20, 2026 15:35
ONLY for backend/etl-pipeline!
- allow refresh of tokens in docker container

- remove dummy aws env vars

- add google api key for docker env

- update create vra user/password fixture
- document
- improve error clarity

frbrized_record_data
- read post ETL pipeline values by record.source_id
to prevent confusion in cases where the same record
data is inserted multiple times into the DB, resulting
in multiple Records with the same source_id
vercel bot commented Mar 25, 2026

Project | Deployment | Actions | Updated (UTC)
digital-research-books | Ready | Preview, Comment | Mar 31, 2026 9:11pm


jdohan changed the title from "SCHOL-273: Seed local database for VRA API tests" to "SCHOL-273: Run /chat API tests in CI" on Mar 26, 2026
jdohan changed the title from "SCHOL-273: Run /chat API tests in CI" to "SCHOL-273: Run /chat API tests in local builds" on Mar 26, 2026