bottleneck identification using instructor and pydantic output models. by bhupatiraju · Pull Request #51 · dime-worldbank/mega-boost

bhupatiraju · 2025-06-26T18:11:43Z

This branch contains the refactored code from the PFM-bottleneck extraction project.

The data_extraction notebook extracts the PEF and PFR docs from the WB docs API, stores a metadata table, and constructs a table of chunks from the chunked document texts. The LLM pipeline reads this table of chunks and works on the chunks from there.
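
To illustrate the chunk table, here is a minimal sketch of how document text could be split into rows. It is not the notebook's actual code: `Chunk`, `chunk_document`, the column names, and the `max_chars` limit are all hypothetical, and paragraph-based splitting is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str    # hypothetical key linking back to the metadata table
    chunk_id: int  # position of the chunk within its document
    text: str


def chunk_document(doc_id: str, text: str, max_chars: int = 1000) -> list[Chunk]:
    """Greedily pack paragraphs into chunks of roughly max_chars.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    chunks: list[Chunk] = []
    buffer = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{buffer}\n\n{paragraph}" if buffer else paragraph
        if len(candidate) > max_chars and buffer:
            # The buffer is full: emit it and start a new chunk with this paragraph.
            chunks.append(Chunk(doc_id, len(chunks), buffer))
            buffer = paragraph
        else:
            buffer = candidate
    if buffer:
        chunks.append(Chunk(doc_id, len(chunks), buffer))
    return chunks
```

Each `Chunk` would correspond to one row in the table of chunks, with `doc_id` joining back to the metadata table.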

The runner (notebook) is the entry point for running the LLM pipeline. The models folder contains the Pydantic models defining the output structure from the LLMs. Currently, it's organized by stage, with the extraction models and the validation models grouped separately. An alternative was to organize the models bottleneck-wise, so that the extraction and validation models for a specific bottleneck stay in one place. For now, since we have only 4 bottlenecks in place, this is the organization I went with.
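
As a rough sketch of what stage-wise output models can look like, the example below defines one extraction model and one validation model. The class and field names are hypothetical and are not taken from the models folder; instructor accepts a Pydantic class like these as the `response_model` argument, so the LLM response comes back as a validated object.

```python
from typing import Optional

from pydantic import BaseModel, Field


class BottleneckExtraction(BaseModel):
    """Extraction stage (hypothetical): does a chunk show evidence of a bottleneck?"""

    bottleneck_id: str = Field(description="Identifier of the bottleneck being checked")
    is_present: bool = Field(description="Whether the chunk shows evidence of the bottleneck")
    evidence: Optional[str] = Field(
        default=None, description="Supporting quote from the chunk, if any"
    )


class BottleneckValidation(BaseModel):
    """Validation stage (hypothetical): does the extracted evidence hold up?"""

    is_valid: bool = Field(description="Whether the evidence supports the bottleneck")
    reasoning: str = Field(description="Short justification for the decision")
```

With this layout, adding a bottleneck means adding models to each stage module; the bottleneck-wise alternative would instead keep both classes for one bottleneck in a single file.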

The construction of the client is separated into the azure_service.py file. The prompts.py file contains the methods that format the required prompts for the extraction and validation stages. Finally, consts contains the LLM model signature and the initial descriptions of the bottlenecks, which are subsequently used in formatting the prompts.
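
The relationship between consts and prompts.py can be sketched as follows. This is an illustration under assumptions, not the code in this branch: the dictionary, its placeholder entry, the function names, and the prompt wording are all hypothetical.

```python
# Hypothetical stand-in for consts: the real bottleneck ids and descriptions
# live in the branch and are not reproduced here.
BOTTLENECK_DESCRIPTIONS = {
    "example_bottleneck": "Placeholder description of a bottleneck.",
}


def format_extraction_prompt(bottleneck_id: str, chunk_text: str) -> str:
    """Build the extraction-stage prompt for one bottleneck and one chunk."""
    description = BOTTLENECK_DESCRIPTIONS[bottleneck_id]
    return (
        f"Bottleneck definition: {description}\n\n"
        f"Text:\n{chunk_text}\n\n"
        "Does the text contain evidence of this bottleneck?"
    )


def format_validation_prompt(bottleneck_id: str, chunk_text: str, evidence: str) -> str:
    """Build the validation-stage prompt that re-checks extracted evidence."""
    description = BOTTLENECK_DESCRIPTIONS[bottleneck_id]
    return (
        f"Bottleneck definition: {description}\n\n"
        f"Text:\n{chunk_text}\n\n"
        f"Extracted evidence: {evidence}\n\n"
        "Does the extracted evidence support the bottleneck definition?"
    )
```

Keeping the descriptions in one place means a wording change to a bottleneck definition propagates to both stages without touching the prompt functions.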

yukinko-iwasaki · 2025-06-26T19:59:31Z

This is great! Thanks for sharing your code!
Should we create a separate repo for this? It looks unrelated to our boost project.
I could request to create a new repo under dime-worldbank.

bottleneck identification using instructor and pydantic output models.

f510868

bhupatiraju requested review from weilu and yukinko-iwasaki June 26, 2025 18:11

temp

20dba13

bhupatiraju wants to merge 2 commits into main from PFM-bottlenecks
