feat: add number of indexed indicators to available datasets tool output #191 #192

Open
Fedir-Yatsenko wants to merge 1 commit into development from
feat/191-add-number-of-indexed-indicators-to-available-datasets-tool-output

Conversation

@Fedir-Yatsenko
Collaborator

Fedir-Yatsenko commented Mar 6, 2026

Applicable issues

Description of changes

Add optional per-dataset and total indicator counts to the available datasets tool response, controlled by the `include_indicator_count` flag (disabled by default) in `AvailableDatasetsDetails`.

  • Add get_size_per_version to VectorStore base and PgVectorStore (single GROUP BY query to avoid N+1)
  • Add get_indicator_counts to ChannelServiceFacade
  • Pass indicator counts through formatters (simple, detailed, list)
  • Add i18n strings for EN and UK locales
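
A minimal sketch of the single-query idea behind `get_size_per_version`, not the PR's actual `PgVectorStore` code. It assumes a hypothetical `documents` table with a `version_id` column and uses the standard-library `sqlite3` module in place of PostgreSQL so that it runs standalone. Returning `0` for version IDs without rows is also an assumption, not something the PR states.

```python
import sqlite3


def get_size_per_version(conn: sqlite3.Connection, version_ids: set[int]) -> dict[int, int]:
    """Count documents per version_id with one GROUP BY query (no per-ID queries)."""
    if not version_ids:
        return {}
    ordered_ids = sorted(version_ids)
    placeholders = ", ".join("?" for _ in ordered_ids)
    rows = conn.execute(
        "SELECT version_id, COUNT(*) FROM documents "
        f"WHERE version_id IN ({placeholders}) GROUP BY version_id",
        ordered_ids,
    ).fetchall()
    # Assumption: versions with no documents are reported as 0 rather than omitted.
    counts = dict.fromkeys(ordered_ids, 0)
    counts.update(rows)
    return counts


# Hypothetical data: three versions, one requested version has no documents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, version_id INTEGER NOT NULL)")
conn.executemany("INSERT INTO documents (version_id) VALUES (?)", [(1,), (1,), (2,), (3,)])
counts = get_size_per_version(conn, {1, 2, 4})
```

One query returns all requested counts at once, whereas calling a per-version `get_size` in a loop would issue one query per dataset.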

Checklist

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

…put #191

Add optional per-dataset and total indicator counts to the available
datasets tool response, controlled by `include_indicator_count` flag
(disabled by default) in `AvailableDatasetsDetails`.

- Add `get_size_per_version` to VectorStore base and PgVectorStore
  (single GROUP BY query to avoid N+1)
- Add `get_indicator_counts` to ChannelServiceFacade
- Pass indicator counts through formatters (simple, detailed, list)
- Add i18n strings for EN and UK locales

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fedir-Yatsenko requested a review from navalnica March 6, 2026 13:14
Fedir-Yatsenko self-assigned this Mar 6, 2026
Fedir-Yatsenko requested a review from ypldan as a code owner March 6, 2026 13:14
Fedir-Yatsenko added the enhancement and python labels Mar 6, 2026
@Fedir-Yatsenko
Collaborator Author

Example output from the Available Datasets tool.

Total datasets: 2
Total providers: 2
Total number of indicators: 1340

Provider: IMF.RES

Total datasets from this provider: 1

  • IMF.RES:WEO - Sample
    • Source ID: IMF.RES:WEO(9.0.0)
    • Description: The World Economic Outlook (WEO) database contains selected macroeconomic data series from the statistical appendix of the World Economic Outlook report, which presents the IMF staff's analysis and projections of economic developments at the global level, in major country groups and in many individual countries. The WEO is released in April and September/October each year. Use this database to find data on national accounts, gross domestic product (GDP), inflation, unemployment rates, balance of payments, fiscal indicators, trade for countries and country groups (aggregates), and commodity prices whose data are reported by the IMF. Data are available from 1980 to the present, and projections are given for the next two years. Additionally, medium-term projections are available for selected indicators. For some countries, data are incomplete or unavailable for certain years.

Provider: IMF.STA

Total datasets from this provider: 1

  • IMF.STA:BOP - Sample
    • Source ID: IMF.STA:BOP(21.0.0)
    • Description: The Balance of Payments (BOP) is a statistical statement that summarizes transactions between residents and nonresidents during a period. It consists of the goods and services account, the primary income account, the secondary income account, the capital account, and the financial account.
    • Provider: IMF.STA
    • Last Updated: Jun 2025
    • URL: https://data.imf.org/en/datasets/IMF.STA:BOP
    • Number of indicators: 1195
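
A rough sketch of how a formatter might gate the indicator lines on the flag. The `AvailableDatasetsDetails` class here is a stand-in that carries only the `include_indicator_count` field named in the description. `format_summary`, the dataset IDs, and the counts are hypothetical and do not come from the PR.

```python
from dataclasses import dataclass


@dataclass
class AvailableDatasetsDetails:
    # Stand-in: only the flag named in the PR description; the real class is not shown here.
    include_indicator_count: bool = False


def format_summary(
    dataset_ids: list[str],
    counts: dict[str, int],
    details: AvailableDatasetsDetails,
) -> str:
    """Build a plain-text summary; indicator lines appear only when the flag is set."""
    lines = [f"Total datasets: {len(dataset_ids)}"]
    if details.include_indicator_count:
        total = sum(counts.get(dataset_id, 0) for dataset_id in dataset_ids)
        lines.append(f"Total number of indicators: {total}")
    for dataset_id in dataset_ids:
        lines.append(f"- {dataset_id}")
        if details.include_indicator_count:
            lines.append(f"  - Number of indicators: {counts.get(dataset_id, 0)}")
    return "\n".join(lines)


# Hypothetical dataset IDs and counts, for illustration only.
ids = ["DS:A", "DS:B"]
sample_counts = {"DS:A": 3, "DS:B": 5}
without_counts = format_summary(ids, sample_counts, AvailableDatasetsDetails())
with_counts = format_summary(ids, sample_counts, AvailableDatasetsDetails(include_indicator_count=True))
```

With the default settings the output is unchanged, which matches the "disabled by default" behavior in the description.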

@Fedir-Yatsenko
Collaborator Author

Fedir-Yatsenko commented Mar 6, 2026

/deploy-review

GitHub actions run: 22765968044
Environment URL: review-environment | pipeline

Comment on lines +575 to +577

    async def get_size_per_version(self, version_ids: set[int]) -> dict[int, int]:
        """Returns the number of documents per version_id in a single query."""
        metadata_model = await self._get_metadata_model()

Contributor

Let's reuse the docstrings from the base class methods and update them for the new function:

    @abstractmethod
    async def get_total_size(self) -> int:
        """Returns the total number of documents in the vector store."""

    @abstractmethod
    async def get_size(self, version_ids: set[int]) -> int:
        """Returns the number of documents in the vector store for the specified version IDs."""

    @abstractmethod
    async def get_size_per_version(self, version_ids: set[int]) -> dict[int, int]:
        """Returns the number of documents grouped by version_id."""

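To illustrate the contract these docstrings describe, here is a small sketch with a trimmed-down base class and a hypothetical `InMemoryVectorStore`. Neither is the repository's actual implementation, and the real `VectorStore` base has more methods than shown.

```python
import asyncio
from abc import ABC, abstractmethod
from collections import Counter


class VectorStore(ABC):
    # Trimmed stand-in for the base class; only the new method is declared here.
    @abstractmethod
    async def get_size_per_version(self, version_ids: set[int]) -> dict[int, int]:
        """Returns the number of documents grouped by version_id."""


class InMemoryVectorStore(VectorStore):
    """Hypothetical store that keeps one version_id per stored document."""

    def __init__(self, document_version_ids: list[int]) -> None:
        self._document_version_ids = document_version_ids

    async def get_size_per_version(self, version_ids: set[int]) -> dict[int, int]:
        # Count only documents whose version_id was requested.
        counts = Counter(v for v in self._document_version_ids if v in version_ids)
        return dict(counts)


store = InMemoryVectorStore([10, 10, 11, 12])
per_version = asyncio.run(store.get_size_per_version({10, 11}))
# The sum over the grouped result should equal what get_size would return for the same IDs.
total = sum(per_version.values())
```
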
Labels

enhancement: New feature or request
python: Pull requests that update python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add number of indexed indicators to available datasets tool output

2 participants