@kmoscoe (Contributor) commented Jan 29, 2026

This PR also:

  • Adds a definition of topics to the glossary
  • Cleans up an obsolete glossary entry

Staged at https://bullie.svl.corp.google.com:4000

kmoscoe and others added 30 commits June 11, 2025 19:07
@gemini-code-assist

Summary of Changes

Hello @kmoscoe, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Data Commons Resolve API by extending its functionality beyond geographical places to include statistical variables and topics. It introduces new parameters and a dedicated method in the Python client library, along with corresponding updates to the REST API documentation. The changes aim to provide developers with more powerful and flexible tools for identifying and retrieving specific data entities within the Data Commons knowledge graph, while also improving the overall clarity and accuracy of the documentation.

Highlights

  • Expanded Resolve API Capabilities: The Resolve API now supports resolving statistical variables and topics in addition to place entities, offering more comprehensive data retrieval.
  • New Python API Method: A new fetch_indicators method has been introduced in the Python API for directly resolving statistical variables and topics.
  • Updated Python fetch Method: The existing fetch method in the Python API has been enhanced with resolver and target parameters, allowing for more flexible entity resolution, including statistical variables and topics.
  • REST API Enhancements: The REST API's resolve endpoint has been updated to include resolver and target parameters, aligning its functionality with the new Python API capabilities.
  • Comprehensive Documentation for Resolve API: Detailed explanations of new parameters, updated response structures, and new examples have been added for both Python and REST Resolve APIs to reflect the expanded functionality.
  • Glossary Updates: A new definition for 'Topic' has been added to the glossary, and an obsolete 'Cohort' entry has been removed for clarity.
  • API Key Requirements Clarification: The documentation for API key requirements now explicitly includes Data Commons MCP server requests.
  • Python Node API Method Renamed: The fetch_place_place_descendants method in the Python Node API has been corrected to fetch_place_descendants.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request does a great job of documenting the new parameters for the Resolve API, especially the ability to resolve statistical variables and topics. The addition of the 'Topic' definition to the glossary and the cleanup of the obsolete 'Cohort' entry are also valuable improvements. I've identified a few minor inconsistencies in the documentation for both the Python and REST APIs, mainly regarding parameter and field types not matching the examples. Addressing these will enhance the clarity and accuracy of the documentation for developers.

kmoscoe and others added 2 commits January 29, 2026 15:06
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@kmoscoe kmoscoe requested review from clincoln8 and keyurva January 29, 2026 23:09

@clincoln8 clincoln8 left a comment

Thanks Kara! Left a few nits, but looks great overall!

You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables containing the string "population".

> **Note**: Currently, this endpoint only supports [place](/glossary.html#place) entities.
Note that you can only resolve entities by some terminal properties. You cannot resolve properties that represent linked entities with incoming or outgoing arc relationships. For that, you need to use the [Node](node.md) API. For example, if you wanted to get all the DCIDs of entities that are related to a given entity by the `containedInPlace` property (say, all states in the United States), use the Node API.

I think that, given the way our endpoint is currently written, we don't really need this note.
I believe the vision was that you'd be able to supply any property in the `property=<-{property}->dcid` request param and resolve on it. That would warrant this note about using only terminal properties. However, that logic is not implemented, and we make it very explicit that only "description" is valid for indicators and "description|wikidataId|geoCoordinates" is valid for places.
So this note seems unnecessary for now and could be added back if we ever support resolving based on generic properties.
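
The constraint described in this comment can be sketched as a small lookup table. This is only an illustration of the rule as stated here ("description" for indicators; "description", "wikidataId", or "geoCoordinates" for places). The name `is_supported_property` and the table itself are hypothetical and are not part of any Data Commons library.

```python
# Hypothetical helper: encodes the rule stated in the review comment,
# not actual Data Commons server or client code.
SUPPORTED_PROPERTIES = {
    "place": {"description", "wikidataId", "geoCoordinates"},
    "indicator": {"description"},
}


def is_supported_property(resolver: str, prop: str) -> bool:
    """Return True if `prop` may be used with the given resolver."""
    return prop in SUPPORTED_PROPERTIES.get(resolver, set())


print(is_supported_property("place", "wikidataId"))      # True
print(is_supported_property("indicator", "wikidataId"))  # False
```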

Suggested change
Note that you can only resolve entities by some terminal properties. You cannot resolve properties that represent linked entities with incoming or outgoing arc relationships. For that, you need to use the [Node](node.md) API. For example, if you wanted to get all the DCIDs of entities that are related to a given entity by the `containedInPlace` property (say, all states in the United States), use the Node API.

We could keep the part about "use Node API if you want to fetch more data about each returned candidate".

the US state).

Note that you can only resolve entities by some terminal properties. You cannot resolve properties that represent linked entities with incoming or outgoing arc relationships. For that, you need to use the [Node](node.md) API. For example, if you wanted to get all the DCIDs of entities that are related to a given entity by the `containedInPlace` property (say, all states in the United States), use the Node API.
You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables containing the string "population".

nit:

Suggested change
You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables containing the string "population".
You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables related to the string "population".

Resolve is based on semantic embeddings, not text-based search.
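
To illustrate why "related to" is more accurate than "containing", here is a toy comparison of substring matching and embedding similarity. The three-dimensional vectors are made up for illustration and say nothing about the real model or its scores; the point is only that a description can rank highly without sharing any text with the query.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


# Made-up toy "embeddings"; a real system would compute these with a model.
query_text, query_vec = "population", [0.9, 0.1, 0.0]
descriptions = {
    "number of residents": [0.8, 0.2, 0.1],
    "annual rainfall": [0.1, 0.2, 0.9],
}

for text, vec in descriptions.items():
    contains = query_text in text  # text-based search
    similarity = cosine(query_vec, vec)  # embedding-based search
    print(f"{text!r}: substring match={contains}, similarity={similarity:.2f}")
```

Neither description contains the string "population", yet the first one is far closer to the query in the toy embedding space.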

| [fetch_dcid_by_coordinates](#fetch_dcid_by_coordinates) | Look up a DCID of a single place by geographical coordinates. |
| [fetch_indicators](#fetch_indicators) | Look up the DCIDs of all matching statistical variables and topics. |

## Response

It might make sense to list the response for the default value first: for resolving places, the response looks like xyz; for `fetch_indicators` and `fetch` with `resolver=indicator`, the response looks like abc.

| dominantType | string | Optional field which, where present, disambiguates between multiple results. |
| node | string | The query terms used to look up the DCIDs of entities. |
| candidates | list | List of nodes that match the query terms. |
| dominantType | string | Optional field which, when present, disambiguates between multiple results. Only returned when `resolver` is set to `place` (the default). |

nit: `dcid` is not listed as a field. Is that intentional? I'm not sure whether it's necessary, since it seems self-explanatory.

| expression <br /> <required-tag>Required</required-tag> | string | An expression that describes the identifier used in the `node_ids` parameter. Only three are currently supported: <br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s).<br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude.<br/> Note that these are not necessarily "properties" that appear in the knowledge graph; instead, they are "synthetic" attributes that cover searches over multiple properties. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). |
| node_ids <br /> <required-tag>Required</required-tag> | string or list of strings | A list of terms that identify each node to search for, such as their names. A single string can contain spaces and commas. |
| resolver <br /> <optional-tag>Optional</optional-tag> | string literal | Currently accepted options are `place` (the default) and `indicator`, which resolves statistical variables. If not specified, the default is `place`. |
| expression <br /> <optional-tag>Optional</optional-tag> | string | An expression that describes the identifier used in the `nodes` parameter. Only three are currently supported:<br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s).<br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude. <br/>If not specified, the default is `<-description`. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). <br/><b>Note:</b> To specify `wikidataId`,`geoCoordinates`, or a `typeOf` filter on the query, you must specify this parameter. <br/> Note: The `description` field is not necessarily present in the knowledge graph for all entities. It is a synthetic property that Data Commons uses to check various name-related fields, such as `name`. The `geoCoordinates` field is a synthesis of `latitude` and `longitude` properties. |

Suggested change
| expression <br /> <optional-tag>Optional</optional-tag> | string | An expression that describes the identifier used in the `nodes` parameter. Only three are currently supported:<br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s).<br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude. <br/>If not specified, the default is `<-description`. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). <br/><b>Note:</b> To specify `wikidataId`,`geoCoordinates`, or a `typeOf` filter on the query, you must specify this parameter. <br/> Note: The `description` field is not necessarily present in the knowledge graph for all entities. It is a synthetic property that Data Commons uses to check various name-related fields, such as `name`. The `geoCoordinates` field is a synthesis of `latitude` and `longitude` properties. |
| expression <br /> <optional-tag>Optional</optional-tag> | string | An expression that describes the identifier used in the `nodes` parameter. Only three are currently supported:<br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s) (for place resolution only). <br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude (for place resolution only). <br/>If not specified, the default is `<-description`. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). <br/><b>Note:</b> To specify `wikidataId`, `geoCoordinates`, or a `typeOf` filter on the query, you must specify this parameter. <br/> Note: The `description` field is not necessarily present in the knowledge graph for all entities. It is a synthetic property that Data Commons uses to check various name-related fields, such as `name`. The `geoCoordinates` field is a synthesis of `latitude` and `longitude` properties. |


<div id="GET-request" class="api-tabcontent api-signature">
https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&property=<var>EXPRESSION</var>
https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&resolver=<var>NODE_TYPE</var>&property=<var>EXPRESSION</var>&target=<var>INSTANCE</var>

Suggested change
https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&resolver=<var>NODE_TYPE</var>&property=<var>EXPRESSION</var>&target=<var>INSTANCE</var>
https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&resolver=<var>RESOLUTION_TYPE</var>&property=<var>EXPRESSION</var>&target=<var>INSTANCE</var>
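
As a rough sketch of how the GET signature above could be assembled in code, the helper below builds the query string with the standard library only. The function name `build_resolve_url`, the placeholder key `YOUR_API_KEY`, and the use of one repeated `nodes` parameter per identifier are assumptions made for illustration. The values accepted by `target` are not shown in this thread, so it is left as an optional pass-through.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.datacommons.org/v2/resolve"


def build_resolve_url(nodes, api_key, resolver=None, expression=None, target=None):
    """Assemble a GET URL from the parameters named in the signature above."""
    params = [("key", api_key)]
    # Assumption: each identifier is sent as its own `nodes` parameter.
    params += [("nodes", node) for node in nodes]
    if resolver is not None:
        params.append(("resolver", resolver))
    if expression is not None:
        params.append(("property", expression))
    if target is not None:
        params.append(("target", target))
    return f"{BASE_URL}?{urlencode(params)}"


url = build_resolve_url(["population"], api_key="YOUR_API_KEY", resolver="indicator")
print(url)
# https://api.datacommons.org/v2/resolve?key=YOUR_API_KEY&nodes=population&resolver=indicator
```

Omitted parameters are left out of the URL entirely, so the server-side defaults (`place` for `resolver`, `<-description` for `property`) apply.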

...
],
"property": "<var>EXPRESSION</var>"
"resolver": "<var>NODE_TYPE</var>",

Suggested change
"resolver": "<var>NODE_TYPE</var>",
"resolver": "<var>RESOLUTION_TYPE</var>",

@@ -116,7 +168,10 @@ The response looks like:
|-------------|--------|-------------------------------------|
| node | string | The property value or description provided. |
| candidates | list | DCIDs matching the description you provided. |

`candidates` is a list of nodes. Each node contains `dcid`, `dominantType` (optional), `metadata` (optional), and `typeOf` (optional).
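
One way to picture the candidate structure described in this comment is as a small data class with one required and three optional fields. The class `Candidate` and its `from_dict` helper are hypothetical and exist only to make the shape concrete; they are not the Python client library's own types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical model of one entry in `candidates`."""

    dcid: str
    dominant_type: Optional[str] = None  # `dominantType`, optional
    metadata: dict = field(default_factory=dict)  # optional
    type_of: list = field(default_factory=list)  # `typeOf`, optional

    @classmethod
    def from_dict(cls, raw: dict) -> "Candidate":
        return cls(
            dcid=raw["dcid"],
            dominant_type=raw.get("dominantType"),
            metadata=raw.get("metadata", {}),
            type_of=raw.get("typeOf", []),
        )


candidate = Candidate.from_dict(
    {"dcid": "Count_Person_18OrMoreYears", "typeOf": ["StatisticalVariable"]}
)
print(candidate.dcid, candidate.dominant_type, candidate.type_of)
# Count_Person_18OrMoreYears None ['StatisticalVariable']
```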


(truncated)

```json

Suggested change
```json
```jsonc

Comment on lines +561 to +579
"dcid": "Count_Person_18OrMoreYears",
"metadata": {
"score": "0.8167",
"sentence": "adult population count"
},
"typeOf": [
"StatisticalVariable"
]
},
{
"dcid": "Count_Person_Upto18Years",
"metadata": {
"score": "0.8121",
"sentence": "children population count"
},
"typeOf": [
"StatisticalVariable"
]
},

Suggested change
"dcid": "Count_Person_18OrMoreYears",
"metadata": {
"score": "0.8167",
"sentence": "adult population count"
},
"typeOf": [
"StatisticalVariable"
]
},
{
"dcid": "Count_Person_Upto18Years",
"metadata": {
"score": "0.8121",
"sentence": "children population count"
},
"typeOf": [
"StatisticalVariable"
]
},
// ...
]}]}
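
For readers who want to consume a response shaped like the example above, here is a minimal parsing sketch. Because the top of the example is truncated, the outer `entities` key and the query term `population` are assumptions; only `node`, `candidates`, and the candidate fields come from this thread. Note that `score` is a string in the example, so the sketch converts it to a float before sorting.

```python
import json

# Sample payload: candidate entries copied from the example above; the
# wrapper ("entities", node value) is an assumption for illustration.
response_text = """
{"entities": [{"node": "population", "candidates": [
  {"dcid": "Count_Person_Upto18Years",
   "metadata": {"score": "0.8121", "sentence": "children population count"},
   "typeOf": ["StatisticalVariable"]},
  {"dcid": "Count_Person_18OrMoreYears",
   "metadata": {"score": "0.8167", "sentence": "adult population count"},
   "typeOf": ["StatisticalVariable"]}
]}]}
"""


def ranked_dcids(response: dict) -> list:
    """Return (dcid, score) pairs, highest score first."""
    ranked = []
    for entity in response.get("entities", []):
        for cand in entity.get("candidates", []):
            raw_score = cand.get("metadata", {}).get("score")
            # Candidates without a score (e.g. places) are treated as 0.0.
            score = float(raw_score) if raw_score is not None else 0.0
            ranked.append((cand["dcid"], score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


print(ranked_dcids(json.loads(response_text)))
# [('Count_Person_18OrMoreYears', 0.8167), ('Count_Person_Upto18Years', 0.8121)]
```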
