-
Notifications
You must be signed in to change notification settings - Fork 36
Add documentation of new Resolve API parameters #670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @kmoscoe, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Data Commons Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request does a great job of documenting the new parameters for the Resolve API, especially the ability to resolve statistical variables and topics. The addition of the 'Topic' definition to the glossary and the cleanup of the obsolete 'Cohort' entry are also valuable improvements. I've identified a few minor inconsistencies in the documentation for both the Python and REST APIs, mainly regarding parameter and field types not matching the examples. Addressing these will enhance the clarity and accuracy of the documentation for developers.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
clincoln8
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Kara! Left a few nits, but looks great overall!
| You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables containing the string "population". | ||
|
|
||
| > **Note**: Currently, this endpoint only supports [place](/glossary.html#place) entities. | ||
| Note that you can only resolve entities by some terminal properties. You cannot resolve properties that represent linked entities with incoming or outgoing arc relationships. For that, you need to use the [Node](node.md) API. For example, if you wanted to get all the DCIDs of entities that are related to a given entity by the `containedInPlace` property (say, all states in the United States), use the Node API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the way our endpoint is currently written, that we don't really need this note.
I believe the vision was that you'd be able to supply any property in the property= <-{property}->dcid request param and resolve on it. This would warrant this note of only using terminal properties. However that logic is not implemented and we make it very explicit that only "description" is valid for indicators and "description|wikidataId|geoCoordinates" is valid for places.
So this note seems unnecessary for now and could be added back if we ever support resolving based on generic properties.
| Note that you can only resolve entities by some terminal properties. You cannot resolve properties that represent linked entities with incoming or outgoing arc relationships. For that, you need to use the [Node](node.md) API. For example, if you wanted to get all the DCIDs of entities that are related to a given entity by the `containedInPlace` property (say, all states in the United States), use the Node API. |
We could keep the part about "use Node API if you want to fetch more data about each returned candidate"
| the US state). | ||
|
|
||
| Note that you can only resolve entities by some terminal properties. You cannot resolve properties that represent linked entities with incoming or outgoing arc relationships. For that, you need to use the [Node](node.md) API. For example, if you wanted to get all the DCIDs of entities that are related to a given entity by the `containedInPlace` property (say, all states in the United States), use the Node API. | ||
| You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables containing the string "population". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
| You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables containing the string "population". | |
| You can also query for statistical variables and topics. For example, you could find the DCIDs for all statistical variables related to the string "population". |
Resolve is based on semantic embeddings, not text based search.
| | [fetch_dcid_by_coordinates](#fetch_dcid_by_coordinates) | Look up a DCID of a single place by geographical coordinates. | | ||
| | [fetch_indicators](#fetch_indicators) | Look up the DCIDs of all matching statistical variables and topics. | | ||
|
|
||
| ## Response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make sense to list the default value response first -- for resolving places, response looks like xyz. For the fetch_indicators and fetch w/resolver=indicator response looks like abc.
| | dominantType | string | Optional field which, where present, disambiguates between multiple results. | | ||
| | node | string | The query terms used to look up the DCIDs of entities. | | ||
| | candidates | list | List of nodes that match the query terms. | | ||
| | dominantType | string | Optional field which, when present, disambiguates between multiple results. Only returned when `resolver` is set to `place` (the default). | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: dcid is not listed as a field, is that intentional? i'm not sure if it's necessary or not since it seems self explanatory.
| | expression <br /> <required-tag>Required</required-tag> | string | An expression that describes the identifier used in the `node_ids` parameter. Only three are currently supported: <br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s).<br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude.<br/> Note that these are not necessarily "properties" that appear in the knowledge graph; instead, they are "synthetic" attributes that cover searches over multiple properties. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). | | ||
| | node_ids <br /> <required-tag>Required</required-tag> | string or list of strings | A list of terms that identify each node to search for, such as their names. A single string can contain spaces and commas. | | ||
| | resolver <br /> <optional-tag>Optional</optional-tag> | string literal | Currently accepted options are `place` (the default) and `indicator`, which resolves statistical variables. If not specified, the default is `place`. | | ||
| | expression <br /> <optional-tag>Optional</optional-tag> | string | An expression that describes the identifier used in the `nodes` parameter. Only three are currently supported:<br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s).<br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude. <br/>If not specified, the default is `<-description`. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). <br/><b>Note:</b> To specify `wikidataId`,`geoCoordinates`, or a `typeOf` filter on the query, you must specify this parameter. <br/> Note: The `description` field is not necessarily present in the knowledge graph for all entities. It is a synthetic property that Data Commons uses to check various name-related fields, such as `name`. The `geoCoordinates` field is a synthesis of `latitude` and `longitude` properties. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| | expression <br /> <optional-tag>Optional</optional-tag> | string | An expression that describes the identifier used in the `nodes` parameter. Only three are currently supported:<br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s).<br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude. <br/>If not specified, the default is `<-description`. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). <br/><b>Note:</b> To specify `wikidataId`,`geoCoordinates`, or a `typeOf` filter on the query, you must specify this parameter. <br/> Note: The `description` field is not necessarily present in the knowledge graph for all entities. It is a synthetic property that Data Commons uses to check various name-related fields, such as `name`. The `geoCoordinates` field is a synthesis of `latitude` and `longitude` properties. | | |
| | expression <br /> <optional-tag>Optional</optional-tag> | string | An expression that describes the identifier used in the `nodes` parameter. Only three are currently supported:<br />`<-description`: Search for nodes based on name-related properties (such as `name`, `alternateName`, etc.).<br/>`<-wikidataId`: Search for nodes based on their Wikidata ID(s). (for place resolution only) <br/>`<-geoCoordinates`: Search for nodes based on latitude and/or longitude. for place resolution only) <br/>If not specified, the default is `<-description`. <br/>Each expression must end with `->dcid` and may optionally include a [`typeOf` filter](/api/rest/v2/index.html#filters). <br/><b>Note:</b> To specify `wikidataId`,`geoCoordinates`, or a `typeOf` filter on the query, you must specify this parameter. <br/> Note: The `description` field is not necessarily present in the knowledge graph for all entities. It is a synthetic property that Data Commons uses to check various name-related fields, such as `name`. The `geoCoordinates` field is a synthesis of `latitude` and `longitude` properties. | |
|
|
||
| <div id="GET-request" class="api-tabcontent api-signature"> | ||
| https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&property=<var>EXPRESSION</var> | ||
| https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&resolver=<var>NODE_TYPE</var>&property=<var>EXPRESSION</var>&target=<var>INSTANCE</var> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&resolver=<var>NODE_TYPE</var>&property=<var>EXPRESSION</var>&target=<var>INSTANCE</var> | |
| https://api.datacommons.org/v2/resolve?key=AIzaSyCTI4Xz-UW_G2Q2RfknhcfdAnTHq5X5XuI&nodes=<var>IDENTIFIER_LIST</var>&resolver=<var>RESOLUTION_TYPE</var>&property=<var>EXPRESSION</var>&target=<var>INSTANCE</var> |
| ... | ||
| ], | ||
| "property": "<var>EXPRESSION</var>" | ||
| "resolver": "<var>NODE_TYPE</var>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "resolver": "<var>NODE_TYPE</var>", | |
| "resolver": "<var>RESOLUTION_TYPE</var>", |
| @@ -116,7 +168,10 @@ The response looks like: | |||
| |-------------|--------|-------------------------------------| | |||
| | node | string | The property value or description provided. | | |||
| | candidates | list | DCIDs matching the description you provided. | | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
candidates is a list of nodes. each node contains dcid, dominantType (optional), metadata (optional), typeOf (optional)
|
|
||
| (truncated) | ||
|
|
||
| ```json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ```json | |
| ```jsonc |
| "dcid": "Count_Person_18OrMoreYears", | ||
| "metadata": { | ||
| "score": "0.8167", | ||
| "sentence": "adult population count" | ||
| }, | ||
| "typeOf": [ | ||
| "StatisticalVariable" | ||
| ] | ||
| }, | ||
| { | ||
| "dcid": "Count_Person_Upto18Years", | ||
| "metadata": { | ||
| "score": "0.8121", | ||
| "sentence": "children population count" | ||
| }, | ||
| "typeOf": [ | ||
| "StatisticalVariable" | ||
| ] | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "dcid": "Count_Person_18OrMoreYears", | |
| "metadata": { | |
| "score": "0.8167", | |
| "sentence": "adult population count" | |
| }, | |
| "typeOf": [ | |
| "StatisticalVariable" | |
| ] | |
| }, | |
| { | |
| "dcid": "Count_Person_Upto18Years", | |
| "metadata": { | |
| "score": "0.8121", | |
| "sentence": "children population count" | |
| }, | |
| "typeOf": [ | |
| "StatisticalVariable" | |
| ] | |
| }, | |
| // ... | |
| ]}]} |
This PR also:
Staged at https://bullie.svl.corp.google.com:4000