From 7419ddc3fd45029f5f8770b4c561c1029edb6889 Mon Sep 17 00:00:00 2001
From: Philip Durbin
Date: Fri, 30 May 2025 14:11:49 -0400
Subject: [PATCH 1/5] add AI Guide: Ask the Data, TurboCurator, Ask Datavere, DataChat #11540

---
 doc/sphinx-guides/source/ai/index.md | 27 +++++++++++++++++++++++++++
 doc/sphinx-guides/source/index.rst | 1 +
 2 files changed, 28 insertions(+)
 create mode 100644 doc/sphinx-guides/source/ai/index.md

diff --git a/doc/sphinx-guides/source/ai/index.md b/doc/sphinx-guides/source/ai/index.md
new file mode 100644
index 00000000000..d278b486b8e
--- /dev/null
+++ b/doc/sphinx-guides/source/ai/index.md
@@ -0,0 +1,27 @@
+# AI Guide
+
+Artificial Intelligence (AI) is a growing component of the Dataverse ecosystem.
+
+```{contents} Contents:
+:local:
+:depth: 2
+```
+
+## Tools
+
+### Ask the Data
+
+Ask the Data is an {ref}`external tool ` that allows you ask natural language questions about the data contained in Dataverse tables (tabular data). See the README.md file at for the instructions on adding Ask the Data to your Dataverse installation.
+
+### TurboCurator
+
+TurboCurator is an {ref}`external tool ` that generates metadata improvements for title, description, and keywords. It relies on OpenAI's ChatGPT & ICPSR best practices. See the [TurboCurator Dataverse Administrator](https://turbocurator.icpsr.umich.edu/tc/adminabout/) page for more details on how it works and adding TurboCurator to your Dataverse installation.
+
+### Ask Dataverse
+
+Ask Dataverse ([ask.dataverse.org](https://ask.dataverse.org)) is a place to ask questions about the Dataverse Project and the Dataverse software.
It was created by Slava Tykhonov who [announced](https://groups.google.com/g/dataverse-community/c/tqwCoygO4oE/m/MNSfrw_QAwAJ) it in December 2024 and presented it February 2025 ([video](https://harvard.zoom.us/rec/share/bOizatNdMdxINRCnqpt87fPITPvsDWTv3ysvA8kIaEE4wnmZPSeSUkdmpKYP1ooA.rKoNMqED_L8KtHOi), [slides](https://docs.google.com/presentation/d/1HFN-wAe4eUGwJAhYCLbNcNHAsi-Hy8jQ/edit?usp=sharing&ouid=117275479921759507378&rtpof=true&sd=true), [notes](https://docs.google.com/document/d/1Dz07WKceGrBGdq5wWf0NJS08CO0FEmi4TgQBcsDcpRE/edit?usp=sharing)).
+
+### DataChat
+
+DataChat is a multilingual open source natural language interface for Dataverse and other data platforms with an experimental Graph AI implementation for Croissant support. DataChat can literally talk back to you and explain what is inside of every single dataset, you can ask any question and it responds on the level of metadata described by Croissant standard. Learn more at .
+
diff --git a/doc/sphinx-guides/source/index.rst b/doc/sphinx-guides/source/index.rst
index af394108eea..0b1d85718b2 100755
--- a/doc/sphinx-guides/source/index.rst
+++ b/doc/sphinx-guides/source/index.rst
@@ -15,6 +15,7 @@ These documentation guides are for the |version| version of Dataverse.
To find g
 user/index
 admin/index
+ ai/index
 api/index
 installation/index
 contributor/index

From 25004a90b3d9eb1e77eb707b227a69791136f9f7 Mon Sep 17 00:00:00 2001
From: Philip Durbin
Date: Fri, 30 May 2025 14:17:17 -0400
Subject: [PATCH 2/5] add release note #11541

---
 doc/release-notes/11540-ai-guide.md | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 doc/release-notes/11540-ai-guide.md

diff --git a/doc/release-notes/11540-ai-guide.md b/doc/release-notes/11540-ai-guide.md
new file mode 100644
index 00000000000..e74606b9016
--- /dev/null
+++ b/doc/release-notes/11540-ai-guide.md
@@ -0,0 +1,3 @@
+### AI Guide for Dataverse
+
+Information about various Dataverse-related AI efforts has been documented in a new [AI Guide](https://dataverse-guide--11541.org.readthedocs.build/en/11541/ai/index.html). See #11540 and #11541.

From 7c718c48777c6152721ddea1bd4285b077ed7b19 Mon Sep 17 00:00:00 2001
From: Philip Durbin
Date: Fri, 30 May 2025 14:41:55 -0400
Subject: [PATCH 3/5] add docs for MCP server #11474 #11540

---
 doc/release-notes/11540-ai-guide.md | 2 +-
 doc/sphinx-guides/source/ai/index.md | 10 ++++++++++
 doc/sphinx-guides/source/api/apps.rst | 5 +++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/doc/release-notes/11540-ai-guide.md b/doc/release-notes/11540-ai-guide.md
index e74606b9016..dc4a0fafe7f 100644
--- a/doc/release-notes/11540-ai-guide.md
+++ b/doc/release-notes/11540-ai-guide.md
@@ -1,3 +1,3 @@
 ### AI Guide for Dataverse
 
-Information about various Dataverse-related AI efforts has been documented in a new [AI Guide](https://dataverse-guide--11541.org.readthedocs.build/en/11541/ai/index.html). See #11540 and #11541.
+Information about various Dataverse-related AI efforts has been documented in a new [AI Guide](https://dataverse-guide--11541.org.readthedocs.build/en/11541/ai/index.html). See #11474, #11540, and #11541.
diff --git a/doc/sphinx-guides/source/ai/index.md b/doc/sphinx-guides/source/ai/index.md
index d278b486b8e..b0ad97e8408 100644
--- a/doc/sphinx-guides/source/ai/index.md
+++ b/doc/sphinx-guides/source/ai/index.md
@@ -25,3 +25,13 @@ Ask Dataverse ([ask.dataverse.org](https://ask.dataverse.org)) is a place to ask
 
 DataChat is a multilingual open source natural language interface for Dataverse and other data platforms with an experimental Graph AI implementation for Croissant support. DataChat can literally talk back to you and explain what is inside of every single dataset, you can ask any question and it responds on the level of metadata described by Croissant standard. Learn more at .
 
+## Protocols
+
+(mcp)=
+### Model Context Protocol (MCP)
+
+[Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction) is a standard for AI Agents to communicate with tools and services, [announced](https://www.anthropic.com/news/model-context-protocol) in November 2024.
+
+An MCP server for Dataverse has been deployed to [mcp.dataverse.org][], powered by the code at . See the code's README for information on configuring MCP clients (e.g. Cursor, Visual Studio Code, Windsurf, Zed, etc.) to use [mcp.dataverse.org][] or your own local installation (setup instructions are also provided).
+
+[mcp.dataverse.org]: https://mcp.dataverse.org
diff --git a/doc/sphinx-guides/source/api/apps.rst b/doc/sphinx-guides/source/api/apps.rst
index c5c761f7dfc..3e774a3799d 100755
--- a/doc/sphinx-guides/source/api/apps.rst
+++ b/doc/sphinx-guides/source/api/apps.rst
@@ -101,6 +101,11 @@ This module can, among others, help you migrate one dataverse to another. (see `
 
 https://github.com/iza-institute-of-labor-economics/idsc.dataverse
 
+mcp-dataverse
+~~~~~~~~~~~~~
+
+The code at https://github.com/gdcc/mcp-dataverse powers a :ref:`mcp` server for Dataverse.
+
 Java
 ----

From 0f6ad6dbaad8e575580dd08c290360d77bb3ced4 Mon Sep 17 00:00:00 2001
From: Philip Durbin
Date: Mon, 2 Jun 2025 09:28:32 -0400
Subject: [PATCH 4/5] add AutoSage #11540

---
 doc/sphinx-guides/source/ai/index.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/sphinx-guides/source/ai/index.md b/doc/sphinx-guides/source/ai/index.md
index b0ad97e8408..8e1d7d0417d 100644
--- a/doc/sphinx-guides/source/ai/index.md
+++ b/doc/sphinx-guides/source/ai/index.md
@@ -13,6 +13,10 @@ Artificial Intelligence (AI) is a growing component of the Dataverse ecosystem.
 
 Ask the Data is an {ref}`external tool ` that allows you ask natural language questions about the data contained in Dataverse tables (tabular data). See the README.md file at for the instructions on adding Ask the Data to your Dataverse installation.
 
+### AutoSage
+
+AutoSage provides metadata suggestions for datasets. Learn more at .
+
 ### TurboCurator
 
 TurboCurator is an {ref}`external tool ` that generates metadata improvements for title, description, and keywords. It relies on OpenAI's ChatGPT & ICPSR best practices. See the [TurboCurator Dataverse Administrator](https://turbocurator.icpsr.umich.edu/tc/adminabout/) page for more details on how it works and adding TurboCurator to your Dataverse installation.

From a3461701caf6ed4f85805b4e7d2c3756f66a8d22 Mon Sep 17 00:00:00 2001
From: Philip Durbin
Date: Mon, 2 Jun 2025 09:29:48 -0400
Subject: [PATCH 5/5] sort a-z #11540

---
 doc/sphinx-guides/source/ai/index.md | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/doc/sphinx-guides/source/ai/index.md b/doc/sphinx-guides/source/ai/index.md
index 8e1d7d0417d..18d9cec29f2 100644
--- a/doc/sphinx-guides/source/ai/index.md
+++ b/doc/sphinx-guides/source/ai/index.md
@@ -9,6 +9,10 @@ Artificial Intelligence (AI) is a growing component of the Dataverse ecosystem.
 ## Tools
 
+### Ask Dataverse
+
+Ask Dataverse ([ask.dataverse.org](https://ask.dataverse.org)) is a place to ask questions about the Dataverse Project and the Dataverse software. It was created by Slava Tykhonov who [announced](https://groups.google.com/g/dataverse-community/c/tqwCoygO4oE/m/MNSfrw_QAwAJ) it in December 2024 and presented it February 2025 ([video](https://harvard.zoom.us/rec/share/bOizatNdMdxINRCnqpt87fPITPvsDWTv3ysvA8kIaEE4wnmZPSeSUkdmpKYP1ooA.rKoNMqED_L8KtHOi), [slides](https://docs.google.com/presentation/d/1HFN-wAe4eUGwJAhYCLbNcNHAsi-Hy8jQ/edit?usp=sharing&ouid=117275479921759507378&rtpof=true&sd=true), [notes](https://docs.google.com/document/d/1Dz07WKceGrBGdq5wWf0NJS08CO0FEmi4TgQBcsDcpRE/edit?usp=sharing)).
+
 ### Ask the Data
 
 Ask the Data is an {ref}`external tool ` that allows you ask natural language questions about the data contained in Dataverse tables (tabular data). See the README.md file at for the instructions on adding Ask the Data to your Dataverse installation.
@@ -17,17 +21,14 @@ Ask the Data is an {ref}`external tool ` that allow
 
 AutoSage provides metadata suggestions for datasets. Learn more at .
 
-### TurboCurator
-
-TurboCurator is an {ref}`external tool ` that generates metadata improvements for title, description, and keywords. It relies on OpenAI's ChatGPT & ICPSR best practices. See the [TurboCurator Dataverse Administrator](https://turbocurator.icpsr.umich.edu/tc/adminabout/) page for more details on how it works and adding TurboCurator to your Dataverse installation.
+### DataChat
 
-### Ask Dataverse
+DataChat is a multilingual open source natural language interface for Dataverse and other data platforms with an experimental Graph AI implementation for Croissant support. DataChat can literally talk back to you and explain what is inside of every single dataset, you can ask any question and it responds on the level of metadata described by Croissant standard. Learn more at .
-Ask Dataverse ([ask.dataverse.org](https://ask.dataverse.org)) is a place to ask questions about the Dataverse Project and the Dataverse software. It was created by Slava Tykhonov who [announced](https://groups.google.com/g/dataverse-community/c/tqwCoygO4oE/m/MNSfrw_QAwAJ) it in December 2024 and presented it February 2025 ([video](https://harvard.zoom.us/rec/share/bOizatNdMdxINRCnqpt87fPITPvsDWTv3ysvA8kIaEE4wnmZPSeSUkdmpKYP1ooA.rKoNMqED_L8KtHOi), [slides](https://docs.google.com/presentation/d/1HFN-wAe4eUGwJAhYCLbNcNHAsi-Hy8jQ/edit?usp=sharing&ouid=117275479921759507378&rtpof=true&sd=true), [notes](https://docs.google.com/document/d/1Dz07WKceGrBGdq5wWf0NJS08CO0FEmi4TgQBcsDcpRE/edit?usp=sharing)).
+### TurboCurator
 
-### DataChat
+TurboCurator is an {ref}`external tool ` that generates metadata improvements for title, description, and keywords. It relies on OpenAI's ChatGPT & ICPSR best practices. See the [TurboCurator Dataverse Administrator](https://turbocurator.icpsr.umich.edu/tc/adminabout/) page for more details on how it works and adding TurboCurator to your Dataverse installation.
 
-DataChat is a multilingual open source natural language interface for Dataverse and other data platforms with an experimental Graph AI implementation for Croissant support. DataChat can literally talk back to you and explain what is inside of every single dataset, you can ask any question and it responds on the level of metadata described by Croissant standard. Learn more at .
 
 ## Protocols