From 3fe2e0c8085c7c1573a9860c5d160d179ba699f9 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat, 16 Aug 2025 18:03:51 +0300
Subject: [PATCH 01/10] issue template and pull request template

---
 .github/ISSUE_TEMPLATE/content_changes.yml | 72 ++++++++++++++++++++++
 .github/PULL_REQUEST_TEMPLATE.md           | 41 ++++++++++++
 CONTRIBUTING.md                            |  2 +-
 3 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 .github/ISSUE_TEMPLATE/content_changes.yml
 create mode 100644 .github/PULL_REQUEST_TEMPLATE.md

diff --git a/.github/ISSUE_TEMPLATE/content_changes.yml b/.github/ISSUE_TEMPLATE/content_changes.yml
new file mode 100644
index 0000000..6ba86ec
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/content_changes.yml
@@ -0,0 +1,72 @@
+name: Documentation Update
+description: Suggest additions, modifications, or improvements to the documentation
+title: "[DOCS] "
+labels: ["docs", "enhancement"]
+assignees: []
+
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for helping us improve our documentation!
+
+  - type: dropdown
+    id: change-type
+    attributes:
+      label: Type of Change
+      description: What kind of update are you suggesting?
+      options:
+        - Addition (new content)
+        - Modification (updates to existing content)
+        - Removal (outdated or redundant content)
+        - Other
+    validations:
+      required: true
+
+  - type: textarea
+    id: proposed-content
+    attributes:
+      label: Proposed Content / Change
+      description: Describe the content you'd like to add, modify, or remove
+      placeholder: I think the documentation should include/update/remove...
+    validations:
+      required: true
+
+  - type: textarea
+    id: location
+    attributes:
+      label: Location
+      description: Where should this content be placed or updated in the documentation structure?
+      placeholder: This should be added/updated in the section on...
+    validations:
+      required: true
+
+  - type: textarea
+    id: rationale
+    attributes:
+      label: Rationale
+      description: Why is this change valuable for the project documentation?
+      placeholder: This content change would be valuable because...
+    validations:
+      required: true
+
+  - type: textarea
+    id: content-outline
+    attributes:
+      label: Suggested Outline (Optional)
+      description: If you have ideas for how the content should be structured, provide an outline
+      placeholder: |
+        1. Introduction
+        2. Key concepts
+        3. Examples
+    validations:
+      required: false
+
+  - type: textarea
+    id: references
+    attributes:
+      label: References
+      description: Include links to any reference material or examples
+      placeholder: Related resources or examples
+    validations:
+      required: false
diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md
new file mode 100644
index 0000000..b0080cb
--- /dev/null
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,41 @@
+
+
+## Description
+
+
+## Type of Change
+
+- [ ] 📄 New content addition
+- [ ] ✏️ Content update/revision
+- [ ] 📚 Structure/organization improvement
+- [ ] 🔤 Typo/formatting fix
+- [ ] 🐛 Bug fix
+- [ ] 🔧 Tooling/config change (docs build, CI/CD, etc.)
+- [ ] Other (please describe):
+
+## Motivation and Context
+
+
+## Areas Affected
+
+- e.g., `docs/getting-started.md`, `docs/configuration/`
+
+## Screenshots (if applicable)
+
+
+## Checklist
+
+- [ ] I have read the **CONTRIBUTING** guidelines
+- [ ] My changes follow the project’s documentation style guide
+- [ ] I have previewed my changes locally (`mkdocs serve` or equivalent)
+- [ ] All internal/external links are valid
+- [ ] Images/diagrams are optimized (size, format) and display correctly
+- [ ] Any new references/resources are cited appropriately
+- [ ] All existing checks/tests pass (if applicable)
+
+## Additional Notes
+
+
+---
+
+By submitting this pull request, I confirm that my contribution can be used, modified, and redistributed under the terms of this project’s license.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 8c41c75..46ca028 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -23,7 +23,7 @@ reported the issue. Please try to include as much information as you can. Detail
 Looking at the existing issues is a great way to find something to contribute to. We label issues that are well-defined and ready for community contributions with the "ready for contribution" label.
 
 Check our "Ready for Contribution" issues for items you can work on:
-- [SDK Python Issues](https://github.com/Datuanalytics/datu-core/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22ready%20for%20contribution%22)
+- [Datu Core Issues](https://github.com/Datuanalytics/datu-core/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22ready%20for%20contribution%22)
 
 Before starting work on any issue:
 1. Check if someone is already assigned or working on it
From 47a13464c9b343863ea48882e056636583a6419b Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat, 16 Aug 2025 18:04:47 +0300
Subject: [PATCH 02/10] use datu core version

---
 requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/requirements.txt b/requirements.txt
index 7fa26d2..a393f05 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -4,4 +4,4 @@ mkdocs-macros-plugin~=1.3.7
 mkdocs-material~=9.6.12
 mkdocstrings-python~=1.16.10
 mkdocs-llmstxt~=0.2.0
-git+https://github.com/Datuanalytics/datu-core@main
\ No newline at end of file
+datu-core~=0.1.0
\ No newline at end of file
From 60f036e37962d7b3bc7e79ca92a0409171df291e Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat, 16 Aug 2025 18:27:53 +0300
Subject: [PATCH 03/10] push and build tests

---
 .github/workflows/build-pages.yml | 37 +++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)
 create mode 100644 .github/workflows/build-pages.yml

diff --git a/.github/workflows/build-pages.yml b/.github/workflows/build-pages.yml
new file mode 100644
index 0000000..16db28f
--- /dev/null
+++
b/.github/workflows/build-pages.yml
@@ -0,0 +1,37 @@
+name: Build GitHub Pages
+on:
+  push:
+    branches:
+      - main
+  repository_dispatch:
+    types:
+      - sdk-push
+permissions:
+  contents: write
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+          token: ${{ secrets.GITHUB_TOKEN }}
+      - name: Configure Git Credentials
+        run: |
+          git config user.name github-actions[bot]
+          git config user.email 41898282+github-actions[bot]@users.noreply.github.com
+      - uses: actions/setup-python@v5
+        with:
+          python-version: 3.x
+      - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV
+      - uses: actions/cache@v4
+        with:
+          key: mkdocs-material-${{ env.cache_id }}
+          path: .cache
+          restore-keys: |
+            mkdocs-material-
+      - run: |
+          pip install -r requirements.txt
+      - run: |
+          mike deploy --push --update-aliases 0.1.x latest
+          mike set-default --push 0.1.x
\ No newline at end of file
From daf0486eb024cf80683f689f523e93767525ac97 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat, 16 Aug 2025 18:29:26 +0300
Subject: [PATCH 04/10] leech ignore

---
 .../{build-pages.yml => build-deploy.yml} |  3 --
 .github/workflows/build-test.yml          | 44 +++++++++++++++++++
 .lycheeignore                             |  6 +++
 3 files changed, 50 insertions(+), 3 deletions(-)
 rename .github/workflows/{build-pages.yml => build-deploy.yml} (94%)
 create mode 100644 .github/workflows/build-test.yml
 create mode 100644 .lycheeignore

diff --git a/.github/workflows/build-pages.yml b/.github/workflows/build-deploy.yml
similarity index 94%
rename from .github/workflows/build-pages.yml
rename to .github/workflows/build-deploy.yml
index 16db28f..9530533 100644
--- a/.github/workflows/build-pages.yml
+++ b/.github/workflows/build-deploy.yml
@@ -3,9 +3,6 @@ on:
   push:
     branches:
       - main
-  repository_dispatch:
-    types:
-      - sdk-push
 permissions:
   contents: write
 jobs:
diff --git a/.github/workflows/build-test.yml b/.github/workflows/build-test.yml
new file mode 100644
index 0000000..7df11ef
--- /dev/null
+++
b/.github/workflows/build-test.yml
@@ -0,0 +1,44 @@
+name: Build and Test
+
+on:
+  pull_request:
+    branches: [ main ]
+    types: [ opened, synchronize, reopened, ready_for_review, review_requested, review_request_removed ]
+
+  schedule:
+    - cron: "00 10 * * *" # Run at 10:00 AM every day
+
+permissions: read-all
+
+jobs:
+  build:
+    name: Build Documentation
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+      - name: Install dependencies
+        run: |
+          pip install -r requirements.txt
+      - name: Build docs
+        run: |
+          mkdocs build
+
+  check_links:
+    name: Check Links
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+      - name: Restore lychee cache
+        uses: actions/cache@v4
+        with:
+          path: .lycheecache
+          key: cache-lychee-${{ github.sha }}
+          restore-keys: cache-lychee-
+
+      - name: Check links with lychee
+        id: lychee
+        uses: lycheeverse/lychee-action@v2
+        with:
+          args: "--base . --cache --max-cache-age 1d ."
\ No newline at end of file
diff --git a/.lycheeignore b/.lycheeignore
new file mode 100644
index 0000000..bee577d
--- /dev/null
+++ b/.lycheeignore
@@ -0,0 +1,6 @@
+# see examples in https://github.com/opensafely/documentation/blob/main/.lycheeignore
+
+# localhost
+https?://locahost.*
+https?://127\.0\.0\.1.*
+.*localhost.*
\ No newline at end of file
From 2a53d8bf8ed7e51814a7794365c570e4008ef633 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat, 16 Aug 2025 18:30:39 +0300
Subject: [PATCH 05/10] comment datu core

---
 requirements.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/requirements.txt b/requirements.txt
index a393f05..6465f08 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -4,4 +4,4 @@ mkdocs-macros-plugin~=1.3.7
 mkdocs-material~=9.6.12
 mkdocstrings-python~=1.16.10
 mkdocs-llmstxt~=0.2.0
-datu-core~=0.1.0
\ No newline at end of file
+#datu-core~=0.1.0
\ No newline at end of file
From effbac6bf7492916cf26472e8fea6c8798efe6e0 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat,
16 Aug 2025 18:32:26 +0300
Subject: [PATCH 06/10] include site url

---
 mkdocs.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mkdocs.yml b/mkdocs.yml
index 9bf64d5..333d27d 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,6 +1,7 @@
 site_name: Datu AI Analyst
 site_description: Datu AI Analyst is a Python SDK for building AI agents that can interact with data, perform analysis, and generate insights. It provides a framework for creating agents that can work with various data sources and tools, enabling developers to build intelligent applications that leverage AI capabilities.
 site_dir: site
+site_url: https://docs.datu.fi
 
 repo_url: https://github.com/Datuanalytics/datu-core
 
From 9bd5332f95032069a5f4abbeabe542344166b963 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sat, 16 Aug 2025 18:37:23 +0300
Subject: [PATCH 07/10] ignore issue link for now

---
 .lycheeignore | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.lycheeignore b/.lycheeignore
index bee577d..8935025 100644
--- a/.lycheeignore
+++ b/.lycheeignore
@@ -3,4 +3,5 @@
 # localhost
 https?://locahost.*
 https?://127\.0\.0\.1.*
-.*localhost.*
\ No newline at end of file
+.*localhost.*
+https://github.com/Datuanalytics/datu-core/issues*
\ No newline at end of file
From 2aef2fe97039b61b4b42240458c0926eb6b212db Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sun, 17 Aug 2025 08:41:41 +0300
Subject: [PATCH 08/10] start from 0.0.1

---
 .github/workflows/build-deploy.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/build-deploy.yml b/.github/workflows/build-deploy.yml
index 9530533..cc98191 100644
--- a/.github/workflows/build-deploy.yml
+++ b/.github/workflows/build-deploy.yml
@@ -30,5 +30,5 @@ jobs:
       - run: |
           pip install -r requirements.txt
       - run: |
-          mike deploy --push --update-aliases 0.1.x latest
-          mike set-default --push 0.1.x
\ No newline at end of file
+          mike deploy --push --update-aliases 0.0.x latest
+          mike set-default --push 0.0.x
\ No
newline at end of file
From 016f2c724dea73970cf292840c48202f0d94c176 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sun, 17 Aug 2025 09:46:09 +0300
Subject: [PATCH 09/10] datasource addition

---
 docs/user-guide/datasources/datasources.md | 63 ++++++++++++++++++++++
 docs/user-guide/datasources/postgres.md    | 20 +++++++
 docs/user-guide/datasources/sqlserver.md   | 25 +++++++++
 3 files changed, 108 insertions(+)
 create mode 100644 docs/user-guide/datasources/datasources.md
 create mode 100644 docs/user-guide/datasources/postgres.md
 create mode 100644 docs/user-guide/datasources/sqlserver.md

diff --git a/docs/user-guide/datasources/datasources.md b/docs/user-guide/datasources/datasources.md
new file mode 100644
index 0000000..b431772
--- /dev/null
+++ b/docs/user-guide/datasources/datasources.md
@@ -0,0 +1,63 @@
+# Datasources Overview
+
+This directory provides a collection of configurations to help you get started with connecting data sources with Datu.
+## Purpose
+
+With Datu, you can quickly connect to your data sources and turn raw information into actionable insights.
+
+### How to add datasources
+
+As per the current design, the application fetches all the schemas listed in profiles.yml, which avoids fetching the schema every single time. However, it only works on the **target** datasource that is selected.
+
+**Structure of profiles.yml**
+
+```sh
+datu_demo:
+  target: dev-postgres # Target is used to select the datasource that is currently active. Change this if you would like to use a different datasource.
+  outputs:
+    dev-postgres:
+      type: postgres
+      {% raw %}
+      host: "{{ env_var('DB_HOST', 'localhost') }}" # If an environment variable is supplied, it takes priority. This is useful to avoid hardcoding values.
+      {% endraw %}
+      port: 5432
+      user: postgres
+      password: postgres
+      dbname: my_sap_bronze
+      schema: bronze
+    dev-sqlserver:
+      type: sqlserver
+      driver: 'ODBC Driver 18 for SQL Server' # Mandatory for sqlserver.
+      host: localhost
+      port: 1433
+      user: sa
+      password: Password123!
+      dbname: my_sap_bronze
+      schema: bronze
+```
+
+### About profiles.yml
+
+Datu core needs a profiles.yml file that contains all the configured datasources. If you have used [dbt](https://github.com/dbt-labs/dbt-core), this is similar to its profiles.yml, though not exactly the same.
+
+```sh
+:
+  target: # this is the default target
+  outputs:
+    :
+      type:
+      schema:
+
+      ### Look for each datasource's specific variables
+      ...
+
+    ...
+
+: # additional profiles
+  ...
+
+```
+
+### env_var
+
+You can use `env_var` with any attribute in the `profiles.yml` `outputs` section to load configuration values from environment variables.
diff --git a/docs/user-guide/datasources/postgres.md b/docs/user-guide/datasources/postgres.md
new file mode 100644
index 0000000..585b66c
--- /dev/null
+++ b/docs/user-guide/datasources/postgres.md
@@ -0,0 +1,20 @@
+### Postgres as a datasource
+
+Install datu-core with the postgres extra
+
+```sh
+pip install "datu-core[postgres]"
+```
+
+In profiles.yml
+
+```sh
+dev-postgres:
+  type: postgres
+  host: [hostname]
+  user: [username]
+  password: [password]
+  port: [port]
+  dbname: [database name]
+  schema: [schema]
+```
\ No newline at end of file
diff --git a/docs/user-guide/datasources/sqlserver.md b/docs/user-guide/datasources/sqlserver.md
new file mode 100644
index 0000000..b075cfb
--- /dev/null
+++ b/docs/user-guide/datasources/sqlserver.md
@@ -0,0 +1,25 @@
+### Sqlserver as a datasource
+
+Install datu-core with the sqldb extra
+
+```sh
+pip install "datu-core[sqldb]"
+```
+
+For sqlserver to work, make sure the ODBC driver below is installed on your machine for your operating system.
+
+[Install ODBC driver](https://learn.microsoft.com/en-us/sql/connect/python/pyodbc/step-1-configure-development-environment-for-pyodbc-python-development?view=sql-server-ver16&tabs=windows)
+
+In profiles.yml
+
+```sh
+dev-sqlserver:
+  type: sqlserver
+  driver: 'ODBC Driver 18 for SQL Server' # Mandatory for sqlserver.
+  host: [hostname]
+  user: [username]
+  password: [password]
+  port: [port]
+  dbname: [database name]
+  schema: [schema]
+```
\ No newline at end of file
From ef6dbd45f6eb04ec6827fb67a7818203625a3421 Mon Sep 17 00:00:00 2001
From: Nithin James
Date: Sun, 17 Aug 2025 09:46:31 +0300
Subject: [PATCH 10/10] fixing minor documentation issues and removing examples

---
 docs/README.md                                        |  2 +-
 docs/examples/README.md                               | 13 -------------
 .../deploy/deploy_as_container_service.md             |  2 +-
 docs/user-guide/quickstart.md                         | 11 +++++++++--
 mkdocs.yml                                            | 10 ++++++----
 5 files changed, 17 insertions(+), 21 deletions(-)
 delete mode 100644 docs/examples/README.md

diff --git a/docs/README.md b/docs/README.md
index 4b32ba9..b750800 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -21,7 +21,7 @@ Then follow next steps.
 Ready to learn more? Check out these resources:
 
 - [Quickstart](user-guide/quickstart.md) - A more detailed introduction to Datu core
-- [Examples](examples/README.md) - Examples for connecting multiple datasources.
+- [Datasources](user-guide/datasources/datasources.md) - Connecting multiple datasources.
 
 [Learn how to contribute]({{ server_repo }}/CONTRIBUTING.md) or join our community discussions to shape the future of Datu ❤️.
 
diff --git a/docs/examples/README.md b/docs/examples/README.md
deleted file mode 100644
index 3879843..0000000
--- a/docs/examples/README.md
+++ /dev/null
@@ -1,13 +0,0 @@
-# Examples Overview
-
-The examples directory provides a collection of sample implementations to help you get started with connecting data sources with Datu.
-## Purpose
-
-With Datu, you can quickly connect to your data sources and turn raw information into actionable insights. The sample projects cover everything from running straightforward queries to managing advanced, multi-step analysis pipelines, giving you a clear view of how Datu works in practice.
-
-Each example is designed to highlight proven techniques and practical workflows you can apply to your own analytics tasks. Whether you’re streamlining reports, exploring trends, or building complex data processes, these references show how Datu can be adapted to fit your specific goals.
-## Prerequisites
-
-- Python 3.11 or higher
-- For specific examples, additional requirements may be needed (see individual example READMEs)
-
diff --git a/docs/user-guide/deploy/deploy_as_container_service.md b/docs/user-guide/deploy/deploy_as_container_service.md
index 58bf8ce..0e328a9 100644
--- a/docs/user-guide/deploy/deploy_as_container_service.md
+++ b/docs/user-guide/deploy/deploy_as_container_service.md
@@ -7,7 +7,7 @@ Use below recommended method to run Datu application as container service.
 To deploy your Datu, you need to containerize it using Podman or Docker. The Dockerfile defines how your application is packaged and run. Below is an example Docker file that installs all needed dependencies, the application, and configures the FastAPI server to run via unicorn dockerfile.
 
 ```sh
-FROM python:3.10-slim
+FROM python:3.11-slim
 
 SHELL ["/bin/bash", "-c"]
 RUN apt-get update && \
diff --git a/docs/user-guide/quickstart.md b/docs/user-guide/quickstart.md
index a679065..f499519 100644
--- a/docs/user-guide/quickstart.md
+++ b/docs/user-guide/quickstart.md
@@ -59,7 +59,14 @@ my_sources:
 
 After creating the datasources profiles.yml.
 
-**Environment variables**: Set `DATU_OPENAI_API_KEY`
+### 🔧 Environment Variables
+
+Set the following environment variables:
+
+- **`DATU_OPENAI_API_KEY`** – your OpenAI API key
+- **`DATU_DBT_PROFILES`** – path to your `profiles.yml`
+
+Then run
 
 ```bash
 datu
@@ -75,5 +82,5 @@ To enable debug logs in Datu server .
 
 Ready to learn more? Check out these resources:
 
-- [Examples](../examples/README.md) - Examples for connecting multiple datasources.
+- [Datasources](datasources/datasources.md) - Connecting multiple datasources.
 - [More configurations](configurations.md) - Datu server configurations includes port, schema configurations etc.
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index 333d27d..bcf6610 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -59,13 +59,17 @@ nav:
   - User Guide:
     - Welcome: README.md
     - Quickstart: user-guide/quickstart.md
+    - Datasources:
+      - user-guide/datasources/datasources.md
+      - Postgres: user-guide/datasources/postgres.md
+      - Sqlserver: user-guide/datasources/sqlserver.md
+    - Configurations:
+      - user-guide/configurations.md
     - Deploy:
       - Container service: user-guide/deploy/deploy_as_container_service.md
     - Contribute ❤️: https://github.com/Datuanalytics/datu-core/blob/main/CONTRIBUTING.md
   - Architecture:
     - Overview: architecture/README.md
-  - Examples:
-    - Overview: examples/README.md
     - Contribute ❤️: https://github.com/Datuanalytics/datu-core/blob/main/CONTRIBUTING.md
 
 exclude_docs: |
@@ -92,8 +96,6 @@ plugins:
         User Guide:
           - README.md
           - user-guide/**/*.md
-        Examples:
-          - examples/**/*.md
 
 extra:
   social: