Commit 438b8e3

tests: clean astra between notebook tests (#326)
1 parent 5976e1c commit 438b8e3

19 files changed: +177 -41 lines changed

.github/actions/lint/action.yml

Lines changed: 11 additions & 0 deletions
@@ -15,3 +15,14 @@ runs:
       run: |
         tox -c ragstack-e2e-tests -e lint
         tox -e lint-yaml
+
+    - name: Notebook
+      shell: bash
+      run: |
+        python scripts/format-example-notebooks.py
+        if [ -n "$(git status --porcelain)" ]; then
+          echo "Notebooks are not formatted"
+          echo "Please run 'python scripts/format-example-notebooks.py' and commit the changes."
+          git status
+          exit 1
+        fi
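
The Notebook step above fails the lint job when running the formatter leaves anything uncommitted in the working tree. The same check can be sketched in Python as follows. The function names `dirty_paths` and `notebooks_are_formatted` are hypothetical and not part of this commit, and the parsing assumes the plain (v1) `git status --porcelain` layout of two status characters, a space, then the path.

```python
import subprocess


def dirty_paths(porcelain: str) -> list[str]:
    """Return the paths listed in `git status --porcelain` (v1) output."""
    # Each entry is two status characters, one space, then the path.
    return [line[3:] for line in porcelain.splitlines() if line.strip()]


def notebooks_are_formatted(repo_dir: str = ".") -> bool:
    """Hypothetical helper: True when the working tree is clean.

    Intended to be called after the formatter script has run, mirroring
    the shell test `[ -n "$(git status --porcelain)" ]` in the action.
    """
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    changed = dirty_paths(result.stdout)
    for path in changed:
        print(f"not formatted or not committed: {path}")
    return not changed
```

Like the shell version, this treats any modified or untracked file as a failure, not only notebooks.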

.github/workflows/ci-e2e-tests.yml

Lines changed: 4 additions & 0 deletions
@@ -27,9 +27,13 @@ jobs:
       astradb-prod-region: "us-east-1"
       astradb-prod-cloud: "aws"
       is-scheduled: ${{ github.event_name == 'schedule' || github.event_name == 'workflow_dispatch' }}
+      # yamllint disable-line rule:line-length
       is-ragstack-dev-cron: ${{ github.event_name == 'workflow_dispatch' || (github.event_name == 'schedule' && github.event.schedule == '0 0/4 * * *') }}
+      # yamllint disable-line rule:line-length
       is-ragstack-latest-release-cron: ${{ github.event_name == 'workflow_dispatch' || (github.event_name == 'schedule' && github.event.schedule == '0 1/4 * * *') }}
+      # yamllint disable-line rule:line-length
       is-langchain-dev-cron: ${{ github.event_name == 'workflow_dispatch' || (github.event_name == 'schedule' && github.event.schedule == '0 2/4 * * *') }}
+      # yamllint disable-line rule:line-length
       is-llamaindex-dev-cron: ${{ github.event_name == 'workflow_dispatch' || (github.event_name == 'schedule' && github.event.schedule == '0 3/4 * * *') }}
     steps:
       - uses: actions/checkout@v4
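
The four cron flags in the hunk above differ only in the hour field (`0/4`, `1/4`, `2/4`, `3/4`), which staggers the scheduled runs so that each one owns a distinct set of hours. A small Python sketch of that arithmetic, assuming the usual `start/step` reading of the hour field; the helper names `hours_for` and `hours_by_flag` are hypothetical.

```python
# Cron expressions copied from the workflow; keys are the flag names.
SCHEDULES = {
    "is-ragstack-dev-cron": "0 0/4 * * *",
    "is-ragstack-latest-release-cron": "0 1/4 * * *",
    "is-langchain-dev-cron": "0 2/4 * * *",
    "is-llamaindex-dev-cron": "0 3/4 * * *",
}


def hours_for(hour_field: str) -> list[int]:
    """Expand a `start/step` hour field into the hours of the day it matches."""
    start, step = hour_field.split("/")
    return list(range(int(start), 24, int(step)))


def hours_by_flag(schedules: dict[str, str]) -> dict[str, list[int]]:
    """Map each flag to the hours at which its schedule fires."""
    # The hour is the second of the five cron fields.
    return {flag: hours_for(expr.split()[1]) for flag, expr in schedules.items()}


if __name__ == "__main__":
    for flag, hours in hours_by_flag(SCHEDULES).items():
        print(flag, hours)
```

Under that reading the four schedules never collide and together cover every hour of the day once.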

examples/notebooks/FLARE.ipynb

Lines changed: 9 additions & 2 deletions
@@ -53,7 +53,14 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "nbmake": {
+     "post_cell_execute": [
+      "from conftest import before_notebook",
+      "before_notebook()"
+     ]
+    }
+   },
    "outputs": [],
    "source": [
     "! pip install ragstack-ai"
@@ -699,4 +706,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}
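
Each notebook in this commit gains the same `nbmake` metadata entry on an early code cell, so that `before_notebook()` runs after that cell executes under nbmake. A minimal sketch of applying that entry programmatically, assuming it belongs on the first code cell; `add_post_cell_hook` is a hypothetical helper, not something the commit adds.

```python
import copy

# The two statements the commit places in "post_cell_execute".
HOOK = ["from conftest import before_notebook", "before_notebook()"]


def add_post_cell_hook(notebook: dict, statements: list[str]) -> dict:
    """Return a copy of the notebook with the hook on its first code cell.

    `notebook` is the parsed JSON of an .ipynb file (nbformat 4 layout).
    """
    updated = copy.deepcopy(notebook)
    for cell in updated.get("cells", []):
        if cell.get("cell_type") == "code":
            nbmake = cell.setdefault("metadata", {}).setdefault("nbmake", {})
            nbmake["post_cell_execute"] = list(statements)
            break  # only the first code cell is changed
    return updated


if __name__ == "__main__":
    example = {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
            {"cell_type": "code", "metadata": {}, "source": ["! pip install ragstack-ai"]},
        ],
        "nbformat": 4,
        "nbformat_minor": 4,
    }
    print(add_post_cell_hook(example, HOOK)["cells"][1]["metadata"])
```

The input is left untouched, which makes it easy to compare the notebook before and after.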

examples/notebooks/QA_with_cassio.ipynb

Lines changed: 9 additions & 3 deletions
@@ -19,11 +19,11 @@
     "This notebook guides you through setting up [RAGStack](https://www.datastax.com/products/ragstack) using [Astra Vector Search](https://docs.datastax.com/en/astra-serverless/docs/vector-search/overview.html), [OpenAI](https://openai.com/about), and [CassIO](https://cassio.org/) to implement a generative Q&A over your own documentation.\n",
     "\n",
     "## Astra Vector Search\n",
-    "Astra vector search enables developers to search a database by context or meaning rather than keywords or literal values. This is done by using “embeddings”. Embeddings are a type of representation used in machine learning where high-dimensional or complex data is mapped onto vectors in a lower-dimensional space. These vectors capture the semantic properties of the input data, meaning that similar data points have similar embeddings.\n",
+    "Astra vector search enables developers to search a database by context or meaning rather than keywords or literal values. This is done by using \u201cembeddings\u201d. Embeddings are a type of representation used in machine learning where high-dimensional or complex data is mapped onto vectors in a lower-dimensional space. These vectors capture the semantic properties of the input data, meaning that similar data points have similar embeddings.\n",
     "Reference: [Astra Vector Search](https://docs.datastax.com/en/astra-serverless/docs/vector-search/overview.html)\n",
     "\n",
     "## CassIO\n",
-    "CassIO is the ultimate solution for seamlessly integrating Apache Cassandra® with generative artificial intelligence and other machine learning workloads. This powerful Python library simplifies the complicated process of accessing the advanced features of the Cassandra database, including vector search capabilities. With CassIO, developers can fully concentrate on designing and perfecting their AI systems without any concerns regarding the complexities of integration with Cassandra.\n",
+    "CassIO is the ultimate solution for seamlessly integrating Apache Cassandra\u00ae with generative artificial intelligence and other machine learning workloads. This powerful Python library simplifies the complicated process of accessing the advanced features of the Cassandra database, including vector search capabilities. With CassIO, developers can fully concentrate on designing and perfecting their AI systems without any concerns regarding the complexities of integration with Cassandra.\n",
     "Reference: [CassIO](https://cassio.org/)\n",
     "\n",
     "## OpenAI\n",
@@ -72,6 +72,12 @@
     },
     "editable": true,
     "id": "a6d88d66",
+    "nbmake": {
+     "post_cell_execute": [
+      "from conftest import before_notebook",
+      "before_notebook()"
+     ]
+    },
     "outputId": "dc543d17-3fb2-4362-cc4e-0050bd7787ba",
     "slideshow": {
      "slide_type": ""
@@ -543,4 +549,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}

examples/notebooks/RAG_with_cassio.ipynb

Lines changed: 7 additions & 1 deletion
@@ -67,6 +67,12 @@
      "base_uri": "https://localhost:8080/"
     },
     "id": "2953d95b",
+    "nbmake": {
+     "post_cell_execute": [
+      "from conftest import before_notebook",
+      "before_notebook()"
+     ]
+    },
     "outputId": "f1d1a4dc-9984-405d-bf28-74697815ea12"
    },
    "outputs": [],
@@ -472,4 +478,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}

examples/notebooks/__init__.py

Whitespace-only changes.

examples/notebooks/advancedRAG.ipynb

Lines changed: 9 additions & 2 deletions
@@ -55,7 +55,14 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {},
+   "metadata": {
+    "nbmake": {
+     "post_cell_execute": [
+      "from conftest import before_notebook",
+      "before_notebook()"
+     ]
+    }
+   },
    "outputs": [],
    "source": [
     "%pip install ragstack-ai"
@@ -1035,4 +1042,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 4
-}
+}

examples/notebooks/astradb.ipynb

Lines changed: 10 additions & 3 deletions
@@ -13,7 +13,7 @@
    "source": [
     "# Astra DB\n",
     "\n",
-    "This page provides a quickstart for using [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) and [Apache Cassandra®](https://cassandra.apache.org/) as a Vector Store.\n",
+    "This page provides a quickstart for using [Astra DB](https://docs.datastax.com/en/astra/home/astra.html) and [Apache Cassandra\u00ae](https://cassandra.apache.org/) as a Vector Store.\n",
     "\n",
     "_Note: in addition to access to the database, an OpenAI API Key is required to run the full example._"
    ]
@@ -39,7 +39,14 @@
    "cell_type": "code",
    "execution_count": null,
    "id": "8d00fcf4-9798-4289-9214-d9734690adfc",
-   "metadata": {},
+   "metadata": {
+    "nbmake": {
+     "post_cell_execute": [
+      "from conftest import before_notebook",
+      "before_notebook()"
+     ]
+    }
+   },
    "outputs": [],
    "source": [
     "!pip install --quiet datasets pypdf"
@@ -902,4 +909,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 5
-}
+}

examples/notebooks/conftest.py

Lines changed: 2 additions & 4 deletions
@@ -28,7 +28,7 @@ def get_required_env(name) -> str:
 )


-def try_delete_with_backoff(collection: str, sleep=1, max_tries=5):
+def try_delete_with_backoff(collection: str, sleep=1, max_tries=2):
     try:
         logging.info(f"deleting collection {collection}")
         response = client.delete_collection(
@@ -45,10 +45,8 @@ def try_delete_with_backoff(collection: str, sleep=1, max_tries=5):
         try_delete_with_backoff(collection, sleep * 2, max_tries)


-@pytest.fixture
-def cleanup_astra():
+def before_notebook():
     collections = client.get_collections().get("status").get("collections")
     logging.info(f"Existing collections: {collections}")
     for collection in collections:
         try_delete_with_backoff(collection)
-    yield
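
The diff shows only the edges of `try_delete_with_backoff`: it calls `client.delete_collection`, retries with `sleep * 2`, and now defaults to `max_tries=2`. The retry pattern it appears to follow can be sketched in isolation as below. This is an iterative illustration under stated assumptions, not the repository's implementation: `FlakyClient`, `delete_with_backoff`, and the `sleeper` parameter are hypothetical, and the real client and its error handling are not visible in this commit.

```python
import logging
import time


class FlakyClient:
    """Hypothetical stand-in for the Astra client; fails a set number of times."""

    def __init__(self, failures: int):
        self.failures = failures
        self.deleted = []

    def delete_collection(self, name: str) -> dict:
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("transient error")
        self.deleted.append(name)
        return {"status": {"ok": 1}}


def delete_with_backoff(client, collection, sleep=1.0, max_tries=2, sleeper=time.sleep):
    """Delete a collection, doubling the wait after each failed attempt."""
    for attempt in range(1, max_tries + 1):
        try:
            return client.delete_collection(collection)
        except Exception:
            if attempt == max_tries:
                raise  # out of attempts: surface the last error
            logging.warning("delete of %s failed, retrying in %ss", collection, sleep)
            sleeper(sleep)
            sleep *= 2


if __name__ == "__main__":
    client = FlakyClient(failures=1)
    delete_with_backoff(client, "demo", sleep=0.1)
    print(client.deleted)
```

Passing the sleep function in as a parameter keeps the sketch testable without real waiting; with `max_tries=2` a single transient failure is absorbed and a second one is raised to the caller.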
