
Commit 1b107ce

Example notebooks: update link URLs from Google Colab to GitHub (#643)
1 parent 048b74f commit 1b107ce

7 files changed, +32 -32 lines changed

api-reference/partition/post-requests.mdx

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; pic
 allowfullscreen
 ></iframe>

-Open the related [notebook](https://colab.research.google.com/drive/1rJOZYZfsTQ_JV2hXaY4kgYvbA7xEWBZn?usp=sharing) that is shown in the preceding video.
+Open the related [notebook](https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_API_Partition_endpoint.ipynb) that is shown in the preceding video.

 To make POST requests to the Unstructured Partition Endpoint, you will need:

api-reference/workflow/overview.mdx

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; pic
 allowfullscreen
 ></iframe>

-Open a related [notebook](https://colab.research.google.com/drive/13f5C9WtUvIPjwJzxyOR3pNJ9K9vnF4ww?usp=sharing) that covers many of
+Open a related [notebook](https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Platform_Workflow_Endpoint_Quickstart.ipynb) that covers many of
 the concepts that are shown in the preceding videos.

 The [Unstructured Python SDK](https://github.com/Unstructured-IO/unstructured-python-client), beginning with version 0.30.6,

examplecode/notebooks.mdx

Lines changed: 26 additions & 26 deletions
@@ -6,85 +6,85 @@ description: "Notebooks contain complete working sample code for end-to-end solu
 ---

 <CardGroup cols={2}>
-<Card title="Getting Started with Unstructured API and IBM watsonx.data" href="https://colab.research.google.com/drive/1RB5ICOXNGo8xbxxcUkPHkSV8rlMKOLAn?usp=sharing">
+<Card title="Getting Started with Unstructured API and IBM watsonx.data" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Azure_to_IBM_WatsonX.ipynb">
 <br/>
 Learn how to create data processing workflows with Unstructured API and its Python SDK to preprocess all of your unstructured data from your Azure Blob Storage into your IBM watsonx.data instance.
 <br/>
 ``Unstructured API`` ``Workflows`` ``Azure Blob Storage`` ``IBM watsonx.data``
 <br/>
 </Card>
-<Card title="Using Unstructured with Snowflake Cortex Search for RAG" href="https://colab.research.google.com/drive/17sWi10DoTNVEkYyEzc-ztEtIxBZHPczp?usp=sharing">
+<Card title="Using Unstructured with Snowflake Cortex Search for RAG" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Use_Unstructured_with_Snowflake_Cortex_for_RAG_Search.ipynb">
 <br/>
 Use Snowflake Cortex and RAG to do natural-language searches across a Snowflake table that contains data provided by Unstructured. Additional Snowflake Cortex functions are also explored.
 <br/>
 ``Unstructured API`` ``Snowflake Cortex`` ``RAG Search`` ``Workflows`` ``S3``
 <br/>
 </Card>
-<Card title="Agentic RAG with LangGraph and Together AI" href="https://colab.research.google.com/drive/16JYOV3JwP2PJpQx-PGFshC64Cf91G7D1?usp=sharing">
+<Card title="Agentic RAG with LangGraph and Together AI" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/AgenticRAG_with_LangGraph,TogetherAI.ipynb">
 <br/>
 Build Agentic RAG with `LangGraph` and `Together AI` and compare the results with Vanilla RAG in pure Python
 <br/>
 ``Unstructured API`` ``Workflows`` ``Agents`` ``LangGraph`` ``Together AI`` ``Astra DB``
 <br/>
 </Card>
-<Card title="Getting Started with Unstructured API and Snowflake" href="https://colab.research.google.com/drive/1Q7TClKZP7U3d2zLucqJgE-Es7sGD6-bX?usp=sharing">
+<Card title="Getting Started with Unstructured API and Snowflake" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Getting_Started_with_Unstructured_API_and_Snowflake.ipynb">
 <br/>
 Learn how to create data processing workflows with Unstructured API and its Python SDK to preprocess all of your unstructured data from your Azure Blob Storage into your Snowflake Table.
 <br/>
 ``Unstructured API`` ``Workflows`` ``Azure Blob Storage`` ``Snowflake``
 <br/>
 </Card>
-<Card title="Building Graph-Based RAG Applications" href="https://colab.research.google.com/drive/1z-eJPqQEFoZKx7hgi6uq0cbkF44JszlZ?usp=sharing">
+<Card title="Building Graph-Based RAG Applications" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Building_Graph_Based_RAG_Applications_with_Unstructured_and_AstraDB.ipynb">
 <br/>
 Learn how to use the Unstructured API to create a Graph RAG-based workflow that writes data with named entity recognition (NER) to your Astra DB.
 <br/>
 ``Unstructured API`` ``Workflows`` ``Graph RAG`` ``NER`` ``Astra DB``
 <br/>
 </Card>
-<Card title="Getting Started with Unstructured API and Delta Tables in Databricks" href="https://colab.research.google.com/drive/1ujLZoLx0ai0GAjvr9Xmze4UReZjKa8cU?usp=sharing">
+<Card title="Getting Started with Unstructured API and Delta Tables in Databricks" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Getting_Started_with_Unstructured_API_and_Delta_Tables_in_Databricks.ipynb">
 <br/>
 Learn how to create data processing workflows with Unstructured API and its Python SDK to preprocess all of your unstructured data into your Delta Table.
 <br/>
 ``Unstructured API`` ``Workflows`` ``Databricks`` ``S3``
 <br/>
 </Card>
-<Card title="RAG for Online Documentation" href="https://colab.research.google.com/drive/1F1LkM_HwuwruQb5rwsK5Fj_up15hHeLi?usp=sharing">
+<Card title="RAG for Online Documentation" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/RAG_for_documentation.ipynb">
 <br/>
 Crawl websites with Firecrawl and build a RAG workflow powered by Unstructured and MongoDB Atlas vector search.
 <br/>
 ``Unstructured API`` ``Workflows`` ``MongoDB``
 <br/>
 </Card>
-<Card title="Unstructured Workflow Endpoint Quickstart" href="https://colab.research.google.com/drive/13f5C9WtUvIPjwJzxyOR3pNJ9K9vnF4ww">
+<Card title="Unstructured Workflow Endpoint Quickstart" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Platform_Workflow_Endpoint_Quickstart.ipynb">
 <br/>
 Build an end-to-end workflow in Unstructured programmatically by using the Unstructured Workflow Endpoint.
 <br/>
 ``Unstructured API`` ``Workflows`` ``S3``
 <br/>
 </Card>
-<Card title="RAG with Databricks Vector Search with Context from Multiple Sources" href="https://colab.research.google.com/drive/1ZsrqYVhBAqsr6L98xlVjTZltMzNi_P3o?usp=sharing">
+<Card title="RAG with Databricks Vector Search with Context from Multiple Sources" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Delta_Tables_Databricks_Multiple_Sources.ipynb">
 <br/>
 Build RAG with Databricks Vector Search with context preprocessed from multiple sources by Unstructured.
 <br/>
 ``Databricks`` ``Introductory notebook``
 <br/>
 </Card>

-<Card title="Agentic RAG with Hugging Face smolagents vs Vanilla RAG" href="https://colab.research.google.com/drive/1hG3dPgd8wjrO9wSD0K0Feo7EY1iXqrEN?usp=sharing">
+<Card title="Agentic RAG with Hugging Face smolagents vs Vanilla RAG" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Agentic_RAG_with_HuggingFace_smolagents.ipynb">
 <br/>
 Build Agentic RAG with `smolagents` library and compare the results with Vanilla RAG in pure Python
 <br/>
 ``GPT-4o`` ``smolagents`` ``Agents`` ``DataStax`` ``S3`` ``Advanced notebook``
 <br/>
 </Card>
-<Card title="LLama3.2 RAG evaluation on unstructured text" href="https://colab.research.google.com/drive/14y3QOmhetk6NvJfT3HO4Tw6-Ra-LIDA9?usp=sharing">
+<Card title="LLama3.2 RAG evaluation on unstructured text" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Llama3_2_RAG_evaluation_on_Unstructured_Text_via_VLM.ipynb">
 <br/>
 Evaluate Llama3.2 for your RAG system with Unstructured, GPT-4o, Ragas, and LangChain
 <br/>
 ``GPT-4o`` ``Ragas`` ``LangChain`` ``Llama3.2`` ``Pinecone`` ``S3`` ``Advanced notebook``
 <br/>
 </Card>
-<Card title="Multimodal RAG: Enhancing RAG outputs with image results" href="https://colab.research.google.com/drive/1gBI67HKyepmpAzf0T5yWwBwIqVXK1Ea_?usp=sharing">
+<Card title="Multimodal RAG: Enhancing RAG outputs with image results" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Multimodal_RAG_with_image_results.ipynb">
 <br/>
 Process a file in S3 with Unstructured and return images in your RAG output
 <br/>
@@ -98,93 +98,93 @@ description: "Notebooks contain complete working sample code for end-to-end solu
 ``Unstructured API`` ``Hex`` ``Advanced notebook``
 <br/>
 </Card>
-<Card title="PII removal with GLiNER in unstructured data ETL" href="https://colab.research.google.com/drive/1HwOMnGjrNbcHZ1vlhaAG0MSDBcwQfexF?usp=sharing">
+<Card title="PII removal with GLiNER in unstructured data ETL" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/PII_removal_in_unstructured_data_ETL.ipynb">
 <br/>
 Remove Personally Identifiable Information (PII) as a part of unstructured data preprocessing.
 <br/>
 ``Unstructured API`` ``PII`` ``GLiNER`` ``Advanced notebook``
 <br/>
 </Card>
-<Card title="Custom metadata extraction and self-querying retrieval" href="https://colab.research.google.com/drive/1v2SKPEmCr0p2AFyckrYvC3heYKyKff_t?usp=sharing">
+<Card title="Custom metadata extraction and self-querying retrieval" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/custom_metadata_self_querying_rag_mongodb_unstructured_langgraph.ipynb">
 <br/>
 Extract custom metadata, and enable metadata pre-filtering in your RAG.
 <br/>
 ``Unstructured API`` ``MongoDB`` ``Metadata`` ``Advanced notebook``
 <br/>
 </Card>
-<Card title="Selecting an embedding model for custom data" href="https://colab.research.google.com/drive/132oXSGSOyzZ7GO9pJhRKlvwY4F-i9Pm6?usp=sharing">
+<Card title="Selecting an embedding model for custom data" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Selecting_an_embedding_model_for_custom_data.ipynb">
 <br/>
 End-to-end data processing pipeline using Unstructured Serverless API.
 <br/>
 ``Unstructured API`` ``Hugging Face`` ``Advanced notebook``
 <br/>
 </Card>
-<Card title="RAG with PDFs, LangChain and Llama 3" href="https://colab.research.google.com/drive/1BJYYyrPVe0_9EGyXqeNyzmVZDrCRZwsg?usp=sharing">
+<Card title="RAG with PDFs, LangChain and Llama 3" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/RAG_Llama3_Unstructured_LangChain.ipynb">
 <br/>
 A RAG system with the Llama 3 model from Hugging Face.
 <br/>
 ```Unstructured API``` ```🤗 Hugging Face``` ```LangChain``` ```Llama 3``` ``Introductory notebook``
 </Card>
-<Card title="Unstructured data ETL from S3 to SingleStore DB" href="https://colab.research.google.com/drive/1Krvn5XlYNERQe7DNIXKEz3AmESJdABLF?usp=sharing">
+<Card title="Unstructured data ETL from S3 to SingleStore DB" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_data_ETL_from_S3_to_SingleStore.ipynb">
 <br/>
 Learn to ingest, partition, chunk, embed and load data from an S3 bucket into SingleStore DB.
 <br/>
 ```Unstructured API``` ```SingleStoreDB``` ```AWS S3``` ``Introductory notebook``
 </Card>
-<Card title="Google Drive to DataStax Astra DB" href="https://colab.research.google.com/drive/1Img_qGCTKavImbz7dlRtxg8mNUjiWjJT?usp=sharing">
+<Card title="Google Drive to DataStax Astra DB" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Google_Docs_to_Astra.ipynb">
 <br/>
 Embed your Google Drive Docs in an Astra Vector Database with Unstructured Serverless API
 <br/>
 ``Unstructured API`` ``Google`` ``DataStax`` ``Introductory notebook``
 <br/>
 </Card>
-<Card title="Weaviate RAG quickstart" href="https://colab.research.google.com/drive/1lYejikVtbPraWvh9CtxBazJuXKuJ38wu?usp=sharing">
+<Card title="Weaviate RAG quickstart" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Weaviate_Quickstart_OpenAI.ipynb">
 <br/>
 Embed your local documents in an Weaviate Vector Database with Unstructured Serverless API
 <br/>
 ``Unstructured API`` ``OpenAI`` ``Weaviate`` ``Introductory notebook``
 <br/>
 </Card>
-<Card title="Preprocess PDFs in AWS S3, load into Elasticsearch" href="https://colab.research.google.com/drive/1axADo7T_dMkeOWnZ5C4dKve16-wrtQuV?usp=sharing">
+<Card title="Preprocess PDFs in AWS S3, load into Elasticsearch" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/S3_to_Elasticsearch_with_Unstructured.ipynb">
 <br/>
 Ingest PDF documents from an S3 bucket, transform them into a normalized JSON with Unstructured Serverless API, chunk, embed and load into Elasticsearch.
 <br/>
 ``Unstructured API`` ``AWS S3`` ``Elasticsearch`` ``Introductory notebook``
 <br/>
 </Card>
-<Card title="Preprocess documents in Google Drive, load into Databricks Volume" href="https://colab.research.google.com/drive/1gVd03geFUD_OTROMuhjVAHYQvgVbViq7?usp=sharing">
+<Card title="Preprocess documents in Google Drive, load into Databricks Volume" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/GoogleDrive_to_Databricks_Connector.ipynb">
 <br/>
 Preprocess documents from a Google Drive Unstructured Serverless API and load them into Databricks Volume.
 <br/>
 ``Unstructured API`` ``Google Drive`` ``Databricks`` ``Introductory notebook``
 <br/>
 </Card>
-<Card title="Source references in RAG responses" href="https://colab.research.google.com/drive/1Lc8eq8P87JjzUhbYb33_c7h7njsWb-hn?usp=sharing">
+<Card title="Source references in RAG responses" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/RAG_on_arXiv_papers_with_source_references.ipynb">
 <br/>
 Add document source references to RAG responses based on documents metadata.
 <br/>
 ``Unstructured API`` ``RAG`` ``LangChain`` ``Intermediate notebook``
 <br/>
 </Card>
-<Card title="Query processed PDF with HuggingChat" href="https://colab.research.google.com/drive/1rNVVX5qo7vyBwR7wTa-zS6lDMkKpTei0?usp=sharing">
+<Card title="Query processed PDF with HuggingChat" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/PDF_with_Unstructured_and_HuggingChat.ipynb">
 <br/>
 Send a PDF to Unstructured for processing, and send a subset of the returned PDF's processed text to [HuggingChat](https://huggingface.co/chat/) for chatbot-style querying.
 <br/>
 ```Unstructured API``` ```🤗 Hugging Face``` ```🤗 HuggingChat``` ``Introductory notebook``
 </Card>
-<Card title="Llama 3 Local RAG with emails" href="https://colab.research.google.com/drive/1ieDJ4LoxARrHFqxXWif8Lv8e8aZTgmtH?usp=sharing">
+<Card title="Llama 3 Local RAG with emails" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Local_RAG_with_emails.ipynb">
 <br/>
 Build a local RAG app for your emails with Unstructured, LangChain and Ollama.
 <br/>
 ```Unstructured API``` ```LangChain``` ```Ollama``` ```Llama 3``` ``Introductory notebook``
 </Card>
-<Card title="Building RAG With PowerPoint presentations" href="https://colab.research.google.com/drive/1l_e7CyqfBUxBBDc6E6XIKtKV-8YWvmRX?usp=sharing">
+<Card title="Building RAG With PowerPoint presentations" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Building_RAG_with_Powerpoint_presentations.ipynb">
 <br/>
 A RAG solution that is based on PowerPoint files.
 <br/>
 ```Unstructured API``` ```🤗 Hugging Face``` ```LangChain``` ```Llama 3``` ``Introductory notebook``
 </Card>
-<Card title="Synthetic test dataset generation" href="https://colab.research.google.com/drive/1VvOauC46xXeZrhh8nlTyv77yvoroMQjr?usp=sharing">
+<Card title="Synthetic test dataset generation" href="https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/RAG_synthetic_test_data_with_Unstructured_GPT_4o_and_Ragas.ipynb">
 <br/>
 Build a Synthetic Test Dataset for your RAG system in 5 easy steps
 <br/>

open-source/introduction/quick-start.mdx

Lines changed: 1 addition & 1 deletion
@@ -211,7 +211,7 @@ import SharedOSSSingleFile from '/snippets/general-shared-text/multi-file-oss-us
 - Learn about [available cleaning functions](/open-source/core-functionality/cleaning) for cleaning up your document elements' data as needed.
 - Learn about [available extraction functions](/open-source/core-functionality/extracting) for getting precise information out of your document elements as needed.
 - Learn about how to [generate vector embeddings](/open-source/core-functionality/embedding) for the text in your document elements for use in RAG applications, AI agents, model fine-tuning tasks, and more.
-- For an additional code example, see the [Unstructured Quick Tour](https://colab.research.google.com/drive/1U8VCjY2-x8c6y5TYMbSFtQGlQVFHCVIW) Google Colab notebook.
+- For an additional code example, see the [Unstructured Quick Tour](https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Quick_Tour.ipynb) Google Colab notebook.
 - The Unstructured open source library is also available as a [Docker container](/open-source/installation/docker-installation).
 - The [Unstructured Ingest CLI and Unstructured Ingest Python library](/ingestion/overview) build upon the Unstructured open source library by providing additional functionality such as batch file processing,
 ingesting files from remote source locations and sending the processed files' data to remote destination locations, creating programmatic ETL pipelines, optionally processing files on Unstructured-hosted compute resource instead of locally for improved performance and quality on a pay-as-you-go basis, and more.

snippets/general-shared-text/first-time-api-destination-connector.mdx

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 [source connector](/api-reference/workflow/sources/overview) to a [workflow](/api-reference/workflow/overview#workflows).
 Then run the worklow as a [job](/api-reference/workflow/overview#jobs). To learn how, try out the
 [hands-on Workflow Endpoint quickstart](/api-reference/workflow/overview#quickstart),
-go directly to the [quickstart notebook](https://colab.research.google.com/drive/13f5C9WtUvIPjwJzxyOR3pNJ9K9vnF4ww),
+go directly to the [quickstart notebook](https://colab.research.google.com/github/Unstructured-IO/notebooks/blob/main/notebooks/Unstructured_Platform_Workflow_Endpoint_Quickstart.ipynb),
 or watch the two 4-minute video tutorials for the [Unstructured Python SDK](/api-reference/workflow/overview#unstructured-python-sdk).

 You can also create destination connectors with the Unstructured user interface (UI).
