This repository was archived by the owner on Oct 15, 2025. It is now read-only.

Commit 7b4f875

docs: Update README.md
1 parent 777d391 commit 7b4f875

1 file changed: +30 -24 lines changed

README.md

Lines changed: 30 additions & 24 deletions
@@ -1,4 +1,4 @@
-# EvaDB AI-SQL Database System
+# EvaDB: Database System for AI Apps

 <div>
 <a href="https://colab.research.google.com/github/georgia-tech-db/eva/blob/master/tutorials/03-emotion-analysis.ipynb">
@@ -24,9 +24,9 @@

 <p align="center"> <b><h3>EvaDB is a database system for building simpler and faster AI-powered applications.</b></h3> </p>

-EvaDB is a database system for developing AI apps. We aim to simplify the development and deployment of AI-powered apps that operate on unstructured data (text documents, videos, PDFs, podcasts, etc.) and structured data (tables, vector index).
+EvaDB is a database system for developing AI apps. We aim to simplify the development and deployment of AI apps that operate on unstructured data (text documents, videos, PDFs, podcasts, etc.) and structured data (tables, vector index).

-The high-level Python and SQL APIs allow beginners to use EvaDB in a few lines of code. Advanced users can define custom user-defined functions that wrap around any AI model or Python library. EvaDB is fully implemented in Python and licensed under the Apache license.
+The high-level Python and SQL APIs allow beginners to use EvaDB in a few lines of code. Advanced users can define custom user-defined functions that wrap around any AI model or Python library. EvaDB is fully implemented in Python and licensed under an Apache license.

 ## Quick Links

@@ -38,17 +38,17 @@ The high-level Python and SQL APIs allow beginners to use EvaDB in a few lines o

 ## Features

-- 🔮 Build simpler AI-powered applications using Python functions or SQL queries
+- 🔮 Build simpler AI-powered apps using Python functions or SQL queries
 - ⚡️ 10x faster applications using AI-centric query optimization
-- 💰 Save money spent on GPUs
+- 💰 Save money spent on inference
 - 🚀 First-class support for your custom deep learning models through user-defined functions
 - 📦 Built-in caching to eliminate redundant model invocations across queries
-- ⌨️ First-class support for PyTorch, Hugging Face, YOLO, and Open AI models
+- ⌨️ Integrations for PyTorch, Hugging Face, YOLO, and Open AI models
 - 🐍 Installable via pip and fully implemented in Python

 ## Illustrative Applications

-Here are some illustrative EvaDB-powered applications (each Jupyter notebook can be opened on Google Colab):
+Here are some illustrative AI apps built using EvaDB (each notebook can be opened on Google Colab):

 * 🔮 <a href="https://evadb.readthedocs.io/en/stable/source/tutorials/13-privategpt.html">PrivateGPT</a>
 * 🔮 <a href="https://evadb.readthedocs.io/en/stable/source/tutorials/08-chatgpt.html">ChatGPT-based Video Question Answering</a>
@@ -68,43 +68,41 @@ Here are some illustrative EvaDB-powered applications (each Jupyter notebook can

 ## Quick Start

-- Step 1: Install EvaDB using pip. EvaDB supports Python versions >= `3.8`:
+- Step 1: Install EvaDB using `pip`. EvaDB supports Python versions >= `3.8`:

 ```shell
 pip install evadb
 ```

-- Step 2: Write your AI app!
+- Step 2: It's time to write an AI app.

 ```python
 import evadb

-# Grab a EvaDB cursor to load data and run queries
+# Grab a EvaDB cursor to load data into tables and run AI queries
 cursor = evadb.connect().cursor()

 # Load a collection of news videos into the 'news_videos' table
-# This command returns a Pandas Dataframe with the query's output
-# In this case, the output indicates the number of loaded videos
+# This function returns a Pandas dataframe with the query's output
+# In this case, the output dataframe indicates the number of loaded videos
 cursor.load(
 file_regex="news_videos/*.mp4",
 format="VIDEO",
 table_name="news_videos"
 ).df()

 # Define a function that wraps around your deep learning model
-# Here, this function wraps around an off-the-shelf speech-to-text (Whisper) model
-# Such functions are known as user-defined functions or UDFs
-# So, we are creating a Whisper UDF here
-# After creating the UDF, we can use the function in any query
-cursor.create_udf(
+# Here, this function wraps around a speech-to-text model
+# After registering the function, we can use the registered function in subsequent queries
+cursor.create_function(
 udf_name="SpeechRecognizer",
 type="HuggingFace",
 task='automatic-speech-recognition',
 model='openai/whisper-base'
 ).df()

-# EvaDB automatically extract the audio from the video
-# We only need to run the SpeechRecongizer UDF on the 'audio' column
+# EvaDB automatically extracts the audio from the video
+# We only need to run the SpeechRecongizer function on the 'audio' column
 # to get the transcript and persist it in a table called 'transcripts'
 cursor.query(
 """CREATE TABLE transcripts AS
@@ -123,13 +121,16 @@ os.environ["OPENAI_KEY"] = OPENAI_KEY
 query = query.select("ChatGPT('Is this video summary related to LLMs', text)")

 # Finally, we run the query to get the results as a dataframe
+# You can then post-process the dataframe using other Python libraries
 response = query.df()
 ```

-- **Chain multiple models in a single query to set up useful AI pipelines**
+- **Incrementally build an AI query that chains together multiple models**
+
+Here is a AI query that analyses emotions of actors in an `Interstellar` movie clip using multiple PyTorch models.

 ```python
-# Analyse emotions of actors in an Interstellar movie clip using PyTorch models
+# Access the Interstellar movie clip table using a cursor
 query = cursor.table("Interstellar")
 # Get faces using a `FaceDetector` function
 query = query.cross_apply("UNNEST(FaceDetector(data))", "Face(bounding_box, confidence)")
@@ -139,9 +140,14 @@ query = query.filter("id > 100 AND id < 200")
 query = query.select("id, bbox, EmotionDetector(Crop(data, bounding_box))")

 # Run the query and get the query result as a dataframe
+# At each of the above steps, you can run the query and see the output
+# If you are familiar with SQL, you can get the SQL query with query.sql_query()
 response = query.df()
 ```
-- **EvaDB runs AI apps 10--100x faster using its AI-centric query optimizer**. Three key built-in optimizations are:
+
+- **EvaDB runs AI apps 10x faster using its AI-centric query optimizer**.
+
+Three key built-in optimizations are:

 💾 **Caching**: EvaDB automatically caches and reuses model inference results.

@@ -153,7 +159,7 @@ response = query.df()

 This diagram presents the key components of EvaDB. EvaDB's AI-centric query optimizer takes a query as input and generates a query plan that is executed by the query engine. The query engine hits the relevant storage engines to quickly retrieve the data required for efficiently running the query:
 1. Structured data (SQL database system connected via `sqlalchemy`).
-2. Unstructured media data (on cloud buckets/local filesystem).
+2. Unstructured media data (PDFs, videos, etc. on cloud/local filesystem).
 3. Feature data (vector database system).

 <img width="500" alt="Architecture Diagram" src="https://github.com/georgia-tech-db/eva/assets/5521975/01452ec9-87d9-4d27-90b2-c0b1ab29b16c">
@@ -212,5 +218,5 @@ For more information, see our
 [contribution guide](https://evadb.readthedocs.io/en/stable/source/contribute/index.html).

 ## License
-Copyright (c) 2018-present [Georgia Tech Database Group](http://db.cc.gatech.edu/).
+Copyright (c) 2018--present [Georgia Tech Database Group](http://db.cc.gatech.edu/).
 Licensed under [Apache License](LICENSE).
