Commit c7066ee

2 parents 06b2c58 + bec4bf5 commit c7066ee

103 files changed, +14916 -3818 lines changed


README.md

Lines changed: 27 additions & 3 deletions
@@ -127,8 +127,6 @@ Allow unauthenticated request : Yes
 ## ENV
 | Env Variable Name | Mandatory/Optional | Default Value | Description |
 |-------------------------|--------------------|---------------|--------------------------------------------------------------------------------------------------|
-| OPENAI_API_KEY | Mandatory | | API key for OpenAI |
-| DIFFBOT_API_KEY | Mandatory | | API key for Diffbot |
 | EMBEDDING_MODEL | Optional | all-MiniLM-L6-v2 | Model for generating the text embedding (all-MiniLM-L6-v2 , openai , vertexai) |
 | IS_EMBEDDING | Optional | true | Flag to enable text embedding |
 | KNN_MIN_SCORE | Optional | 0.94 | Minimum score for KNN algorithm |
@@ -155,9 +153,35 @@ Allow unauthenticated request : Yes
 | GCS_FILE_CACHE | Optional | False | If set to True, will save the files to process into GCS. If set to False, will save the files locally |
 | ENTITY_EMBEDDING | Optional | False | If set to True, It will add embeddings for each entity in database |
 | LLM_MODEL_CONFIG_ollama_<model_name> | Optional | | Set ollama config as - model_name,model_local_url for local deployments |
+| RAGAS_EMBEDDING_MODEL | Optional | openai | Embedding model used by the Ragas evaluation framework |
 
 
-
+## For local LLMs (Ollama)
+1. Pull the Ollama Docker image
+```bash
+docker pull ollama/ollama
+```
+2. Run the Ollama Docker image
+```bash
+docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
+```
+3. Run any LLM model, e.g. llama3
+```bash
+docker exec -it ollama ollama run llama3
+```
+4. Configure the env variable in docker compose or the backend environment.
+```env
+LLM_MODEL_CONFIG_ollama_<model_name>
+#example
+LLM_MODEL_CONFIG_ollama_llama3=${LLM_MODEL_CONFIG_ollama_llama3-llama3,
+http://host.docker.internal:11434}
+```
+5. Configure the backend API url
+```env
+VITE_BACKEND_API_URL=${VITE_BACKEND_API_URL-backendurl}
+```
+6. Open the application in the browser and select the Ollama model for extraction.
+7. Enjoy Graph Building.
 
 
 ## Usage
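
The Ollama setting in the README hunk above packs two fields into one value, `model_name,model_local_url`, and the compose-style `${NAME-default}` form falls back to the default only when the variable is unset. The following is a minimal Python sketch of how such a value could be resolved and split. The helper names `resolve_env` and `parse_ollama_config` are hypothetical and are not taken from the backend code, which may parse the value differently.

```python
import os


def resolve_env(name, default, environ=None):
    """Mimic `${NAME-default}`: use the default only when NAME is unset.

    An empty but set variable is returned as-is, matching the `-` form
    (as opposed to `:-`, which would also replace empty values).
    """
    environ = os.environ if environ is None else environ
    return environ.get(name, default)


def parse_ollama_config(value):
    """Split 'model_name,model_local_url' into (model_name, base_url).

    Hypothetical helper: assumes exactly one comma separates the fields.
    """
    model_name, sep, base_url = value.partition(",")
    model_name, base_url = model_name.strip(), base_url.strip()
    if not sep or not model_name or not base_url:
        raise ValueError(f"expected 'model_name,model_local_url', got {value!r}")
    return model_name, base_url


if __name__ == "__main__":
    raw = resolve_env(
        "LLM_MODEL_CONFIG_ollama_llama3",
        "llama3,http://host.docker.internal:11434",
    )
    print(parse_ollama_config(raw))
```

Splitting on the first comma only keeps any later commas inside the URL part, which seems the safer choice for a two-field value.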

backend/example.env

Lines changed: 7 additions & 0 deletions
@@ -2,6 +2,7 @@ OPENAI_API_KEY = ""
 DIFFBOT_API_KEY = ""
 GROQ_API_KEY = ""
 EMBEDDING_MODEL = "all-MiniLM-L6-v2"
+RAGAS_EMBEDDING_MODEL = "openai"
 IS_EMBEDDING = "true"
 KNN_MIN_SCORE = "0.94"
 # Enable Gemini (default is False) | Can be False or True
@@ -28,11 +29,17 @@ ENTITY_EMBEDDING="" True or False
 DUPLICATE_SCORE_VALUE = ""
 DUPLICATE_TEXT_DISTANCE = ""
 #examples
+LLM_MODEL_CONFIG_openai_gpt_3.5="gpt-3.5-turbo-0125,openai_api_key"
+LLM_MODEL_CONFIG_openai_gpt_4o_mini="gpt-4o-mini-2024-07-18,openai_api_key"
+LLM_MODEL_CONFIG_gemini_1.5_pro="gemini-1.5-pro-002"
+LLM_MODEL_CONFIG_gemini_1.5_flash="gemini-1.5-flash-002"
+LLM_MODEL_CONFIG_diffbot="diffbot,diffbot_api_key"
 LLM_MODEL_CONFIG_azure_ai_gpt_35="azure_deployment_name,azure_endpoint or base_url,azure_api_key,api_version"
 LLM_MODEL_CONFIG_azure_ai_gpt_4o="gpt-4o,https://YOUR-ENDPOINT.openai.azure.com/,azure_api_key,api_version"
 LLM_MODEL_CONFIG_groq_llama3_70b="model_name,base_url,groq_api_key"
 LLM_MODEL_CONFIG_anthropic_claude_3_5_sonnet="model_name,anthropic_api_key"
 LLM_MODEL_CONFIG_fireworks_llama_v3_70b="model_name,fireworks_api_key"
 LLM_MODEL_CONFIG_bedrock_claude_3_5_sonnet="model_name,aws_access_key_id,aws_secret__access_key,region_name"
 LLM_MODEL_CONFIG_ollama_llama3="model_name,model_local_url"
+YOUTUBE_TRANSCRIPT_PROXY="https://user:pass@domain:port"
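
Every value in `example.env` is a string, including flags such as `IS_EMBEDDING = "true"` and scores such as `KNN_MIN_SCORE = "0.94"`, so a consumer has to convert them. The sketch below shows one way this could be done in Python, along with extracting the host and port from a proxy URL shaped like `YOUTUBE_TRANSCRIPT_PROXY`. The helpers `env_bool`, `env_float`, and `proxy_host_port`, and the host `proxy.example.com`, are illustrative assumptions and not the backend's actual code.

```python
import os
from urllib.parse import urlsplit


def env_bool(name, default, environ=None):
    """Read a flag such as IS_EMBEDDING; unset or empty falls back to default."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes"}


def env_float(name, default, environ=None):
    """Read a numeric setting such as KNN_MIN_SCORE as a float."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return float(raw)


def proxy_host_port(url):
    """Return (hostname, port) of a proxy URL, dropping the credentials."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


if __name__ == "__main__":
    print(env_bool("IS_EMBEDDING", True))
    print(env_float("KNN_MIN_SCORE", 0.94))
    # Hypothetical proxy URL with a numeric port.
    print(proxy_host_port("https://user:pass@proxy.example.com:8080"))
```

Note that the literal placeholder `domain:port` from the example file would not parse, because `urlsplit(...).port` requires a numeric port.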

backend/requirements.txt

Lines changed: 20 additions & 18 deletions
@@ -69,22 +69,22 @@ jsonpath-python==1.0.6
 jsonpointer==2.4
 json-repair==0.25.2
 kiwisolver==1.4.5
-langchain
-langchain-aws
-langchain-anthropic
-langchain-fireworks
-langchain-google-genai
-langchain-community
-langchain-core
-langchain-experimental
-langchain-google-vertexai
-langchain-groq
-langchain-openai
-langchain-text-splitters
+langchain==0.3.0
+langchain-aws==0.2.1
+langchain-anthropic==0.2.1
+langchain-fireworks==0.2.0
+langchain-google-genai==2.0.0
+langchain-community==0.3.0
+langchain-core==0.3.5
+langchain-experimental==0.3.1
+langchain-google-vertexai==2.0.1
+langchain-groq==0.2.0
+langchain-openai==0.2.0
+langchain-text-splitters==0.3.0
 langdetect==1.0.9
-langsmith==0.1.83
+langsmith==0.1.128
 layoutparser==0.3.4
-langserve==0.2.2
+langserve==0.3.0
 #langchain-cli==0.0.25
 lxml==5.1.0
 MarkupSafe==2.1.5
@@ -100,7 +100,7 @@ numpy==1.26.4
 omegaconf==2.3.0
 onnx==1.16.1
 onnxruntime==1.18.1
-openai==1.35.10
+openai==1.47.1
 opencv-python==4.8.0.76
 orjson==3.9.15
 packaging==23.2
@@ -140,12 +140,10 @@ requests==2.32.3
 rsa==4.9
 s3transfer==0.10.1
 safetensors==0.4.1
-scipy==1.10.1
 shapely==2.0.3
 six==1.16.0
 sniffio==1.3.1
 soupsieve==2.5
-SQLAlchemy==2.0.28
 starlette==0.37.2
 sse-starlette==2.1.2
 starlette-session==0.4.3
@@ -160,7 +158,7 @@ transformers==4.42.3
 types-protobuf
 types-requests
 typing-inspect==0.9.0
-typing_extensions==4.9.0
+typing_extensions==4.12.2
 tzdata==2024.1
 unstructured==0.14.9
 unstructured-client==0.23.8
@@ -179,3 +177,7 @@ sentence-transformers==3.0.1
 google-cloud-logging==3.10.0
 PyMuPDF==1.24.5
 pypandoc==1.13
+graphdatascience==1.10
+Secweb==1.11.0
+ragas==0.1.14
+
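
The `requirements.txt` hunks above replace the bare `langchain*` entries with exact `==` pins, while entries such as `types-protobuf` and `types-requests` stay unpinned. A small Python sketch of how one could audit a requirements file for entries that are still unpinned follows. The function name `split_pins` is hypothetical, and the parser is deliberately simplified: it only handles plain `name==version` lines and comments, not extras, environment markers, or other version specifiers.

```python
def split_pins(lines):
    """Separate requirement lines into exact pins and unpinned entries.

    Returns (pinned, unpinned), where pinned maps package name to version
    and unpinned lists the remaining requirement strings.
    """
    pinned, unpinned = {}, []
    for line in lines:
        # Drop comments (including fully commented-out requirements).
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, version = line.partition("==")
        if sep:
            pinned[name.strip()] = version.strip()
        else:
            unpinned.append(line)
    return pinned, unpinned


if __name__ == "__main__":
    sample = [
        "langchain==0.3.0",
        "langsmith==0.1.128",
        "#langchain-cli==0.0.25",
        "types-protobuf",
        "types-requests",
    ]
    pinned, unpinned = split_pins(sample)
    print("pinned:", pinned)
    print("still unpinned:", unpinned)
```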
