
Commit f92361e

prakriti-solankey, praveshkumar1988, kartikpersistent, vasanthasaikalluri, and aashipandya authored
Dev (#322)
* Remove unused library and commented code
* Issue fixed
* 224 color mismatch in graph viz model (#225)
* count changes
* added legend count
* bloom url changes
* lint changes
* removal of console
---------
Co-authored-by: kartikpersistent <[email protected]>
* Modified retrieval query (#226)
* Manage file status (#227)
* manage status of processing file
* Remove progress bar from Generate Graph Document button
* 224 color mismatch in graph viz model (#225)
* count changes
* added legend count
* bloom url changes
* lint changes
* removal of console
---------
Co-authored-by: kartikpersistent <[email protected]>
* Modified retrieval query (#226)
* Convert KNN score value string to Float
---------
Co-authored-by: Prakriti Solankey <[email protected]>
Co-authored-by: kartikpersistent <[email protected]>
Co-authored-by: vasanthasaikalluri <[email protected]>
* Chatbot optimization (#230)
* Optimised and cleaned Chatbot Integration
* modified chat integration functions
* bug changes (#231)
* batch queries and relationship count correction (#232)
* batch queries and relationship count correction
* status should not be processing
* 'url_changes' (#235)
* Color mismatch in graph viz model (#233)
* count changes
* added legend count
* bloom url changes
* lint changes
* removal of console
* 'colour'
* 'color'
---------
Co-authored-by: kartikpersistent <[email protected]>
* lint fixes
* Create schema endpoint to get labels and relationtypes
* source link fixes
* Handle exception when youtube Api unable to fetch transcript youtube_transcript_api._errors.TranscriptsDisabled
* configured backend status based the ENV Variable (#246)
* configured backend status based the ENV Variable
* removed the connection status check in PROD enviournment
* Requirement split gcs and s3 icons on the page (#247)
* separated S3 and GCS
* resolved the conflicts
* Update error message in response
* dev env
* Chatbot optimization (#250)
* Optimised and cleaned Chatbot Integration
* modified chat integration functions
* Modified max_tokens and min_score
* Modified prompt and added error message
* Modified Prompt and error message
* 245 bug chatbot UI (#252)
* fixed chatbot aspect ratio/width issue
* fixed chat bot ui issue
* 'hoverchanges' (#254)
* added settings panel for relationship type and node label selection (#234)
* added settings panel for relationship type and node label selection
* added checkbox for fetching existing scehma
* integrated /schema api
* added dependency in the useCallback
* usercredentials payload fix
* Accept param in Extract API to filter graph to allowedNode and allowedRealationship
* CHange param type in extract
* Issue fixed
* integrated extract api
* updated string as list for allowednodes and allowedrelations
* removed button on settings
* format fixes
* Added baseEntityLabel as True
---------
Co-authored-by: Pravesh Kumar <[email protected]>
Co-authored-by: aashipandya <[email protected]>
* Handle File status for long time (#256)
* format fixes
* fixed failed status bug
* Fixed list.split issue in allowed nodes
* Issue fixed
* Updated check of empty allowed nodes and allowed relations list (#258)
* added settings panel for relationship type and node label selection
* added checkbox for fetching existing scehma
* integrated /schema api
* added dependency in the useCallback
* usercredentials payload fix
* Accept param in Extract API to filter graph to allowedNode and allowedRealationship
* CHange param type in extract
* Issue fixed
* integrated extract api
* updated string as list for allowednodes and allowedrelations
* check for empty list of nodes and relations
---------
Co-authored-by: kartikpersistent <[email protected]>
Co-authored-by: Pravesh Kumar <[email protected]>
* Removed wrong commit
* Updated condition for allowed nodes relations (#265)
* added settings panel for relationship type and node label selection
* added checkbox for fetching existing scehma
* integrated /schema api
* added dependency in the useCallback
* usercredentials payload fix
* Accept param in Extract API to filter graph to allowedNode and allowedRealationship
* CHange param type in extract
* Issue fixed
* integrated extract api
* updated string as list for allowednodes and allowedrelations
* check for empty list of nodes and relations
* condition updated
* removed frontend changes
---------
Co-authored-by: kartikpersistent <[email protected]>
Co-authored-by: Pravesh Kumar <[email protected]>
* changed the checkbox to button (#266)
* Adding link to Aura on the connection modal (#263)
* Remove node id title changes (#264)
* Remove the id and type changes from the nodes as that makes them incompatible with the relationships
* common function for saving nodes and relations to graph
---------
Co-authored-by: aashipandya <[email protected]>
* fixed the legend container height issue (#267)
* added supported files description (#268)
* fixed legends gap issue
* format fixes
* parameter should be none not str (#269)
* Chatbot latency optimization (#270)
* Added graph Object and Modified Retrieval query
* Added Database parameter to API
* Modified Database parameter
* added connect in place of submit ,added connect to neo4j aura in place of connect to neo4j (#271)
* added connect in place of submit added connect to neo4j aura inplace of connect to neo4j
* added open graph with bloom
* removed the Aura as it can connect with any neo4j db
* label colour fix (#273)
* removed default Person and Works AT for allowed nodes and relationship types
* changed the Wikipedia input label
* removed unused constants
* wikipedia whitespaces fix
* wikipedia url and youtube white spaces error (#280)
* urgent fix (#281)
* Info in the chat response (#282)
* Added graph Object and Modified Retrieval query
* Added Database parameter to API
* Modified Database parameter
* Added info parameter to output
* reestablished the sse on page refresh to sync the processing status (#285)
* UI bugs/features (#284)
* disabled the use existing schema on no node labels
* added docs Icon
* decreased the alert window in the success scenario
* added trim for inputs for white space handling in the youtube wikipedia gcs
* Time estimation alert for large files (#287)
* reestablished the sse on page refresh to sync the processing status
* added the time estimation message for large files
* showing alert only once
* delete api for removing documents (#290)
* Show connection uri (#291)
* added Connection URI
* UI updated
* removed duplicate useEffect
* Backend queries (#257)
* created backend queries for graph
* Modified username parameter
* Added GET request
* Modified exceptions
* 'frontendHandling'
* removed session id parameter
* doc_limit
* 'type_changes'
* 'nameChanges'
* graph viz ui
* legend renamed
* renamed
* removed import
* removed duplicate useEffect
---------
Co-authored-by: kartikpersistent <[email protected]>
Co-authored-by: Prakriti Solankey <[email protected]>
Co-authored-by: Prakriti Solankey <[email protected]>
* Delete list of documents from db (#293)
* delete api for removing documents
* Added list of documents for deletion
* Update exception to track Json_payload
* Delete api (#296)
* delete api for removing documents
* Added list of documents for deletion
* added delete functionality
---------
Co-authored-by: aashipandya <[email protected]>
* Delete api (#298)
* delete api for removing documents
* Added list of documents for deletion
* added delete functionality
* changed the message and disabled the delete files if there is no selected files
* format fixes
---------
Co-authored-by: aashipandya <[email protected]>
* removed duplicate variables
* css change
* upgraded the nvl package
* removed duplicate delete button
* closing the event source on failed condition
* nvl issue 261 - private package (#299)
* Fix issue #261 #261
* Fix issue #261 #261
---------
Co-authored-by: kartikpersistent <[email protected]>
* Delete with entities switch (#300)
* added delete entities switch
* added the hover message on checkboxes
* changed query for deletion of files
* changed the font size the confimation message
---------
Co-authored-by: aashipandya <[email protected]>
* docker changes
* disabled the checkbox when File status is uploading or processing
* Added Cloud logging library for strucred logs
* replaced switch with checkbox
* removed unused imports
* spell mistake
* removed the cancel button on delete popup modal
* bug_fix_labels_mismatch_count
* deletion scenarios
* fixed / trailing bug in s3 bucket url
* Switch frontend port in docker-compose to 8080 to match with the frontend Dockerfile (#305)
* Add in Each api google log struct
* Implemented polling for status update (#309)
* Implemented polling for status update
* status updation for large files
* added example env in the frontend
* updated the readme with frontend env info
* readme changes
* readme updates
* setting up failed status
* Chatbot info icon (#297)
* Added Info to the chat response
* UI changes
* Modified chat response
* added entities to response info
* modified entities in response info
* Modified entities response count in info
* clearhistory
* chatbot
* typeCheck
* state management
* chatbot-ui-overflow
* css_changes
---------
Co-authored-by: vasanthasaikalluri <[email protected]>
* ellipsis
* dockerfile
* Failed status update fix (#315)
* removed Failed status update on failure of servers side event
* Update .gitignore
* url spell fix
* Msenechal/issue295 (#314)
* Removed triton package from requirements.txt
* Fixed Google Cloud logging + some docker ENV overwritten
* Removed ENV print logs
* delete local file in case processing failed (#316)
* table-css
---------
Co-authored-by: Pravesh Kumar <[email protected]>
Co-authored-by: kartikpersistent <[email protected]>
Co-authored-by: vasanthasaikalluri <[email protected]>
Co-authored-by: aashipandya <[email protected]>
Co-authored-by: Morgan Senechal <[email protected]>
Co-authored-by: Michael Hunger <[email protected]>
1 parent 6124099 commit f92361e

26 files changed: +432 −128 lines changed

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -164,4 +164,4 @@ google-cloud-sdk
 google-cloud-cli-469.0.0-linux-x86_64.tar.gz
 /data/llm-experiments-387609-c73d512ca3b1.json
 /backend/src/merged_files
-/backend/src/chunks
+/backend/src/chunks

README.md

Lines changed: 13 additions & 0 deletions
@@ -5,6 +5,19 @@ Files can be uploaded from local machine or S3 bucket and then LLM model can be
 
 ### Getting started
 
+:warning:
+For the backend, if you want to run the LLM KG Builder locally, and don't need the GCP/VertexAI integration, make sure to have the following set in your ENV file :
+
+```env
+GEMINI_ENABLED = False
+GCP_LOG_METRICS_ENABLED = False
+```
+
+And for the frontend, make sure to export your local backend URL before running docker-compose by having the BACKEND_API_URL set in your ENV file :
+```env
+BACKEND_API_URL="http://localhost:8000"
+```
+
 1. Run Docker Compose to build and start all components:
 ```bash
 docker-compose up --build

backend/example.env

Lines changed: 5 additions & 1 deletion
@@ -14,4 +14,8 @@ LANGCHAIN_PROJECT = ""
 LANGCHAIN_TRACING_V2 = ""
 LANGCHAIN_ENDPOINT = ""
 NUMBER_OF_CHUNKS_TO_COMBINE = ""
-# NUMBER_OF_CHUNKS_ALLOWED = ""
+# NUMBER_OF_CHUNKS_ALLOWED = ""
+# Enable Gemini (default is True)
+GEMINI_ENABLED = True|False
+# Enable Google Cloud logs (default is True)
+GCP_LOG_METRICS_ENABLED = True|False
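
Both new flags are plain strings in the env file; the backend coerces them to booleans at startup. A minimal sketch of that parsing, mirroring the expressions in backend/score.py and backend/src/logger.py below:

```python
import os

# Truthy spellings the backend accepts: "true", "1", "yes" (case-insensitive).
# Anything else, including an unedited "True|False" placeholder, disables the flag.
is_gemini_enabled = os.environ.get("GEMINI_ENABLED", "True").lower() in ("true", "1", "yes")
is_gcp_log_enabled = os.environ.get("GCP_LOG_METRICS_ENABLED", "True").lower() in ("true", "1", "yes")
```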

backend/requirements.txt

Lines changed: 0 additions & 1 deletion
@@ -147,7 +147,6 @@ timm==0.9.12
 tokenizers==0.15.2
 tqdm==4.66.2
 transformers==4.37.1
-triton==2.2.0
 types-protobuf
 types-requests
 typing-inspect==0.9.0

backend/score.py

Lines changed: 26 additions & 7 deletions
@@ -20,10 +20,11 @@
 import json
 from typing import List
 from google.cloud import logging as gclogger
+from src.logger import CustomLogger
 
-logging_client = gclogger.Client()
-logger_name = "llm_experiments_metrics" # Saved in the google cloud logs
-logger = logging_client.logger(logger_name)
+logger = CustomLogger()
+CHUNK_DIR = os.path.join(os.path.dirname(__file__), "chunks")
+MERGED_DIR = os.path.join(os.path.dirname(__file__), "merged_files")
 
 def healthy_condition():
     output = {"healthy": True}
@@ -45,7 +46,9 @@ def sick():
     allow_headers=["*"],
 )
 
-add_routes(app,ChatVertexAI(), path="/vertexai")
+is_gemini_enabled = os.environ.get("GEMINI_ENABLED", "True").lower() in ("true", "1", "yes")
+if is_gemini_enabled:
+    add_routes(app,ChatVertexAI(), path="/vertexai")
 
 app.add_api_route("/health", health([healthy_condition, healthy]))
 
@@ -135,8 +138,10 @@ async def extract_knowledge_graph_from_file(
         graph = create_graph_database_connection(uri, userName, password, database)
         graphDb_data_Access = graphDBdataAccess(graph)
         if source_type == 'local file':
+            merged_file_path = os.path.join(MERGED_DIR,file_name)
+            logging.info(f'File path:{merged_file_path}')
             result = await asyncio.to_thread(
-                extract_graph_from_file_local_file, graph, model, file_name, allowedNodes, allowedRelationship)
+                extract_graph_from_file_local_file, graph, model, file_name, merged_file_path, allowedNodes, allowedRelationship)
 
         elif source_type == 's3 bucket' and source_url:
             result = await asyncio.to_thread(
@@ -160,9 +165,11 @@
         logger.log_struct(result)
         return create_api_response('Success', data=result, file_source= source_type)
     except Exception as e:
-        message=f" Failed To Process File:{file_name} or LLM Unable To Parse Content"
+        message=f"Failed To Process File:{file_name} or LLM Unable To Parse Content "
         error_message = str(e)
         graphDb_data_Access.update_exception_db(file_name,error_message)
+        if source_type == 'local file':
+            delete_uploaded_local_file(merged_file_path, file_name)
         josn_obj = {'message':message,'error_message':error_message, 'file_name': file_name,'status':'Failed','db_url':uri,'failed_count':1, 'source_type': source_type}
         logger.log_struct(josn_obj)
         logging.exception(f'File Failed in extraction: {josn_obj}')
@@ -253,6 +260,18 @@ async def graph_query(
         logging.exception(f'Exception in graph query: {error_message}')
         return create_api_response(job_status, message=message, error=error_message)
 
+@app.post("/clear_chat_bot")
+async def clear_chat_bot(uri=Form(None),userName=Form(None), password=Form(None), database=Form(None), session_id=Form(None)):
+    try:
+        graph = create_graph_database_connection(uri, userName, password, database)
+        result = await asyncio.to_thread(clear_chat_history,graph=graph,session_id=session_id)
+        return create_api_response('Success',data=result)
+    except Exception as e:
+        job_status = "Failed"
+        message="Unable to clear chat History"
+        error_message = str(e)
+        logging.exception(f'Exception in chat bot:{error_message}')
+        return create_api_response(job_status, message=message, error=error_message)
 @app.post("/connect")
 async def connect(uri=Form(None), userName=Form(None), password=Form(None), database=Form(None)):
     try:
@@ -274,7 +293,7 @@ async def upload_large_file_into_chunks(file:UploadFile = File(...), chunkNumber
                                         password=Form(None), database=Form(None)):
     try:
         graph = create_graph_database_connection(uri, userName, password, database)
-        result = await asyncio.to_thread(upload_file, graph, model, file, chunkNumber, totalChunks, originalname)
+        result = await asyncio.to_thread(upload_file, graph, model, file, chunkNumber, totalChunks, originalname, CHUNK_DIR, MERGED_DIR)
         josn_obj = {'api_name':'upload','db_url':uri}
         logger.log_struct(josn_obj)
         return create_api_response('Success', message=result)
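
The new /clear_chat_bot route accepts the same form-encoded connection fields as the other endpoints, plus a session_id. A minimal client sketch; the localhost URL and connection values are placeholders, not part of this commit:

```python
import requests

# Placeholder connection details; substitute your own Neo4j credentials.
form = {
    "uri": "neo4j+s://<instance>.databases.neo4j.io",
    "userName": "neo4j",
    "password": "<password>",
    "database": "neo4j",
    "session_id": "<chat-session-id>",
}

# The endpoint declares Form(...) parameters, so send form data, not JSON.
resp = requests.post("http://localhost:8000/clear_chat_bot", data=form)
print(resp.json())  # expected to echo the cleared-session payload on success
```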

backend/src/QA_integration.py

Lines changed: 57 additions & 15 deletions
@@ -105,21 +105,35 @@ def get_llm(model: str,max_tokens=1000) -> Any:
             )
         else:
             llm = ChatOpenAI(model=model_version, temperature=0,max_tokens=max_tokens)
-        return llm
+
+        return llm,model_version
 
     else:
         logging.error(f"Unsupported model: {model}")
-        return None
+        return None,None
 
 def vector_embed_results(qa,question):
     vector_res={}
     try:
         result = qa({"query": question})
         vector_res['result']=result.get("result")
-        list_source_docs=[]
-        for i in result["source_documents"]:
-            list_source_docs.append(i.metadata['source'])
-        vector_res['source']=list_source_docs
+
+        sources = set()
+        entities = set()
+        for document in result["source_documents"]:
+            sources.add(document.metadata["source"])
+            for entiti in document.metadata["entities"]:
+                entities.add(entiti)
+        vector_res['source']=list(sources)
+        vector_res['entities'] = list(entities)
+        if len( vector_res['entities']) > 5:
+            vector_res['entities'] = vector_res['entities'][:5]
+
+        # list_source_docs=[]
+        # for i in result["source_documents"]:
+        #     list_source_docs.append(i.metadata['source'])
+        # vector_res['source']=list_source_docs
+
         # result = qa({"question":question},return_only_outputs=True)
         # vector_res['result'] = result.get("answer")
         # vector_res["source"] = result.get("sources")
@@ -145,6 +159,7 @@ def save_chat_history(history,user_message,ai_message):
 
 def get_chat_history(llm, history):
     """Retrieves and summarizes the chat history for a given session."""
+
     try:
         # history = Neo4jChatMessageHistory(
         #     graph=graph,
@@ -170,6 +185,26 @@
         logging.exception(f"Exception in retrieving chat history: {e}")
         return ""
 
+def clear_chat_history(graph, session_id):
+
+    try:
+        logging.info(f"Clearing chat history for session ID: {session_id}")
+        history = Neo4jChatMessageHistory(
+            graph=graph,
+            session_id=session_id
+        )
+        history.clear()
+        logging.info("Chat history cleared successfully")
+
+        return {
+            "session_id": session_id,
+            "message": "The chat history is cleared",
+            "user": "chatbot"
+        }
+    except Exception as e:
+        logging.exception(f"Error occurred while clearing chat history for session ID {session_id}: {e}")
+
+
 def extract_and_remove_source(message):
     pattern = r'\[Source: ([^\]]+)\]'
     match = re.search(pattern, message)
@@ -206,6 +241,7 @@ def QA_RAG(graph,model,question,session_id):
     try:
         qa_rag_start_time = time.time()
 
+
         start_time = time.time()
         neo_db = Neo4jVector.from_existing_index(
             embedding=EMBEDDING_FUNCTION,
@@ -219,7 +255,8 @@
             session_id=session_id
         )
 
-        llm = get_llm(model=model,max_tokens=CHAT_MAX_TOKENS)
+        llm,model_version = get_llm(model=model,max_tokens=CHAT_MAX_TOKENS)
+
         qa = RetrievalQA.from_chain_type(
             llm=llm,
             chain_type="stuff",
@@ -278,20 +315,25 @@
         return {
             "session_id": session_id,
             "message": message,
-            "sources": sources,
-            "info": f"""Metadata :
-            RETRIEVAL_QUERY : {RETRIEVAL_QUERY}""",
+            "info": {
+                "sources": sources,
+                "model":model_version,
+                "entities":vector_res["entities"]
+            },
             "user": "chatbot"
         }
 
     except Exception as e:
         logging.exception(f"Exception in QA component at {datetime.now()}: {str(e)}")
         error_name = type(e).__name__
-        return {"session_id": session_id,
-                "message": "Something went wrong",
-                "sources": [],
-                "info": f"Caught an exception {error_name} :- {str(e)}",
-                "user": "chatbot"}
+        return {
+            "session_id": session_id,
+            "message": "Something went wrong",
+            "info": {
+                "sources": [],
+                "error": f"{error_name} :- {str(e)}"
+            },
+            "user": "chatbot"}
 
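With this change the success and failure payloads share one shape: metadata moves from the top-level sources/info fields into a nested info object. A sketch of what a client now sees, with illustrative field values:

```python
# Illustrative success payload from QA_RAG after this change.
response = {
    "session_id": "abc-123",
    "message": "<answer text>",
    "info": {
        "sources": ["<document source>"],       # de-duplicated via the sources set
        "model": "<model_version from get_llm>",
        "entities": ["<up to five entities>"],  # capped at 5 in vector_embed_results
    },
    "user": "chatbot",
}

# Clients that previously read response["sources"] must now read:
sources = response["info"]["sources"]
```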
backend/src/logger.py

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+import os
+from google.cloud import logging as gclogger
+
+class CustomLogger:
+    def __init__(self):
+        self.is_gcp_log_enabled = os.environ.get("GCP_LOG_METRICS_ENABLED", "True").lower() in ("true", "1", "yes")
+        if self.is_gcp_log_enabled:
+            self.logging_client = gclogger.Client()
+            self.logger_name = "llm_experiments_metrics"
+            self.logger = self.logging_client.logger(self.logger_name)
+        else:
+            self.logger = None
+
+    def log_struct(self, message):
+        if self.is_gcp_log_enabled:
+            self.logger.log_struct(message)
+        else:
+            print(message)
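
CustomLogger keeps the call sites in score.py unchanged: log_struct() forwards to Cloud Logging when GCP_LOG_METRICS_ENABLED is truthy and falls back to print() otherwise. A usage sketch; the payload mirrors the josn_obj dicts in score.py, and the db_url value is a placeholder:

```python
from src.logger import CustomLogger

logger = CustomLogger()

# Written to the "llm_experiments_metrics" log in Google Cloud when enabled,
# printed to stdout when GCP_LOG_METRICS_ENABLED is false.
logger.log_struct({"api_name": "upload", "db_url": "bolt://localhost:7687"})
```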

backend/src/main.py

Lines changed: 14 additions & 24 deletions
@@ -20,7 +20,6 @@
 import sys
 import shutil
 warnings.filterwarnings("ignore")
-from pathlib import Path
 load_dotenv()
 logging.basicConfig(format='%(asctime)s - %(message)s',level='INFO')
 
@@ -146,13 +145,11 @@ def create_source_node_graph_url_wikipedia(graph, model, wiki_query, source_type
         lst_file_name.append({'fileName':obj_source_node.file_name,'fileSize':obj_source_node.file_size,'url':obj_source_node.url, 'status':'Failed'})
     return lst_file_name,success_count,failed_count
 
-def extract_graph_from_file_local_file(graph, model, fileName, allowedNodes, allowedRelationship):
+def extract_graph_from_file_local_file(graph, model, fileName, merged_file_path, allowedNodes, allowedRelationship):
 
     logging.info(f'Process file name :{fileName}')
-    merged_file_path = os.path.join(os.path.join(os.path.dirname(__file__), "merged_files"),fileName)
-    logging.info(f'File path:{merged_file_path}')
     file_name, pages = get_documents_from_file_by_path(merged_file_path,fileName)
-
+
     if pages==None or len(pages)==0:
         raise Exception(f'Pdf content is not available for file : {file_name}')
 
@@ -294,15 +291,10 @@ def processing_source(graph, model, file_name, pages, allowedNodes, allowedRelat
     logging.info('Updated the nodeCount and relCount properties in Docuemnt node')
     logging.info(f'file:{file_name} extraction has been completed')
 
+
+    # merged_file_path have value only when file uploaded from local
     if merged_file_path is not None:
-        file_path = Path(merged_file_path)
-        if file_path.exists():
-            file_path.unlink()
-            logging.info(f'file {file_name} delete successfully')
-        else:
-            logging.info(f'file {file_name} does not exist')
-    else:
-        logging.info(f'File Path is None i.e. source type other than local file')
+        delete_uploaded_local_file(merged_file_path, file_name)
 
     return {
         "fileName": file_name,
@@ -355,27 +347,25 @@ def connection_check(graph):
     graph_DB_dataAccess = graphDBdataAccess(graph)
     return graph_DB_dataAccess.connection_check()
 
-def merge_chunks(file_name, total_chunks):
-
-    chunk_dir = os.path.join(os.path.dirname(__file__), "chunks")
-    merged_file_path = os.path.join(os.path.dirname(__file__), "merged_files")
+def merge_chunks(file_name, total_chunks, chunk_dir, merged_dir):
 
-    if not os.path.exists(merged_file_path):
-        os.mkdir(merged_file_path)
+    if not os.path.exists(merged_dir):
+        os.mkdir(merged_dir)
 
-    with open(os.path.join(merged_file_path, file_name), "wb") as write_stream:
+    with open(os.path.join(merged_dir, file_name), "wb") as write_stream:
         for i in range(1,total_chunks+1):
             chunk_file_path = os.path.join(chunk_dir, f"{file_name}_part_{i}")
             with open(chunk_file_path, "rb") as chunk_file:
                 shutil.copyfileobj(chunk_file, write_stream)
             os.unlink(chunk_file_path)  # Delete the individual chunk file after merging
     logging.info("Chunks merged successfully and return file size")
-    file_size = os.path.getsize(os.path.join(merged_file_path, file_name))
+    file_size = os.path.getsize(os.path.join(merged_dir, file_name))
     return file_size
 
 
-def upload_file(graph, model, chunk, chunk_number:int, total_chunks:int, originalname):
-    chunk_dir = os.path.join(os.path.dirname(__file__), "chunks") # Directory to save chunks
+
+def upload_file(graph, model, chunk, chunk_number:int, total_chunks:int, originalname, chunk_dir, merged_dir):
+    # chunk_dir = os.path.join(os.path.dirname(__file__), "chunks") # Directory to save chunks
     if not os.path.exists(chunk_dir):
         os.mkdir(chunk_dir)
 
@@ -387,7 +377,7 @@ def upload_file(graph, model, chunk, chunk_number:int, total_chunks:int, origina
 
     if int(chunk_number) == int(total_chunks):
         # If this is the last chunk, merge all chunks into a single file
-        file_size = merge_chunks(originalname, int(total_chunks))
+        file_size = merge_chunks(originalname, int(total_chunks), chunk_dir, merged_dir)
         logging.info("File merged successfully")
 
         obj_source_node = sourceNode()
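
merge_chunks() and upload_file() no longer compute their own directories; score.py now passes CHUNK_DIR and MERGED_DIR in. The wire protocol for chunked upload is unchanged: the client posts each part with chunkNumber, totalChunks, and originalname, and the final part triggers the merge. A client-side sketch; the /upload route path and the 1 MiB part size are assumptions for illustration only:

```python
import os
import requests

CHUNK_SIZE = 1024 * 1024  # assumed part size; the real frontend may use a different one

def upload_in_chunks(path, url="http://localhost:8000/upload", **conn_fields):
    """Post a file part by part; the last part makes the backend merge them."""
    name = os.path.basename(path)
    total = -(-os.path.getsize(path) // CHUNK_SIZE)  # ceiling division
    with open(path, "rb") as f:
        for number in range(1, total + 1):
            part = f.read(CHUNK_SIZE)
            requests.post(
                url,
                files={"file": (name, part)},
                # conn_fields carries the usual form fields: uri, userName,
                # password, database, model.
                data={"chunkNumber": number, "totalChunks": total,
                      "originalname": name, **conn_fields},
            )
```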

backend/src/shared/common_fn.py

Lines changed: 7 additions & 1 deletion
@@ -9,6 +9,7 @@
 from typing import List
 import re
 import os
+from pathlib import Path
 
 def check_url_source(source_type, yt_url:str=None, queries_list:List[str]=None):
     try:
@@ -84,4 +85,9 @@ def load_embedding_model(embedding_model_name: str):
 def save_graphDocuments_in_neo4j(graph:Neo4jGraph, graph_document_list:List[GraphDocument]):
     # graph.add_graph_documents(graph_document_list, baseEntityLabel=True)
     graph.add_graph_documents(graph_document_list)
-
+
+def delete_uploaded_local_file(merged_file_path, file_name):
+    file_path = Path(merged_file_path)
+    if file_path.exists():
+        file_path.unlink()
+        logging.info(f'file {file_name} deleted successfully')
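
delete_uploaded_local_file() is the shared cleanup used by both the success path (processing_source in main.py) and the new failure path in score.py; because it checks Path.exists() first, calling it for an already-removed file is a no-op. For illustration, with a hypothetical path:

```python
from src.shared.common_fn import delete_uploaded_local_file

# Hypothetical merged-file location; real callers pass os.path.join(MERGED_DIR, file_name).
delete_uploaded_local_file("/app/src/merged_files/report.pdf", "report.pdf")
# A second call is harmless: the path no longer exists, so nothing is unlinked.
```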

docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ services:
       - ./frontend:/app
       - /app/node_modules
     environment:
-      - BACKEND_API_URL=${BACKEND_API_URL-}
+      - BACKEND_API_URL=${BACKEND_API_URL}
      - BLOOM_URL=${BLOOM_URL}
      - REACT_APP_SOURCES=${REACT_APP_SOURCES}
    container_name: frontend
