oracle-devrel
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/README.md
Lines changed: 3 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/README.md
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files/LangChainRAG.py
Lines changed: 21 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files/LangChainRAG.py
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files/README.md
Lines changed: 13 additions & 16 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files/README.md
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files/langChainRagWithUI.py
Lines changed: 42 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai/files/langChainRagWithUI.py
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/README.md
Lines changed: 24 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/README.md
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/README.md
Lines changed: 28 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/README.md
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/docSummarize.png
107 KB b/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/docSummarize.png
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/ocidocumentSummarizeUpload.py
Lines changed: 166 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/ocidocumentSummarizeUpload.py
diff --git a/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/requirements.txt
Lines changed: 6 additions & 0 deletions b/ai-and-app-modernisation/ai-services/generative-ai-service/summarize-genai/files/requirements.txt
diff --git a/ai-and-app-modernisation/app-integration-and-automation/shared-assets/README.md
Lines changed: 4 additions & 0 deletions b/ai-and-app-modernisation/app-integration-and-automation/shared-assets/README.md
@@ -11,6 +11,9 @@ Reviewed: 30.01.2024
 
 # Team Publications
 
+- [Enable a Low Code Modular LLM App Engine using Oracle Integration and OCI Generative AI](https://docs.oracle.com/en/solutions/oci-generative-ai-integration/index.html)
+    - This reference architecture covers the considerations and recommendations for enabling an AI-based, modular, event-driven LLM App Engine using a low-code approach, with Oracle Integration as the LLM orchestrator alongside OCI Generative AI and other OCI services
+    - Build enterprise-grade, modular, scalable, secure & maintainable LLM Apps
 - [Oracle Generative AI webinar](https://go.oracle.com/LP=138234?elqCampaignId=489428&src1=:so:ch:or:dg::::&SC=:so:ch:or:dg::::&pcode=WWMK230822P00010)
     - Deep dive into Oracle Generative AI platform
 - [Creating a RAG (Retrieval-Augmented Generation) with Oracle Generative AI Service in just 21 lines of code](https://github.com/oracle-devrel/technology-engineering/tree/main/ai-and-app-modernisation/ai-services/generative-ai-service/rag-genai)
 
@@ -0,0 +1,21 @@
+from langchain_community.embeddings import OCIGenAIEmbeddings
+from langchain.chains import RetrievalQA
+from langchain_community.vectorstores import Qdrant
+from langchain_core.prompts import PromptTemplate
+from langchain_community.llms import OCIGenAI
+from langchain_community.document_loaders import UnstructuredURLLoader
+compartment_id = "ocid1.compartment.oc1..aaaaaaaa7ggqkd4ptkeb7ugk6ipsl3gqjofhkr6yacluwj4fitf2ufrdm65q"
+embeddings = OCIGenAIEmbeddings(model_id="cohere.embed-english-light-v3.0",service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",compartment_id= compartment_id,)
+testurls = ['https://docs.oracle.com/iaas/odsaz/odsa-rotate-wallet.html', 'https://docs.oracle.com/iaas/odsaz/odsa-change-password.html', 'https://docs.oracle.com/iaas/odsaz/odsa-database-actions.html']
+loader = UnstructuredURLLoader(urls=testurls)
+docs = loader.load()
+vectorstore = Qdrant.from_documents(docs,embeddings,location=":memory:",prefer_grpc=False,collection_name="test_db")
+retriever = vectorstore.as_retriever()
+rag_prompt_template = """Answer the question based only on the following context:
+{context}
+Question: {question}"""
+rag_prompt = PromptTemplate.from_template(rag_prompt_template)
+llm = OCIGenAI(model_id="cohere.command",service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",compartment_id= compartment_id,model_kwargs={"temperature": 0, "max_tokens": 300})
+rag = RetrievalQA.from_chain_type(llm=llm,retriever=retriever,chain_type_kwargs={"prompt": rag_prompt,},)
+data = rag.invoke("What is rotate a wallet")
+print(data['result'])
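
The prompt template in this file has two placeholders, `{context}` and `{question}`. As a rough sketch of what a "stuff"-style retrieval chain does with them, the plain-Python snippet below joins retrieved chunks and fills the template. The `fill_rag_prompt` helper, the placeholder chunks, and the blank-line separator are illustrative assumptions, not LangChain internals.

```python
RAG_PROMPT_TEMPLATE = """Answer the question based only on the following context:
{context}
Question: {question}"""


def fill_rag_prompt(chunks, question, separator="\n\n"):
    """Join retrieved text chunks and substitute them into the template.

    Hypothetical helper: the separator is an assumption for illustration.
    """
    context = separator.join(chunks)
    return RAG_PROMPT_TEMPLATE.format(context=context, question=question)


# Placeholder chunks; a real retriever would return these from the vector store.
sample_chunks = ["First retrieved passage.", "Second retrieved passage."]
prompt = fill_rag_prompt(sample_chunks, "How do I rotate a wallet?")
print(prompt)
```

Printing the filled prompt this way can help when debugging why the model answers from the wrong passage.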
@@ -3,30 +3,27 @@
 ## Introduction
 In this article, we'll explore how to build a Retrieval-Augmented Generation (RAG) application using Oracle Generative AI Service, LlamaIndex, the Qdrant vector database, and SentenceTransformerEmbeddings. These 21 lines of code let you scrape web pages, index them with LlamaIndex, answer questions with Oracle Generative AI Service, and store the vectors in Qdrant.
 
+Below is the code for building a RAG application using LlamaIndex with Oracle Generative AI Service.
+Also see LangChainRAG.py, which implements RAG using LangChain, and langChainRagWithUI.py, which adds a UI built with Streamlit.
+
 <img src="./RagArchitecture.svg">
 </img>
 
-## Limited Availability
-
-Oracle Generative AI Service is in Limited Availability as of today when we are creating this repo.
-
-Customers can easily enter in the LA programs. To test these functionalities you need to enrol in the LA programs and install the proper versions of software libraries.
-
-Code and functionalities can change, as a result of changes and new features
-
 ## Prerequisites
 
 Before getting started, make sure you have the following installed:
 
 - Oracle Generative AI Service
-- llama index
-- qdrant client
+- Llama index
+- Langchain
+- Qdrant client
 - SentenceTransformerEmbeddings
 
 ## Setting up the Environment
 1. Install the required packages:
    ```bash
-   pip install oci==2.118.1+preview.1.1697 llama-index qdrant-client sentence-transformers
+   pip install -U langchain oci
+   pip install langchain llama-index qdrant-client sentence-transformers transformers
    ```
 
 ## Loading data
@@ -41,26 +38,26 @@ sitemap used : https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/frpj5kvxry
 ## Entire code
 
    ```python
-   from genai_langchain_integration.langchain_oci import OCIGenAI
 from llama_index import VectorStoreIndex
 from llama_index import ServiceContext
 from llama_index.vector_stores.qdrant import QdrantVectorStore
 from llama_index.storage.storage_context import StorageContext
 from qdrant_client import qdrant_client
-from langchain.embeddings import SentenceTransformerEmbeddings
+from langchain_community.embeddings import SentenceTransformerEmbeddings
 from llama_hub.web.sitemap import SitemapReader
+from langchain_community.llms import OCIGenAI
 loader = SitemapReader()
-documents = loader.load_data(sitemap_url='https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/frpj5kvxryk1/b/thisIsThePlace/o/combined.xml')
+documents = loader.load_data(sitemap_url='https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/frpj5kvxryk1/b/thisIsThePlace/o/latest.xml')
 client = qdrant_client.QdrantClient(location=":memory:")
 embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")
-llm = OCIGenAI(model_id="cohere.command",service_endpoint="https://generativeai.aiservice.us-chicago-1.oci.oraclecloud.com",compartment_id = "ocid1.tenancy.oc1..aaaaaaaa5hwtrus75rauufcfvtnjnz3mc4xm2bzibbigva2bw4ne7ezkvzha",temperature=0.0)
+llm = OCIGenAI(model_id="cohere.command",service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",model_kwargs={"temperature": 0.0, "max_tokens": 300},compartment_id = "ocid1.compartment.oc1..aaaaaaaa7ggqkd4ptkeb7ugk6ipsl3gqjofhkr6yacluwj4fitf2ufrdm65q")
 system_prompt="As a support engineer, your role is to leverage the information in the context provided. Your task is to respond to queries based strictly on the information available in the provided context. Do not create new information under any circumstances. Refrain from repeating yourself. Extract your response solely from the context mentioned above. If the context does not contain relevant information for the question, respond with 'How can I assist you with questions related to the document?'"
 service_context = ServiceContext.from_defaults(llm=llm, chunk_size=1000, chunk_overlap=100, embed_model=embeddings,system_prompt=system_prompt)
 vector_store = QdrantVectorStore(client=client, collection_name="ansh")
 storage_context = StorageContext.from_defaults(vector_store=vector_store)
 index = VectorStoreIndex.from_documents(documents, storage_context=storage_context, service_context=service_context)
 query_engine = index.as_query_engine()
-response = query_engine.query("can i use OCI document understanding for files in french ?")
+response = query_engine.query('What is activity auditing report?')
 print(response)
    ```
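
Conceptually, the in-memory Qdrant collection above stores one vector per chunk and returns the chunks closest to the query vector. The snippet below is a toy sketch of that lookup using cosine similarity, a common choice; the metric actually configured for the collection is not shown in the source. The `top_k` helper and the 3-dimensional vectors are made up for illustration and stand in for real embeddings and for Qdrant's index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored vectors most similar to the query."""
    scored = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in scored[:k]]


# Hypothetical (text, vector) pairs standing in for embedded document chunks.
store = [
    ("chunk about wallets", [1.0, 0.0, 0.0]),
    ("chunk about passwords", [0.0, 1.0, 0.0]),
    ("chunk about database actions", [0.0, 0.0, 1.0]),
]
results = top_k([0.9, 0.1, 0.0], store, k=2)
print(results)
```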
 
 
@@ -0,0 +1,42 @@
+import streamlit as st
+from langchain_community.embeddings import OCIGenAIEmbeddings
+from langchain.chains import RetrievalQA
+from langchain_community.vectorstores import Qdrant
+from langchain_core.prompts import PromptTemplate
+from langchain_community.llms import OCIGenAI
+from langchain_community.document_loaders import UnstructuredURLLoader
+st.title("Oracle QA Chatbot")
+st.text_input("Ask a question:", key="question")  # Input field for questions
+# Data loading (outside any function)
+compartment_id = "ocid1.compartment.oc1..aaaaaaaa7ggqkd4ptkeb7ugk6ipsl3gqjofhkr6yacluwj4fitf2ufrdm65q"
+embeddings = OCIGenAIEmbeddings(model_id="cohere.embed-english-light-v3.0",service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",compartment_id=compartment_id,)
+testurls = ['https://docs.oracle.com/iaas/odsaz/odsa-rotate-wallet.html','https://docs.oracle.com/iaas/odsaz/odsa-change-password.html','https://docs.oracle.com/iaas/odsaz/odsa-database-actions.html',]
+# Cache the loaded documents (outside any function)
+@st.cache_data
+def load_documents():
+    docs = UnstructuredURLLoader(urls=testurls).load()
+    print("Loading data")
+    print(docs)
+    return docs  # Return the loaded documents
+docs = load_documents()
+vectorstore = Qdrant.from_documents(docs, embeddings, location=":memory:", prefer_grpc=False, collection_name="test_db")
+retriever = vectorstore.as_retriever()
+rag_prompt_template = """Answer the question based only on the following context:
+{context}
+Question: {question}"""
+rag_prompt = PromptTemplate.from_template(rag_prompt_template)
+llm = OCIGenAI(
+    model_id="cohere.command",
+    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
+    compartment_id=compartment_id,
+    model_kwargs={"temperature": 0, "max_tokens": 300},
+)
+rag = RetrievalQA.from_chain_type(llm=llm, retriever=retriever, chain_type_kwargs={"prompt": rag_prompt})
+# Answer generation when a question is asked
+if st.button("Get Answer"):
+    question = st.session_state.question
+    # The chain was already built above on this script run.
+    # The retriever supplies the context, so only the question is passed in.
+    data = rag.invoke(question)
+    answer = data["result"]
+    st.write("Answer:", answer)
@@ -0,0 +1,24 @@
+# Document Summarization Using Oracle Generative AI
+
+Text summarization, a core NLP task, unlocks the ability to distill lengthy content into concise, informative summaries. Large Language Models (LLMs) serve as powerful tools for summarizing a wide array of texts, including news articles, research papers, and technical documents. However, summarizing large documents comes with its own set of challenges, necessitating the application of specialized summarization strategies to indexed content.
+
+In this article, we'll build a document summarization solution with Oracle Generative AI. By integrating Oracle Generative AI Service with LangChain, this codebase lets users summarize extensive documents with little effort.
+
+<img src="./files/docSummarize.png">
+</img>
+ 
+# When to use this asset?
+ 
+See the README document in the /files folder.
+ 
+# How to use this asset?
+ 
+See the README document in the /files folder.
+ 
+# License
+ 
+Copyright (c) 2024 Oracle and/or its affiliates.
+ 
+Licensed under the Universal Permissive License (UPL), Version 1.0.
+ 
+See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
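
The README above notes that large documents need specialized summarization strategies. One such strategy is map-reduce: summarize each chunk on its own, then summarize the combined partial summaries. The snippet below is a minimal sketch of that flow in plain Python. `first_sentence` is a stand-in for an LLM call, and `split_into_chunks` and `map_reduce_summary` are hypothetical helpers, not LangChain functions.

```python
from typing import Callable, List, Tuple


def split_into_chunks(text: str, chunk_size: int) -> List[str]:
    """Cut the text into fixed-size character chunks (no overlap)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def map_reduce_summary(
    text: str, summarize: Callable[[str], str], chunk_size: int = 2000
) -> Tuple[List[str], str]:
    """Summarize each chunk (map), then summarize the joined partials (reduce)."""
    chunks = split_into_chunks(text, chunk_size)
    partials = [summarize(chunk) for chunk in chunks]
    final = summarize(" ".join(partials))
    return partials, final


def first_sentence(text: str) -> str:
    """Stand-in summarizer: keep everything up to the first period."""
    head, sep, _ = text.partition(".")
    return head + sep


document = "Alpha one. Alpha two. Beta one. Beta two."
partials, final = map_reduce_summary(document, first_sentence, chunk_size=22)
print(partials, final)
```

Replacing `first_sentence` with a function that calls a model would give the same two-stage shape with real summaries.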
@@ -0,0 +1,28 @@
+# Prerequisites
+You need the latest versions of LangChain and the OCI Software Development Kit (SDK). To install or upgrade these two Python packages and the remaining dependencies, use the following commands:
+
+pip install -U langchain oci
+pip install -r requirements.txt
+
+# Running the application
+
+Have your compartment ID ready; the application asks for it in the sidebar.
+
+Then run the following command to launch the application:
+
+streamlit run ocidocumentSummarizeUpload.py
+
+# More Info Links
+
+How to run the application: https://www.youtube.com/watch?v=6A3KGyKy91Q&t=21s
+
+Different methods of summarization: https://medium.com/@anshuman4luv/revolutionizing-document-summarization-innovative-methods-with-langchain-and-large-language-models-f12272c7e8cd
+
+
+# License
+ 
+Copyright (c) 2024 Oracle and/or its affiliates.
+ 
+Licensed under the Universal Permissive License (UPL), Version 1.0.
+ 
+See [LICENSE](https://github.com/oracle-devrel/technology-engineering/blob/main/LICENSE) for more details.
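
The README above asks you to have your compartment ID ready. One possible way to avoid retyping it is to read it from an environment variable, sketched below. The variable name `OCI_COMPARTMENT_ID` and the `get_compartment_id` helper are assumptions made for this example; the application itself does not define them.

```python
import os


def get_compartment_id(env=os.environ, var_name="OCI_COMPARTMENT_ID"):
    """Return the compartment OCID from the environment, or raise if it is missing.

    Hypothetical helper: only checks that the value looks like an OCID.
    """
    value = env.get(var_name, "").strip()
    if not value.startswith("ocid1."):
        raise ValueError(f"{var_name} is not set to an OCID")
    return value


# A plain dict stands in for os.environ so the example runs anywhere.
compartment_id = get_compartment_id({"OCI_COMPARTMENT_ID": "ocid1.compartment.oc1..example"})
print(compartment_id)
```

The returned value could then serve as the default of the sidebar field instead of an empty string.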
@@ -0,0 +1,166 @@
+#Author: Anshuman Panda
+import streamlit as st
+import os
+from langchain.document_loaders import PyPDFLoader
+from langchain.prompts import PromptTemplate
+from langchain.text_splitter import RecursiveCharacterTextSplitter
+from langchain.chains.summarize import load_summarize_chain
+from langchain_community.llms import OCIGenAI
+from pypdf import PdfReader
+from io import BytesIO
+from typing import Any, Dict, List
+import re
+from langchain.docstore.document import Document
+
+
+
+@st.cache_data
+def parse_pdf(file: BytesIO) -> List[str]:
+    pdf = PdfReader(file)
+    output = []
+    for page in pdf.pages:
+        text = page.extract_text()
+        # Merge hyphenated words
+        text = re.sub(r"(\w+)-\n(\w+)", r"\1\2", text)
+        # Fix newlines in the middle of sentences
+        text = re.sub(r"(?<!\n\s)\n(?!\s\n)", " ", text.strip())
+        # Remove multiple newlines
+        text = re.sub(r"\n\s*\n", "\n\n", text)
+        output.append(text)
+    return output
+
+@st.cache_data
+def text_to_docs(text: str,chunk_size,chunk_overlap) -> List[Document]:
+    """Converts a string or list of strings to a list of Documents
+    with metadata."""
+    print("I am here Ansh")
+    print(chunk_size)
+    print(chunk_overlap)
+    if isinstance(text, str):
+        # Take a single string as one page
+        text = [text]
+    page_docs = [Document(page_content=page) for page in text]
+
+    # Add page numbers as metadata
+    for i, doc in enumerate(page_docs):
+        doc.metadata["page"] = i + 1
+
+    # Ansh Split pages into chunks
+    doc_chunks = []
+
+    for doc in page_docs:
+        text_splitter = RecursiveCharacterTextSplitter(
+            chunk_size=chunk_size,
+            separators=["\n\n", "\n", ".", "!", "?", ",", " ", ""],
+            chunk_overlap=chunk_overlap,
+        )
+        chunks = text_splitter.split_text(doc.page_content)
+        for i, chunk in enumerate(chunks):
+            # Use a separate name so the page-level `doc` is not overwritten mid-loop
+            chunk_doc = Document(
+                page_content=chunk, metadata={"page": doc.metadata["page"], "chunk": i}
+            )
+            # Add a "page-chunk" source identifier as metadata
+            chunk_doc.metadata["source"] = f"{chunk_doc.metadata['page']}-{chunk_doc.metadata['chunk']}"
+            doc_chunks.append(chunk_doc)
+    return doc_chunks
+
+
+def custom_summary(docs, llm, custom_prompt, chain_type, num_summaries):
+    print("I am inside custom summary")
+    custom_prompt = custom_prompt + """:\n {text}"""
+    print("Ansh custom Prompt is ------>")
+    print(custom_prompt)
+    COMBINE_PROMPT = PromptTemplate(template=custom_prompt, input_variables = ["text"])
+    print("Ansh combine Prompt is ------>")
+    print(COMBINE_PROMPT)
+    MAP_PROMPT = PromptTemplate(template="Summarize:\n{text}", input_variables=["text"])
+    print("Ansh MAP_PROMPT Prompt is ------>")
+    print(MAP_PROMPT)
+    if chain_type == "map_reduce":
+        chain = load_summarize_chain(llm,chain_type=chain_type,
+                                     map_prompt=MAP_PROMPT,
+                                     combine_prompt=COMBINE_PROMPT)
+    else:
+        chain = load_summarize_chain(llm,chain_type=chain_type)
+    print("Chain is --->")
+    print(chain)
+    summaries = []
+    for i in range(num_summaries):
+        summary_output = chain({"input_documents": docs}, return_only_outputs=True)["output_text"]
+        print("Summaries------------->")
+        print(summary_output)
+        summaries.append(summary_output)
+    
+    return summaries
+
+
+def main():
+    st.set_page_config(layout="wide")
+    hide_streamlit_style = """
+            <style>
+            [data-testid="stToolbar"] {visibility: hidden !important;}
+            footer {visibility: hidden !important;}
+            </style>
+            """
+    st.markdown(hide_streamlit_style, unsafe_allow_html=True) 
+    st.title("Document Summarization App")
+    
+    llm_name = st.sidebar.selectbox("LLM",["cohere.command","meta.llama-2-70b-chat"])
+    
+    chain_type = st.sidebar.selectbox("Chain Type", ["map_reduce", "stuff", "refine"])
+    chunk_size = st.sidebar.slider("Chunk Size", min_value=20, max_value = 5000,
+                                   step=10, value=2000)
+    chunk_overlap = st.sidebar.slider("Chunk Overlap", min_value=5, max_value = 5000,
+                                   step=10, value=200)        
+    user_prompt = st.text_input("Enter the document summary prompt", value= "Compose a brief summary of this text. ")
+    temperature = st.sidebar.number_input("Set the GenAI Temperature",
+                                              min_value = 0.0,
+                                              max_value=1.0,
+                                              step=0.1,
+                                              value=0.5)
+    max_token = st.sidebar.slider("Max Output size", min_value=200, max_value = 1000,step=10, value=200) 
+    compartment_id = st.sidebar.text_input("Enter the compartment id", value= "")
+                                             
+    opt = "Upload-own-file"
+    pages = None
+    if opt == "Upload-own-file":
+        uploaded_file = st.file_uploader(
+        "**Upload a Pdf file :**",
+            type=["pdf"],
+            )
+        if uploaded_file:
+            # The uploader only accepts PDF files, so parse the upload as a PDF
+            doc = parse_pdf(uploaded_file)
+            pages = text_to_docs(doc, chunk_size, chunk_overlap)
+            print("Pages are here")
+            print(pages)
+
+
+            page_holder = st.empty()
+            if pages:
+                print("Inside if PAges")
+                st.write("PDF loaded successfully")
+                with page_holder.expander("File Content", expanded=False):
+                    pages
+
+        
+                llm = OCIGenAI(
+                    model_id=llm_name,
+                    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
+                    compartment_id=compartment_id,
+                    model_kwargs={"temperature": temperature, "max_tokens": max_token},
+                )
+
+                if st.button("Summarize"):
+                    with st.spinner('Summarizing....'):
+                        result = custom_summary(pages, llm, user_prompt, chain_type, 1)
+                        st.write("Summary:")
+                    for summary in result:
+                        st.write(summary)
+            else:
+                st.warning("No file found. Upload a file to summarize!")
+            
+if __name__=="__main__":
+    main()
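
To make the three substitutions in `parse_pdf` easier to follow, here is a small worked example that applies the same regular expressions to a hand-written string. The `clean_page_text` wrapper and the sample text are assumptions for illustration; the patterns themselves are copied from the file above.

```python
import re


def clean_page_text(text: str) -> str:
    """Apply the same three cleanup steps that parse_pdf runs on each page."""
    # Merge words hyphenated across a line break: "infor-\nmation" -> "information"
    text = re.sub(r"(\w+)-\n(\w+)", r"\1\2", text)
    # Turn a lone newline inside a sentence into a space,
    # leaving "newline, whitespace, newline" separators alone
    text = re.sub(r"(?<!\n\s)\n(?!\s\n)", " ", text.strip())
    # Normalize a newline-whitespace-newline run to one blank line
    text = re.sub(r"\n\s*\n", "\n\n", text)
    return text


sample = "infor-\nmation is\nuseful.\n \nNext para"
cleaned = clean_page_text(sample)
print(repr(cleaned))
```

Note that the second pattern only protects breaks written as newline, whitespace, newline; a bare double newline with nothing in between is turned into spaces by that step.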
@@ -0,0 +1,6 @@
+streamlit
+langchain
+unstructured
+langchain_community
+pypdf
+transformers
@@ -6,6 +6,10 @@ This section contains various examples related to Application Integration: demo
 
 ## Architecture Center
 
+- [Enable a Low Code Modular LLM App Engine using Oracle Integration and OCI Generative AI](https://docs.oracle.com/en/solutions/oci-generative-ai-integration/index.html)
+    - This reference architecture covers the considerations and recommendations for enabling an AI-based, modular, event-driven LLM App Engine using a low-code approach, with Oracle Integration as the LLM orchestrator alongside OCI Generative AI and other OCI services
+    - Build enterprise-grade, modular, scalable, secure & maintainable LLM Apps
+
 - [Enable multicloud integrations from Oracle Cloud ERP to Microsoft Azure SQL Database](https://docs.oracle.com/en/solutions/oci-multicloud-erp-azure/index.html)
     - Reference Architecture on the Oracle Architecture Center, which provides the necessary considerations and recommendations to enable a multicloud, event-driven, and no-code integration solution to receive real-time feeds from Oracle Cloud ERP and send those to a private Microsoft Azure SQL Database, leveraging a component Oracle Integration provides called the connectivity agent, to facilitate on-premises/multicloud integrations
 - [Implement message-level encryption in Oracle Integration Cloud using OCI Vault](https://docs.oracle.com/en/solutions/oic-message-level-encryption/index.html#GUID-5C843938-A470-4584-9048-4361025358C6)