Commit 652f960
Add UpCamp Agent Example (#7)
* Modified manifest to add needed prompts; updated README to explain the requirements, goals, and usage of this agent. Also modified the system prompt to better accomplish the goal. * Updated README to improve readability --------- Co-authored-by: Matias Fornara <matias.fornara@abstracta.com.uy>
1 parent 5aa7aeb commit 652f960

File tree

17 files changed

+2310
-0
lines changed


agent-upcamp/Dockerfile

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
FROM python:3.12

RUN pip install poetry

WORKDIR /usr/src/app

COPY pyproject.toml poetry.lock ./

RUN poetry install

COPY gpt_agent ./gpt_agent
COPY entrypoint.sh entrypoint.sh

COPY .env .env

ADD https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh wait-for-it.sh
RUN chmod +x wait-for-it.sh

ENTRYPOINT [ "./entrypoint.sh" ]

CMD ["poetry", "run", "python", "-m", "gpt_agent"]

agent-upcamp/README.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
# UpCamp Agent

This is an example agent for **supporting non-experienced workers in their daily activities**, based on [agent-extended](../agent-extended/README.md), which integrates with OpenAI (or Azure OpenAI) and provides a basic experience similar to ChatGPT, including authentication, proper session handling, response streaming and transcript support.

It is developed using the following:

* [FastAPI](https://fastapi.tiangolo.com/)
* [LangChain](https://www.langchain.com/)
* [Poetry](https://python-poetry.org/)

The agent is configured with a system prompt aimed at helping workers with their daily tasks. This prompt can be modified to suit your needs by editing the corresponding environment variable (for reference, check the `SYSTEM_PROMPT` variable in the [sample.env](./sample.env) file).

The agent is also provisioned with six prompts to facilitate user interaction and foster usage. You can add more by editing the prompts collection in [manifest.json](./gpt_agent/assets/manifest.json), but bear in mind that if you want to use more advanced or larger prompts, it may be a good idea to create a separate agent for that specific purpose and use the `SYSTEM_PROMPT` variable to instruct it.

## Use Cases

### 1. Chat with GPT

As mentioned before, you can use this agent to chat and iterate over problems as you would with ChatGPT.

### 2. Use Predefined Prompts

You can access the prompt list by typing `/` and selecting the prompt you want to use. All the prompts in this example have an input variable, which automatically places the cursor where the user should fill in the proper input.

![demo](./demo.gif)

## Other capabilities

### Authentication

The browser extension provides support for [OpenID Connect](https://en.wikipedia.org/wiki/OpenID#OpenID_Connect_(OIDC)) authentication.

Including an `auth` section in `manifest.json` with the following properties will enable this functionality:

* `url`: the OpenID base URL. Check [sample.env](./sample.env) for some examples.
* `clientId`: the client ID registered in your OAuth provider for the copilot.
* `scopes`: the scopes required by your copilot. Check [sample.env](./sample.env) for some examples.

The provided [sample.env](./sample.env) includes configurations for using Keycloak or Microsoft Entra ID.

#### Microsoft Entra ID

1. Register the Chrome extension in Azure as described [here](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app).
   E.g.: use `browser-copilot` as the name and `https://nnllgflhcpaigpehhmbdhpjpakmofemh.chromiumapp.org/` as the redirect URI (check the proper ID for the Chrome extension by accessing "Manage extensions" in Chrome).
   Remember to enable user assignment and assign the users that should be able to access the copilot.
2. Register the backend agent (API) for the copilot in Azure as described [here](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-configure-app-expose-web-apis) and [here](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-configure-app-access-web-apis).
   E.g.: use `gpt-copilot` as the name.
   Remember to expose the API and add a scope (e.g.: `Chat`).
   Also, remember to add the API to the extension (`browser-copilot`) app registration.
3. Use the extension (`browser-copilot`) client ID and the proper API scope (e.g.: `api://2e990215-c550-468b-950e-3008832f3fbb/Ask openid profile`) in your `.env` file.

#### Google OAuth

To add Google auth, you can use Keycloak and configure Google as an identity provider.

To do so with the provided Keycloak, **which should not be used for production scenarios**, go to the [identity providers section in the Keycloak admin console](http://localhost:8080/admin/master/console/#/browser-copilot/identity-providers) with `admin`/`admin` credentials, select the `browser-copilot` realm, and add Google as a provider, configuring the proper client ID and client secret obtained from Google.
In Google, you will need to create OAuth credentials as described [here](https://developers.google.com/identity/protocols/oauth2/web-server#creatingcred), using the redirect URI you get from the Keycloak Google provider registration page (e.g.: `http://localhost:8080/realms/browser-copilot/broker/google/endpoint`).

For the time being, we haven't found a generic solution that allows direct integration with Google Auth.
[Here](https://stackoverflow.com/questions/60724690/using-google-oidc-with-code-flow-and-pkce) is an issue we faced when trying it.
Another issue is that [Google's proposed solution for Chrome extensions](https://developer.chrome.com/docs/extensions/how-to/integrate/oauth) requires knowing the client ID before building and publishing the extension, which prevents users from using their own Google OAuth config without rebuilding the extension.
If you have any ideas, please let us know by creating an issue or discussion in this repository.
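The predefined prompts above use a `${input}` placeholder that the extension fills with the user's text. As a purely illustrative sketch (not the extension's actual substitution code), Python's `string.Template` happens to use the same `${name}` syntax:

```python
import string


def fill_prompt(template: str, user_input: str) -> str:
    # safe_substitute leaves unrelated "$" sequences untouched instead of raising
    return string.Template(template).safe_substitute(input=user_input)


prompt = "Estoy recibiendo este error: ${input}. ¿Puedes ayudarme a entender qué significa y cómo solucionarlo?"
print(fill_prompt(prompt, "TypeError: 'NoneType' object is not iterable"))
```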

agent-upcamp/demo.gif

2.32 MB

agent-upcamp/entrypoint.sh

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
#!/bin/sh

set -e
if [ -n "$OPENID_URL" ]; then
  SERVER="${OPENID_URL##http://}"
  SERVER="${SERVER%%/*}"
  /usr/src/app/wait-for-it.sh -t 60 "${SERVER}"
fi

exec "$@"
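The two shell parameter expansions above strip the `http://` prefix and everything after the first `/` to obtain the `host:port` pair that wait-for-it.sh polls. A minimal Python equivalent (the function name is ours, for illustration only):

```python
def server_from_url(openid_url: str) -> str:
    # ${OPENID_URL##http://} -> drop the scheme prefix
    server = openid_url.removeprefix("http://")
    # ${SERVER%%/*} -> drop everything from the first "/" onwards
    return server.split("/", 1)[0]


print(server_from_url("http://keycloak:8080/realms/browser-copilot"))  # keycloak:8080
```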

agent-upcamp/gpt_agent/__init__.py

Whitespace-only changes.

agent-upcamp/gpt_agent/__main__.py

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
import sys

import dotenv
import uvicorn

if __name__ == "__main__":
    dotenv.load_dotenv()
    uvicorn.run("gpt_agent.api:app", host="0.0.0.0", port=8001, log_level="info", reload=len(sys.argv) > 1)

agent-upcamp/gpt_agent/agent.py

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
import asyncio
import datetime
import enum
import logging
import os
from typing import List, AsyncIterator, Optional
from pydantic import BaseModel

from langchain.agents import Tool, OpenAIFunctionsAgent, AgentExecutor
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.memory import ConversationBufferMemory, FileChatMessageHistory
from langchain.prompts import MessagesPlaceholder
from langchain.schema import SystemMessage
from langchain.tools import tool
from langchain_community.chat_models import AzureChatOpenAI, ChatOpenAI
from openai import OpenAI, AzureOpenAI

from gpt_agent.domain import Session
from gpt_agent.file_system_repos import get_session_path

logging.getLogger("openai").level = logging.DEBUG


# just a sample tool to showcase how you can create your own set of tools
@tool
def clock() -> str:
    """gets the current time"""
    return str(datetime.datetime.now())


class AgentAction(enum.Enum):
    MESSAGE = "message"
    CLICK = "click"
    FILL = "fill"
    GOTO = "goto"


class AgentStep(BaseModel):
    action: AgentAction
    selector: Optional[str] = None
    value: Optional[str] = None


class AgentFlow(BaseModel):
    steps: List[AgentStep]

    @staticmethod
    def message(text: str) -> 'AgentFlow':
        return AgentFlow(steps=[AgentStep(action=AgentAction.MESSAGE, value=text)])


# a sample tool to showcase how you can automate navigation in the browser
@tool(return_direct=True)
def contact_abstracta(full_name: str) -> str:
    """navigates to abstracta.us and fills the contact form with the given full name"""
    return AgentFlow(steps=[
        AgentStep(action=AgentAction.GOTO, value='https://abstracta.us'),
        AgentStep(action=AgentAction.CLICK, selector='xpath://a[@href="./contact-us"]'),
        AgentStep(action=AgentAction.FILL, selector='#fullname', value=full_name),
        AgentStep(action=AgentAction.MESSAGE, value="I have filled the contact form with your name.")
    ]).model_dump_json()


class Agent:

    def __init__(self, session: Session):
        self._session = session
        message_history = FileChatMessageHistory(get_session_path(session.id) + "/chat_history.json")
        self._memory = ConversationBufferMemory(memory_key="chat_history", chat_memory=message_history,
                                                return_messages=True)
        self._agent = self._build_agent(self._memory, [clock, contact_abstracta])

    def _build_agent(self, memory: ConversationBufferMemory, tools: List[Tool]) -> AgentExecutor:
        llm = self._build_llm()
        prompt = OpenAIFunctionsAgent.create_prompt(
            system_message=SystemMessage(content=os.getenv("SYSTEM_PROMPT")),
            extra_prompt_messages=[MessagesPlaceholder(variable_name=memory.memory_key)],
        )
        agent = OpenAIFunctionsAgent(llm=llm, tools=tools, prompt=prompt)
        return AgentExecutor(
            agent=agent,
            tools=tools,
            memory=memory,
            verbose=True,
            return_intermediate_steps=False,
            max_iterations=int(os.getenv("AGENT_MAX_ITERATIONS", "3"))
        )

    def _build_llm(self):
        temperature = float(os.getenv("TEMPERATURE"))
        base_url = os.getenv("OPENAI_API_BASE")
        if self._is_azure(base_url):
            return AzureChatOpenAI(deployment_name=os.getenv("AZURE_DEPLOYMENT_NAME"), temperature=temperature,
                                   verbose=True, streaming=True)
        else:
            return ChatOpenAI(model_name=os.getenv("MODEL_NAME"), temperature=temperature, verbose=True, streaming=True)

    @staticmethod
    def _is_azure(base_url: str) -> bool:
        return base_url and ".openai.azure.com" in base_url

    def start_session(self):
        self._memory.chat_memory.add_user_message("this is my locale: " + self._session.locales[0])

    def transcript(self, audio_file_path: str) -> str:
        base_url = os.getenv("OPENAI_WHISPER_API_BASE", os.getenv("OPENAI_API_BASE"))
        api_key = os.getenv("OPENAI_WHISPER_API_KEY", os.getenv("OPENAI_API_KEY"))
        api_version = os.getenv("OPENAI_WHISPER_API_VERSION", os.getenv("OPENAI_API_VERSION"))
        deployment_name = os.getenv("AZURE_WHISPER_DEPLOYMENT_NAME", os.getenv("AZURE_DEPLOYMENT_NAME"))
        client = AzureOpenAI(azure_endpoint=base_url, api_version=api_version, api_key=api_key,
                             azure_deployment=deployment_name) \
            if self._is_azure(base_url) else OpenAI(base_url=base_url, api_key=api_key)
        locale = self._session.locales[0]
        lang_separator_pos = locale.find("-")
        language = locale[0:lang_separator_pos] if lang_separator_pos >= 0 else locale
        ret = client.audio.transcriptions.create(model="whisper-1", file=open(audio_file_path, 'rb'),
                                                 language=language)
        return ret.text

    async def ask(self, question: str) -> AsyncIterator[AgentFlow | str]:
        callback = AsyncIteratorCallbackHandler()
        task = asyncio.create_task(self._agent.arun(input=question, callbacks=[callback]))
        resp = ""
        async for token in callback.aiter():
            resp += token
            yield token
        ret = await task
        # when using tools, tokens are not passed to the callback handler, so we need to get the
        # response directly from the agent run call
        if ret != resp:
            if ret.startswith("{\"steps\":"):
                try:
                    yield AgentFlow.model_validate_json(ret)
                except Exception:
                    logging.exception("Error parsing agent response")
                    yield ret
            else:
                yield ret
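The `contact_abstracta` tool above returns an `AgentFlow` serialized as JSON, which the extension replays as browser actions. A dependency-free sketch of that payload shape, using plain dicts instead of the pydantic models (purely illustrative):

```python
import json


def build_contact_flow(full_name: str) -> str:
    # Mirrors the steps contact_abstracta returns, without pydantic
    steps = [
        {"action": "goto", "value": "https://abstracta.us"},
        {"action": "click", "selector": 'xpath://a[@href="./contact-us"]'},
        {"action": "fill", "selector": "#fullname", "value": full_name},
        {"action": "message", "value": "I have filled the contact form with your name."},
    ]
    return json.dumps({"steps": steps})


payload = json.loads(build_contact_flow("Jane Doe"))
print(payload["steps"][2]["value"])  # Jane Doe
```

Note that the real `ask()` method recognizes such payloads by the `{"steps":` prefix and yields them as an `AgentFlow` instead of plain text.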

agent-upcamp/gpt_agent/api.py

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
import logging
import os
import traceback
from typing import AsyncIterator, Annotated, Optional

from fastapi import Depends, FastAPI, HTTPException, status, Request
from fastapi.responses import FileResponse, StreamingResponse, Response
from fastapi.templating import Jinja2Templates
from pydantic import BaseModel
from sse_starlette.sse import ServerSentEvent

from gpt_agent.agent import Agent, AgentAction
from gpt_agent.auth import get_current_user
from gpt_agent.domain import Session, Question, TranscriptionQuestion, SessionBase
from gpt_agent.file_system_repos import SessionsRepository, QuestionsRepository, TranscriptionsRepository

logging.basicConfig()
logger = logging.getLogger("gpt_agent")
logger.level = logging.DEBUG
logging.getLogger().level = logging.DEBUG

app = FastAPI()
assets_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'assets')
templates = Jinja2Templates(directory=assets_path)
sessions_repo = SessionsRepository()
questions_repo = QuestionsRepository()
transcriptions_repo = TranscriptionsRepository()


@app.get('/manifest.json')
async def get_manifest(request: Request) -> Response:
    return templates.TemplateResponse("manifest.json", {
        "request": request,
        "openid_url": os.getenv("MANIFEST_OPENID_URL", os.getenv("OPENID_URL")),
        "openid_client_id": os.getenv("OPENID_CLIENT_ID"),
        "openid_scope": os.getenv("OPENID_SCOPE"),
        "contact_email": os.getenv("CONTACT_EMAIL")
    }, media_type='application/json')


@app.get('/logo.png')
async def get_logo() -> FileResponse:
    return FileResponse(os.path.join(assets_path, 'logo.png'))


@app.post('/sessions', status_code=status.HTTP_201_CREATED)
async def create_session(req: SessionBase, user: Annotated[str, Depends(get_current_user)]) -> Session:
    ret = Session(**req.model_dump(), user=user)
    await sessions_repo.save_session(ret)
    Agent(ret).start_session()
    return ret


class QuestionRequest(BaseModel):
    question: Optional[str] = ""


@app.post('/sessions/{session_id}/questions')
async def answer_question(
        session_id: str, req: QuestionRequest, user: Annotated[str, Depends(get_current_user)]) -> StreamingResponse:
    session = await _find_session(session_id, user)
    # This copilot uses response streaming, which allows users to start getting a response as soon
    # as possible. This is particularly important when interacting with LLMs that support response
    # streaming and may take some time to finish answering a given request.
    # If you don't want to use response streaming, you can just return a pydantic object like in
    # the create session endpoint.
    return StreamingResponse(agent_response_stream(req, session), media_type="text/event-stream")


async def _find_session(session_id: str, user: str) -> Session:
    ret = await sessions_repo.find_session(session_id)
    if not ret or ret.user != user:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND,
                            detail=f'session {session_id} not found')
    return ret


async def agent_response_stream(req: QuestionRequest, session: Session) -> AsyncIterator[bytes]:
    try:
        answer_stream = Agent(session).ask(req.question)
        complete_answer = ""
        async for token in answer_stream:
            if isinstance(token, str):
                complete_answer = complete_answer + token
                yield ServerSentEvent(data=token).encode()
            else:
                complete_answer = complete_answer + token.model_dump_json()
                yield ServerSentEvent(event="flow", data=token.model_dump_json()).encode()
        ret = Question(question=req.question, answer=complete_answer, session=session)
        await questions_repo.save_question(ret)
    except Exception as e:
        traceback.print_exception(e)
        yield ServerSentEvent(event="error").encode()


class TranscriptionRequest(BaseModel):
    file: Optional[str] = ""


class TranscriptionResponse(BaseModel):
    text: str


@app.post('/sessions/{session_id}/transcriptions')
async def answer_transcription(session_id: str, req: TranscriptionRequest, user: Annotated[str, Depends(get_current_user)]) -> TranscriptionResponse:
    session = await _find_session(session_id, user)
    ret = TranscriptionQuestion(base64=req.file, session=session)
    audio_file_path = await transcriptions_repo.save_audio(ret)
    text = Agent(session).transcript(audio_file_path)
    return TranscriptionResponse(text=text)
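`agent_response_stream` frames each token as a server-sent event via `sse_starlette`, with plain tokens as unnamed events and flows/errors as named ones. A simplified sketch of that wire format (`field: value` lines terminated by a blank line; exact separators may differ from sse_starlette's actual output):

```python
from typing import Optional


def sse_event(data: str, event: Optional[str] = None) -> bytes:
    # An optional "event:" field selects a named event type ("flow", "error");
    # the trailing blank line terminates the event
    out = ""
    if event is not None:
        out += f"event: {event}\n"
    out += f"data: {data}\n\n"
    return out.encode()


print(sse_event("hola", event="flow").decode())
```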
1.94 KB

agent-upcamp/gpt_agent/assets/manifest.json

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
{
  "id": "upcamp",
  "name": "UpCamp Agent",
  {% if openid_url %}
  "auth": {
    "url": "{{ openid_url }}",
    "clientId": "{{ openid_client_id }}",
    "scope": "{{ openid_scope }}"
  },
  {% endif %}
  "capabilities": [ "transcripts" ],
  "welcomeMessage": "¡Hola! Soy tu asistente de UpCamp.\n\nPuedo ayudarte a mejorar código, documentar APIs, explicar conceptos técnicos, generar casos de prueba, analizar errores y superar obstáculos en tus proyectos.\n\nRecuerda que tipeando / puedes acceder a los prompts.\n\n¿En qué te puedo asistir hoy?",
  "prompts": [
    { "name" : "Mejorar Código", "text" : "Explícame este fragmento de código y cómo podría mejorarlo: ${input}" },
    { "name" : "Documentación API", "text" : "Necesito crear una documentación clara para esta API: ${input}. ¿Puedes ayudarme con una estructura y ejemplos?" },
    { "name" : "Explicar Concepto", "text" : "¿Puedes explicarme de manera simple cómo funciona ${input} y darme un ejemplo práctico?" },
    { "name" : "Generar Casos de Prueba", "text" : "Necesito escribir casos de prueba para esta función: ${input}. ¿Puedes ayudarme con ejemplos de casos de prueba que cubran diferentes escenarios utilizando la técnica de clases de equivalencia?" },
    { "name" : "Analizar Error", "text" : "Estoy recibiendo este error: ${input}. ¿Puedes ayudarme a entender qué significa y cómo solucionarlo?" },
    { "name" : "Ayuda con Bloqueos", "text" : "Estoy atascado tratando de implementar ${input} con el enfoque actual que tengo. ¿Qué alternativas o mejores prácticas me sugieres?" }
  ],
  "contactEmail": "{{ contact_email }}"
}
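The `{% if openid_url %}` block above means the `auth` section only appears in the rendered manifest when `OPENID_URL` is configured. A plain-Python sketch of that conditional rendering (not the Jinja2 template itself; parameter names mirror the template's variables):

```python
import json


def render_manifest(openid_url=None, openid_client_id=None, openid_scope=None, contact_email=""):
    # The auth block is emitted only when openid_url is set, just like the
    # {% if openid_url %} guard in the template
    manifest = {"id": "upcamp", "name": "UpCamp Agent"}
    if openid_url:
        manifest["auth"] = {"url": openid_url, "clientId": openid_client_id, "scope": openid_scope}
    manifest["capabilities"] = ["transcripts"]
    manifest["contactEmail"] = contact_email
    return json.dumps(manifest)


print("auth" in json.loads(render_manifest()))  # False
```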
