Commit 652f960
Add UpCamp Agent Example (#7)
* Modified manifest to add needed prompts; updated README to explain the requirements, goals, and usage of this agent. Also modified the system prompt to better accomplish the goal. * Updated README to improve readability --------- Co-authored-by: Matias Fornara <matias.fornara@abstracta.com.uy>
1 parent 5aa7aeb commit 652f960

File tree

17 files changed

+2310
-0
lines changed


agent-upcamp/Dockerfile

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
FROM python:3.12

RUN pip install poetry

WORKDIR /usr/src/app

COPY pyproject.toml poetry.lock ./

RUN poetry install

COPY gpt_agent ./gpt_agent
COPY entrypoint.sh entrypoint.sh

COPY .env .env

ADD https://raw.githubusercontent.com/vishnubob/wait-for-it/master/wait-for-it.sh wait-for-it.sh
RUN chmod +x wait-for-it.sh

ENTRYPOINT [ "./entrypoint.sh" ]

CMD ["poetry", "run", "python", "-m", "gpt_agent"]

agent-upcamp/README.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
# UpCamp Agent

This is an example agent for **supporting non-experienced workers in their daily activities**, based on [agent-extended](../agent-extended/README.md), which integrates with OpenAI (or Azure OpenAI) and provides a basic experience similar to ChatGPT, including authentication, proper session handling, response streaming and transcript support.

It is developed using the following:

* [FastAPI](https://fastapi.tiangolo.com/)
* [LangChain](https://www.langchain.com/)
* [Poetry](https://python-poetry.org/)

The agent is configured with a system prompt aimed at helping workers with their daily tasks. This prompt can be modified to suit your needs by editing the corresponding environment variable (for reference, check the `SYSTEM_PROMPT` variable in the [sample.env](./sample.env) file).

The agent is also provisioned with six prompts to facilitate user interaction and foster usage. You can add more by editing the prompts collection in [manifest.json](./gpt_agent/assets/manifest.json), but bear in mind that if you want to use more advanced or larger prompts, it may be a good idea to create a separate agent for that specific purpose and use the `SYSTEM_PROMPT` variable to instruct it.

## Use Cases

### 1. Chat with GPT

As mentioned before, you can use this agent to chat and iterate over problems as you would with ChatGPT.

### 2. Use Predefined Prompts

You can access the prompt list by typing `/` and selecting the prompt you want to use. All the prompts in this example have an input variable, which automatically places the cursor where the user should fill in the proper input.

![demo](./demo.gif)

## Other capabilities

### Authentication

The browser extension provides support for [OpenID Connect](https://en.wikipedia.org/wiki/OpenID#OpenID_Connect_(OIDC)) authentication.

Including an `auth` section in `manifest.json` with the following properties will enable this functionality:

* `url`: the OpenID base URL. Check [sample.env](./sample.env) for some examples.
* `clientId`: the client ID registered in your OAuth provider for the copilot.
* `scopes`: the scopes required by your copilot. Check [sample.env](./sample.env) for some examples.

The provided [sample.env](./sample.env) includes configurations for using Keycloak or Microsoft Entra ID.

#### Microsoft Entra ID

1. Register the Chrome extension in Azure as described [here](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-register-app).
   E.g.: use `browser-copilot` as the name and `https://nnllgflhcpaigpehhmbdhpjpakmofemh.chromiumapp.org/` as the redirect URI (check the proper ID for the Chrome extension by accessing "Manage extensions" in Chrome).
   Remember to enable user assignment and assign the users that should be able to access the copilot.
2. Register the backend agent (API) for the copilot in Azure as described [here](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-configure-app-expose-web-apis) and [here](https://learn.microsoft.com/en-us/entra/identity-platform/quickstart-configure-app-access-web-apis).
   E.g.: use `gpt-copilot` as the name.
   Remember to expose the API and add a scope (e.g.: `Chat`).
   Also, remember to add the API to the extension (`browser-copilot`) app registration.
3. Use the extension (`browser-copilot`) client ID and the proper API scope (e.g.: `api://2e990215-c550-468b-950e-3008832f3fbb/Ask openid profile`) in your `.env` file.

#### Google OAuth

To add Google auth, you can use Keycloak and configure Google as an identity provider.

To do so with the provided Keycloak, **which should not be used for production scenarios**, go to the [identity providers section in the Keycloak admin console](http://localhost:8080/admin/master/console/#/browser-copilot/identity-providers) with `admin`/`admin` credentials, select the `browser-copilot` realm, and add Google as a provider, configuring the proper client ID and client secret obtained from Google.
In Google, you will need to create OAuth credentials as described [here](https://developers.google.com/identity/protocols/oauth2/web-server#creatingcred), using the redirect URI you get from the Keycloak Google provider registration page (e.g.: `http://localhost:8080/realms/browser-copilot/broker/google/endpoint`).

For the time being, we haven't found a generic solution that allows direct integration with Google Auth.
[Here](https://stackoverflow.com/questions/60724690/using-google-oidc-with-code-flow-and-pkce) is an issue we faced when trying it.
Another issue is that [Google's proposed solution for Chrome extensions](https://developer.chrome.com/docs/extensions/how-to/integrate/oauth) requires knowing the client ID before building and publishing the extension, which prevents users from using their own Google OAuth config without rebuilding the extension.
If you have any ideas, please let us know by creating an issue or discussion in this repository.
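The predefined prompts above use a `${input}` placeholder that the extension fills with the user's text. As a purely illustrative sketch (not the extension's actual substitution code), Python's `string.Template` happens to use the same `${name}` syntax:

```python
import string


def fill_prompt(template: str, user_input: str) -> str:
    # safe_substitute leaves unrelated "$" sequences untouched instead of raising
    return string.Template(template).safe_substitute(input=user_input)


prompt = "Estoy recibiendo este error: ${input}. ¿Puedes ayudarme a entender qué significa y cómo solucionarlo?"
print(fill_prompt(prompt, "TypeError: 'NoneType' object is not iterable"))
```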

agent-upcamp/demo.gif

2.32 MB

agent-upcamp/entrypoint.sh

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
#!/bin/sh

set -e
if [ -n "$OPENID_URL" ]; then
  SERVER="${OPENID_URL##http://}"
  SERVER="${SERVER%%/*}"
  /usr/src/app/wait-for-it.sh -t 60 "${SERVER}"
fi

exec "$@"
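The two shell parameter expansions above strip the `http://` prefix and everything after the first `/` to obtain the `host:port` pair that wait-for-it.sh polls. A minimal Python equivalent (the function name is ours, for illustration only):

```python
def server_from_url(openid_url: str) -> str:
    # ${OPENID_URL##http://} -> drop the scheme prefix
    server = openid_url.removeprefix("http://")
    # ${SERVER%%/*} -> drop everything from the first "/" onwards
    return server.split("/", 1)[0]


print(server_from_url("http://keycloak:8080/realms/browser-copilot"))  # keycloak:8080
```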

agent-upcamp/gpt_agent/__init__.py

Whitespace-only changes.

agent-upcamp/gpt_agent/__main__.py

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
import sys

import dotenv
import uvicorn

if __name__ == "__main__":
    dotenv.load_dotenv()
    uvicorn.run("gpt_agent.api:app", host="0.0.0.0", port=8001, log_level="info", reload=len(sys.argv) > 1)

agent-upcamp/gpt_agent/agent.py

Lines changed: 137 additions & 0 deletions
@@ -0,0 +1,137 @@
import asyncio
import datetime
import enum
import logging
import os
from typing import List, AsyncIterator, Optional
from pydantic import BaseModel

from langchain.agents import Tool, OpenAIFunctionsAgent, AgentExecutor
from langchain.callbacks import AsyncIteratorCallbackHandler
from langchain.memory import ConversationBufferMemory, FileChatMessageHistory
from langchain.prompts import MessagesPlaceholder
from langchain.schema import SystemMessage
from langchain.tools import tool
from langchain_community.chat_models import AzureChatOpenAI, ChatOpenAI
from openai import OpenAI, AzureOpenAI

from gpt_agent.domain import Session
from gpt_agent.file_system_repos import get_session_path

logging.getLogger("openai").level = logging.DEBUG


# just a sample tool to showcase how you can create your own set of tools
@tool
def clock() -> str:
    """gets the current time"""
    return str(datetime.datetime.now())


class AgentAction(enum.Enum):
    MESSAGE = "message"
    CLICK = "click"
    FILL = "fill"
    GOTO = "goto"


class AgentStep(BaseModel):
    action: AgentAction
    selector: Optional[str] = None
    value: Optional[str] = None


class AgentFlow(BaseModel):
    steps: List[AgentStep]

    @staticmethod
    def message(text: str) -> 'AgentFlow':
        return AgentFlow(steps=[AgentStep(action=AgentAction.MESSAGE, value=text)])


# a sample tool to showcase how you can automate navigation in the browser
@tool(return_direct=True)
def contact_abstracta(full_name: str) -> str:
    """navigates to abstracta.us and fills the contact form with the given full name"""
    return AgentFlow(steps=[
        AgentStep(action=AgentAction.GOTO, value='https://abstracta.us'),
        AgentStep(action=AgentAction.CLICK, selector='xpath://a[@href="./contact-us"]'),
        AgentStep(action=AgentAction.FILL, selector='#fullname', value=full_name),
        AgentStep(action=AgentAction.MESSAGE, value="I have filled the contact form with your name.")
    ]).model_dump_json()


class Agent:

    def __init__(self, session: Session):
        self._session = session
        message_history = FileChatMessageHistory(get_session_path(session.id) + "/chat_history.json")
        self._memory = ConversationBufferMemory(memory_key="chat_history", chat_memory=message_history,
                                                return_messages=True)
        self._agent = self._build_agent(self._memory, [clock, contact_abstracta])

    def _build_agent(self, memory: ConversationBufferMemory, tools: List[Tool]) -> AgentExecutor:
        llm = self._build_llm()
        prompt = OpenAIFunctionsAgent.create_prompt(
            system_message=SystemMessage(content=os.getenv("SYSTEM_PROMPT")),
            extra_prompt_messages=[MessagesPlaceholder(variable_name=memory.memory_key)],
        )
        agent = OpenAIFunctionsAgent(llm=llm, tools=tools, prompt=prompt)
        return AgentExecutor(
            agent=agent,
            tools=tools,
            memory=memory,
            verbose=True,
            return_intermediate_steps=False,
            max_iterations=int(os.getenv("AGENT_MAX_ITERATIONS", "3"))
        )

    def _build_llm(self):
        temperature = float(os.getenv("TEMPERATURE"))
        base_url = os.getenv("OPENAI_API_BASE")
        if self._is_azure(base_url):
            return AzureChatOpenAI(deployment_name=os.getenv("AZURE_DEPLOYMENT_NAME"), temperature=temperature,
                                   verbose=True, streaming=True)
        else:
            return ChatOpenAI(model_name=os.getenv("MODEL_NAME"), temperature=temperature, verbose=True, streaming=True)

    @staticmethod
    def _is_azure(base_url: str) -> bool:
        return base_url and ".openai.azure.com" in base_url

    def start_session(self):
        self._memory.chat_memory.add_user_message("this is my locale: " + self._session.locales[0])

    def transcript(self, audio_file_path: str) -> str:
        base_url = os.getenv("OPENAI_WHISPER_API_BASE", os.getenv("OPENAI_API_BASE"))
        api_key = os.getenv("OPENAI_WHISPER_API_KEY", os.getenv("OPENAI_API_KEY"))
        api_version = os.getenv("OPENAI_WHISPER_API_VERSION", os.getenv("OPENAI_API_VERSION"))
        deployment_name = os.getenv("AZURE_WHISPER_DEPLOYMENT_NAME", os.getenv("AZURE_DEPLOYMENT_NAME"))
        client = AzureOpenAI(azure_endpoint=base_url, api_version=api_version, api_key=api_key,
                             azure_deployment=deployment_name) \
            if self._is_azure(base_url) else OpenAI(base_url=base_url, api_key=api_key)
        locale = self._session.locales[0]
        lang_separator_pos = locale.find("-")
        language = locale[0:lang_separator_pos] if lang_separator_pos >= 0 else locale
        ret = client.audio.transcriptions.create(model="whisper-1", file=open(audio_file_path, 'rb'),
                                                 language=language)
        return ret.text

    async def ask(self, question: str) -> AsyncIterator[AgentFlow | str]:
        callback = AsyncIteratorCallbackHandler()
        task = asyncio.create_task(self._agent.arun(input=question, callbacks=[callback]))
        resp = ""
        async for token in callback.aiter():
            resp += token
            yield token
        ret = await task
        # when using tools, tokens are not passed to the callback handler, so we need to get the
        # response directly from the agent run call
        if ret != resp:
            if ret.startswith("{\"steps\":"):
                try:
                    yield AgentFlow.model_validate_json(ret)
                except Exception:
                    logging.exception("Error parsing agent response")
                    yield ret
            else:
                yield ret
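The `contact_abstracta` tool above returns an `AgentFlow` serialized as JSON, which the extension replays as browser actions. A dependency-free sketch of that payload shape, using plain dicts instead of the pydantic models (purely illustrative):

```python
import json


def build_contact_flow(full_name: str) -> str:
    # Mirrors the steps contact_abstracta returns, without pydantic
    steps = [
        {"action": "goto", "value": "https://abstracta.us"},
        {"action": "click", "selector": 'xpath://a[@href="./contact-us"]'},
        {"action": "fill", "selector": "#fullname", "value": full_name},
        {"action": "message", "value": "I have filled the contact form with your name."},
    ]
    return json.dumps({"steps": steps})


payload = json.loads(build_contact_flow("Jane Doe"))
print(payload["steps"][2]["value"])  # Jane Doe
```

Note that the real `ask()` method recognizes such payloads by the `{"steps":` prefix and yields them as an `AgentFlow` instead of plain text.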

agent-upcamp/gpt_agent/api.py

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
import logging
import os
import traceback
from typing import AsyncIterator, Annotated, Optional

from fastapi import Depends, FastAPI, HTTPException, status, Request
from fastapi.responses import FileResponse, StreamingResponse, Response
from fastapi.templating import Jinja2Templates
from pydantic import BaseModel
from sse_starlette.sse import ServerSentEvent

from gpt_agent.agent import Agent, AgentAction
from gpt_agent.auth import get_current_user
from gpt_agent.domain import Session, Question, TranscriptionQuestion, SessionBase
from gpt_agent.file_system_repos import SessionsRepository, QuestionsRepository, TranscriptionsRepository

logging.basicConfig()
logger = logging.getLogger("gpt_agent")
logger.level = logging.DEBUG
logging.getLogger().level = logging.DEBUG

app = FastAPI()
assets_path = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'assets')
templates = Jinja2Templates(directory=assets_path)
sessions_repo = SessionsRepository()
questions_repo = QuestionsRepository()
transcriptions_repo = TranscriptionsRepository()


@app.get('/manifest.json')
async def get_manifest(request: Request) -> Response:
    return templates.TemplateResponse("manifest.json", {
        "request": request,
        "openid_url": os.getenv("MANIFEST_OPENID_URL", os.getenv("OPENID_URL")),
        "openid_client_id": os.getenv("OPENID_CLIENT_ID"),
        "openid_scope": os.getenv("OPENID_SCOPE"),
        "contact_email": os.getenv("CONTACT_EMAIL")
    }, media_type='application/json')


@app.get('/logo.png')
async def get_logo() -> FileResponse:
    return FileResponse(os.path.join(assets_path, 'logo.png'))


@app.post('/sessions', status_code=status.HTTP_201_CREATED)
async def create_session(req: SessionBase, user: Annotated[str, Depends(get_current_user)]) -> Session:
    ret = Session(**req.model_dump(), user=user)
    await sessions_repo.save_session(ret)
    Agent(ret).start_session()
    return ret


class QuestionRequest(BaseModel):
    question: Optional[str] = ""


@app.post('/sessions/{session_id}/questions')
async def answer_question(
        session_id: str, req: QuestionRequest, user: Annotated[str, Depends(get_current_user)]) -> StreamingResponse:
    session = await _find_session(session_id, user)
    # This copilot uses response streaming, which allows users to start getting a response as soon
    # as possible. This is particularly important when interacting with LLMs that support response
    # streaming and may take some time to finish answering a given request.
    # If you don't want to use response streaming, you can just return a pydantic object like in
    # the create session endpoint.
    return StreamingResponse(agent_response_stream(req, session), media_type="text/event-stream")


async def _find_session(session_id: str, user: str) -> Session:
    ret = await sessions_repo.find_session(session_id)
    if not ret or ret.user != user:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND,
                            detail=f'session {session_id} not found')
    return ret


async def agent_response_stream(req: QuestionRequest, session: Session) -> AsyncIterator[bytes]:
    try:
        answer_stream = Agent(session).ask(req.question)
        complete_answer = ""
        async for token in answer_stream:
            if isinstance(token, str):
                complete_answer = complete_answer + token
                yield ServerSentEvent(data=token).encode()
            else:
                complete_answer = complete_answer + token.model_dump_json()
                yield ServerSentEvent(event="flow", data=token.model_dump_json()).encode()
        ret = Question(question=req.question, answer=complete_answer, session=session)
        await questions_repo.save_question(ret)
    except Exception as e:
        traceback.print_exception(e)
        yield ServerSentEvent(event="error").encode()


class TranscriptionRequest(BaseModel):
    file: Optional[str] = ""


class TranscriptionResponse(BaseModel):
    text: str


@app.post('/sessions/{session_id}/transcriptions')
async def answer_transcription(session_id: str, req: TranscriptionRequest, user: Annotated[str, Depends(get_current_user)]) -> TranscriptionResponse:
    session = await _find_session(session_id, user)
    ret = TranscriptionQuestion(base64=req.file, session=session)
    audio_file_path = await transcriptions_repo.save_audio(ret)
    text = Agent(session).transcript(audio_file_path)
    return TranscriptionResponse(text=text)
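`agent_response_stream` frames each token as a server-sent event via `sse_starlette`, with plain tokens as unnamed events and flows/errors as named ones. A simplified sketch of that wire format (`field: value` lines terminated by a blank line; exact separators may differ from sse_starlette's actual output):

```python
from typing import Optional


def sse_event(data: str, event: Optional[str] = None) -> bytes:
    # An optional "event:" field selects a named event type ("flow", "error");
    # the trailing blank line terminates the event
    out = ""
    if event is not None:
        out += f"event: {event}\n"
    out += f"data: {data}\n\n"
    return out.encode()


print(sse_event("hola", event="flow").decode())
```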
1.94 KB

agent-upcamp/gpt_agent/assets/manifest.json

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
{
  "id": "upcamp",
  "name": "UpCamp Agent",
  {% if openid_url %}
  "auth": {
    "url": "{{ openid_url }}",
    "clientId": "{{ openid_client_id }}",
    "scope": "{{ openid_scope }}"
  },
  {% endif %}
  "capabilities": [ "transcripts" ],
  "welcomeMessage": "¡Hola! Soy tu asistente de UpCamp.\n\nPuedo ayudarte a mejorar código, documentar APIs, explicar conceptos técnicos, generar casos de prueba, analizar errores y superar obstáculos en tus proyectos.\n\nRecuerda que tipeando / puedes acceder a los prompts.\n\n¿En qué te puedo asistir hoy?",
  "prompts": [
    { "name" : "Mejorar Código", "text" : "Explícame este fragmento de código y cómo podría mejorarlo: ${input}" },
    { "name" : "Documentación API", "text" : "Necesito crear una documentación clara para esta API: ${input}. ¿Puedes ayudarme con una estructura y ejemplos?" },
    { "name" : "Explicar Concepto", "text" : "¿Puedes explicarme de manera simple cómo funciona ${input} y darme un ejemplo práctico?" },
    { "name" : "Generar Casos de Prueba", "text" : "Necesito escribir casos de prueba para esta función: ${input}. ¿Puedes ayudarme con ejemplos de casos de prueba que cubran diferentes escenarios utilizando la técnica de clases de equivalencia?" },
    { "name" : "Analizar Error", "text" : "Estoy recibiendo este error: ${input}. ¿Puedes ayudarme a entender qué significa y cómo solucionarlo?" },
    { "name" : "Ayuda con Bloqueos", "text" : "Estoy atascado tratando de implementar ${input} con el enfoque actual que tengo. ¿Qué alternativas o mejores prácticas me sugieres?" }
  ],
  "contactEmail": "{{ contact_email }}"
}
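The `{% if openid_url %}` block above means the `auth` section only appears in the rendered manifest when `OPENID_URL` is configured. A plain-Python sketch of that conditional rendering (not the Jinja2 template itself; parameter names mirror the template's variables):

```python
import json


def render_manifest(openid_url=None, openid_client_id=None, openid_scope=None, contact_email=""):
    # The auth block is emitted only when openid_url is set, just like the
    # {% if openid_url %} guard in the template
    manifest = {"id": "upcamp", "name": "UpCamp Agent"}
    if openid_url:
        manifest["auth"] = {"url": openid_url, "clientId": openid_client_id, "scope": openid_scope}
    manifest["capabilities"] = ["transcripts"]
    manifest["contactEmail"] = contact_email
    return json.dumps(manifest)


print("auth" in json.loads(render_manifest()))  # False
```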
