# Evaluations

#TODO: OpenAI evals documentation: https://platform.openai.com/docs/guides/evals

## LLM Output Evaluator

The `evals` script evaluates the outputs of Large Language Models (LLMs) and estimates the associated token usage and cost.
It supports batch evaluation via a configuration CSV and produces a detailed metrics report.
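
A rough sketch of that flow is below; it is not the `evals` script itself. The configuration columns (`model`, `prompt`, `output`), the file names, and the per-1K-token prices are all illustrative assumptions — only the general pattern (read a config CSV, count tokens, write a metrics CSV) mirrors the description above.

```python
# Minimal sketch of batch token/cost accounting, assuming a config CSV with
# hypothetical columns: model, prompt, output. Prices are placeholders.
import pandas as pd
import tiktoken

PLACEHOLDER_PRICE_PER_1K = {"input": 0.01, "output": 0.03}  # illustrative only

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by many OpenAI chat models

config = pd.read_csv("eval_config.csv")  # hypothetical batch configuration file

rows = []
for _, row in config.iterrows():
    prompt_tokens = len(enc.encode(str(row["prompt"])))
    output_tokens = len(enc.encode(str(row["output"])))
    cost = (prompt_tokens * PLACEHOLDER_PRICE_PER_1K["input"]
            + output_tokens * PLACEHOLDER_PRICE_PER_1K["output"]) / 1000
    rows.append({
        "model": row["model"],
        "prompt_tokens": prompt_tokens,
        "output_tokens": output_tokens,
        "estimated_cost_usd": round(cost, 6),
    })

pd.DataFrame(rows).to_csv("eval_metrics.csv", index=False)
```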

This script evaluates LLM outputs using the `lighteval` library:
https://huggingface.co/docs/lighteval/en/metric-list#automatic-metrics-for-generative-tasks
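
For intuition, here is a plain-Python stand-in for two of the simpler automatic metrics of the kind listed there (exact match and a token-overlap F1). This is not the lighteval API — it only shows the kind of score being reported.

```python
# Plain-Python stand-ins for two simple automatic metrics (exact match and
# token-overlap F1); illustrative only, not the lighteval API.
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Paris", "Paris"))                             # 1.0
print(token_f1("the capital is Paris", "Paris is the capital"))  # 1.0 (full token overlap)
```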

##TODO: Use uv to execute scripts without manually managing environments: https://docs.astral.sh/uv/guides/scripts/
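
A sketch of what that could look like, assuming the evaluator lives in a file named `evals.py` (the filename and dependency list are assumptions): declare dependencies inline with PEP 723 metadata and let `uv run` build the environment on demand.

```sh
# With PEP 723 inline metadata at the top of evals.py, e.g.
#
#   # /// script
#   # dependencies = ["lighteval", "openai", "pandas"]
#   # ///
#
# a single command resolves the environment and runs the script:
uv run evals.py

# Or pull dependencies in ad hoc without inline metadata:
uv run --with lighteval --with openai evals.py
```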

Ensure you have the `lighteval` library and any model SDKs (e.g., OpenAI) configured properly.
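
A minimal setup sketch, assuming the OpenAI SDK is the only model backend in use (lighteval may need extras for other backends):

```sh
pip install lighteval openai

# The OpenAI SDK reads the key from this environment variable
export OPENAI_API_KEY="sk-..."
```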
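
The commands and script below rebuild `formatted_chunks.csv` from a local Postgres restore of the database dump: restore the backup with Postgres.app's command-line tools, load the `api_embeddings` table with SQLAlchemy and pandas, then format, sort, and group the chunks before writing them to CSV.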
```sh
# Put Postgres.app's CLI tools (createdb, pg_restore) on the PATH (macOS / zsh)
echo 'export PATH="/Applications/Postgres.app/Contents/Versions/latest/bin:$PATH"' >> ~/.zshrc
source ~/.zshrc

# Create a local database and restore the dump into it
createdb backupDBBalancer07012025
pg_restore -v -d backupDBBalancer07012025 ~/Downloads/backupDBBalancer07012025.sql

# Python dependencies for the export script below
pip install psycopg2-binary sqlalchemy pandas
```

```python
from sqlalchemy import create_engine
import pandas as pd

# Standard psycopg2 connection to the restored database
engine = create_engine("postgresql://sahildshah@localhost:5432/backupDBBalancer07012025")

# Load the embeddings table into a DataFrame
query = "SELECT * FROM api_embeddings;"
df = pd.read_sql(query, engine)

# Build one "ID: ... | CONTENT: ..." line per chunk
df['formatted_chunk'] = df.apply(lambda row: f"ID: {row['chunk_number']} | CONTENT: {row['text']}", axis=1)

# Sort so chunks are joined in chunk_number order within each (name, upload_file_id) group
df = df.sort_values(by=['name', 'upload_file_id', 'chunk_number'])
df_grouped = df.groupby(['name', 'upload_file_id'])['formatted_chunk'].apply(lambda chunks: "\n".join(chunks)).reset_index()
df_grouped = df_grouped.rename(columns={'formatted_chunk': 'concatenated_chunks'})
df_grouped.to_csv('~/Desktop/formatted_chunks.csv', index=False)
```

- Path where the evaluation results will be saved

import pandas as pd