California Procurement Agent

🎯 About

California Procurement Agent is an analytical platform for exploring California's state procurement data from eSCPRS, the electronic State Contract and Procurement Registration System. It combines Retrieval-Augmented Generation (RAG) with data visualization to provide insight into government spending patterns, suppliers, and procurement trends.

The platform features an AI-powered chat interface that allows users to ask natural language questions about procurement data and receive detailed text-based analysis, combined with comprehensive data exploration notebooks and interactive visualizations. The system processes real procurement data including purchase orders, supplier information, spending categories, and acquisition methods.

Key Data Insights:

Analysis of $X+ in state procurement spending
Examination of supplier diversity and qualification programs
Contract vs non-contract spending patterns
Department-wise spending analysis
CalCard (state credit card) usage patterns
Geographic distribution of procurement activities

Kaggle Notebook: For a detailed exploratory data analysis, check out our comprehensive notebook: California State Procurement EDA

✨ Features

AI-Powered Chat Interface:

Natural Language Queries: Ask questions about procurement data in plain English
Intelligent Agent: LangChain-powered agent with MongoDB query tools for complex data analysis
Intelligent Responses: Get contextual answers with proper formatting and data insights
Chat History: Save and manage conversation history with procurement analysis

Data Exploration & Visualization:

Interactive Charts: Plotly-powered visualizations for spending patterns, supplier analysis, and trends (available in Jupyter notebook)
Comprehensive EDA: Jupyter notebook with statistical analysis and data insights

Procurement Data Analysis:

Supplier Analysis: Identify top suppliers, qualification status, and diversity metrics
Spending Patterns: Analyze contract vs non-contract spending, CalCard usage
Department Insights: Compare spending across different state departments
Acquisition Methods: Understand procurement methods and their distribution

Technical Features:

MongoDB Integration: Efficient storage and querying of large procurement datasets
Data Normalization: Consistent field naming and data type handling
API Endpoints: RESTful APIs for data access and analysis
Modern UI: React-based frontend with responsive design

🚀 Technologies

The following tools and frameworks were used in this project:

Backend:
- FastAPI - High-performance web framework for APIs
- MongoDB - NoSQL database for procurement data storage
- Google Gemini - AI model for natural language processing
- OpenAI - AI model for text generation and understanding
- LangChain - Framework for LLM applications and agent orchestration
- Pandas - Data manipulation and analysis
- Plotly - Interactive data visualization
- Matplotlib - Static visualizations
- Seaborn - Statistical data visualization
Frontend:
- React - UI library for modern web applications
- Vite - Fast build tool and development server
- Axios - HTTP client for API communication
Data Analysis & Visualization:
- Plotly - Interactive charts and graphs
- Matplotlib - Static visualizations
- Seaborn - Statistical data visualization
- Jupyter - Interactive notebooks for data exploration
Data Processing:
- Kaggle API - Dataset downloading and management
- Custom data normalization and cleaning pipelines

Architecture

The application follows a modern data analysis architecture:

Data Layer:
- MongoDB for storing normalized procurement data, chat history, and user sessions
Processing Layer:
- Data loading and normalization from Kaggle datasets
- Field standardization and type conversion
- Indexing for efficient querying
Analysis Layer:
- Pandas-based data manipulation and analysis
- Statistical computations and aggregations
- Time-series analysis for procurement trends
AI Layer:
- LangChain-powered RAG system with intelligent agent
- Google Gemini for natural language understanding
- MongoDB query tools for data retrieval and analysis
- Custom prompts for procurement-specific queries
Visualization Layer:
- Plotly for interactive web-based charts
- Matplotlib/Seaborn for static analysis
- Jupyter notebooks for exploratory analysis
API Layer:
- FastAPI for RESTful endpoints
- Chat management and data querying APIs
Frontend Layer:
- React-based chat interface with text-based AI responses
- Data visualization available through Jupyter notebooks
- Responsive design for multiple devices

Pipeline

The procurement data analysis pipeline includes:

Data Acquisition: Download and load California procurement datasets from Kaggle
Data Normalization: Standardize field names, convert data types, handle missing values
Database Storage: Store processed data in MongoDB for efficient querying
Analysis Engine: Generate insights, statistics, and visualizations
AI Integration: Enable natural language queries about procurement data
User Interface: Provide chat interface and data exploration tools
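
The normalization step can be illustrated with a short sketch. This is not the project's actual loader: the field names (`Total Price`, `Creation Date`), the date format, and every helper name below are assumptions chosen for illustration, and only the Python standard library is used.

```python
import re
from datetime import datetime
from typing import Any, Optional


def normalize_field_name(name: str) -> str:
    """Convert a raw header such as 'Purchase Order Number' to snake_case."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip())
    return cleaned.strip("_").lower()


def parse_amount(value: Any) -> Optional[float]:
    """Turn a currency string such as '$1,234.50' into a float; None if missing or malformed."""
    if value is None:
        return None
    text = str(value).replace("$", "").replace(",", "").strip()
    if not text:
        return None
    try:
        return float(text)
    except ValueError:
        return None


def parse_date(value: Any, fmt: str = "%m/%d/%Y") -> Optional[datetime]:
    """Parse a date string; None for blank or malformed values (format is an assumption)."""
    if not value:
        return None
    try:
        return datetime.strptime(str(value).strip(), fmt)
    except ValueError:
        return None


def normalize_record(raw: dict) -> dict:
    """Standardize field names, then convert the (hypothetical) amount and date fields."""
    record = {normalize_field_name(key): value for key, value in raw.items()}
    if "total_price" in record:
        record["total_price"] = parse_amount(record["total_price"])
    if "creation_date" in record:
        record["creation_date"] = parse_date(record["creation_date"])
    return record
```

Returning `None` for missing or malformed values, rather than raising, keeps a single bad row from stopping a bulk load; whether the real pipeline does the same is not stated here.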

✅ Requirements

Before starting, ensure you have the following installed:

Python 3.11+
Node.js 16+ and npm
MongoDB (local or cloud instance)
Google Gemini API key
Kaggle API credentials (optional, for data updates)

🏁 Starting

# Clone this project
$ git clone https://github.com/romanyn36/california-procurement-agent.git

# Navigate to the project directory
$ cd california-procurement-agent

# Create a virtual environment and install dependencies (this project uses the uv package manager)
$ uv sync

# Activate the virtual environment
$ source .venv/bin/activate  # For Linux/Mac
$ .\.venv\Scripts\activate    # For Windows

# Set up environment variables
$ cp .env.example .env
# Edit .env with your API keys:
# MONGODB_URI=mongodb://localhost:27017/
# OPENAI_API_KEY=your_openai_api_key
# GEMINI_API_KEY=your_gemini_api_key
# KAGGLE_USERNAME=your_kaggle_username
# KAGGLE_KEY=your_kaggle_key

# Load the procurement data
# and populate the MongoDB database
$ python -m database.data_loader

# Start the backend server
$ python -m uvicorn app:app --reload --host 0.0.0.0 --port 8000

# In a separate terminal, navigate to the frontend directory
$ cd agent-frontend

# Install frontend dependencies
$ npm install

# Start the development server
$ npm run dev

# Access the application:
# Frontend: http://localhost:5173
# Backend API: http://localhost:8000
# Data Exploration: Open data_exploring.ipynb in Jupyter

Data Exploration

For detailed data analysis, visit my Kaggle notebook: California State Procurement EDA

The notebook includes:

Data loading and preprocessing
Statistical analysis
Interactive visualizations
Procurement insights and trends

💬 Example Queries

Tell me about the order with purchase order number REQ0011118
Show me all orders for laptops or computers
Who are the top 10 suppliers by total spending?
How many purchase orders were created in 2013?
What is the total spending across all departments?
What are the top 5 departments by number of orders?
Which suppliers have DVBE certification?
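
As an illustration of how a question such as "Who are the top 10 suppliers by total spending?" might translate into a database query, here is a minimal sketch. The field names `supplier_name` and `total_price` are assumptions, as are both function names; the agent's real tools in database/mongodb_tools.py may work differently. The second function is a pure-Python equivalent for checking the logic without a database.

```python
from collections import defaultdict


def top_suppliers_pipeline(limit: int = 10) -> list:
    """Build a MongoDB aggregation pipeline; field names are hypothetical."""
    return [
        # Sum spending per supplier
        {"$group": {"_id": "$supplier_name", "total_spending": {"$sum": "$total_price"}}},
        # Largest totals first
        {"$sort": {"total_spending": -1}},
        # Keep only the top N suppliers
        {"$limit": limit},
    ]


def top_suppliers_in_memory(orders: list, limit: int = 10) -> list:
    """Pure-Python equivalent of the pipeline above, for testing without MongoDB."""
    totals = defaultdict(float)
    for order in orders:
        # Missing or None amounts count as zero
        totals[order["supplier_name"]] += order.get("total_price") or 0.0
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [{"_id": name, "total_spending": total} for name, total in ranked[:limit]]
```

With PyMongo, a pipeline like this would typically be passed to `collection.aggregate(...)` on whichever collection holds the purchase orders.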

Configuration

Key configuration files:

.env: API keys and database connections
prompt_template.py: AI model prompts and field mappings
database/mongodb_tools.py: Database query functions
agent.py: LangChain agent configuration with MongoDB query tools
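
A minimal sketch of how the values in .env could be validated at startup follows. The `Settings` class and `load_settings` function are hypothetical and not part of the repository. Treating `MONGODB_URI` and `GEMINI_API_KEY` as required and the remaining keys as optional is an assumption based on the Requirements section. Note that a .env file is not read into the process environment automatically; some loader has to do that before this code runs.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in .env.example."""
    mongodb_uri: str
    gemini_api_key: str
    openai_api_key: Optional[str] = None
    kaggle_username: Optional[str] = None
    kaggle_key: Optional[str] = None


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping (the process environment by default) and fail early if required keys are missing."""
    required = ("MONGODB_URI", "GEMINI_API_KEY")  # assumed to be the required set
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return Settings(
        mongodb_uri=env["MONGODB_URI"],
        gemini_api_key=env["GEMINI_API_KEY"],
        # Empty strings are treated the same as unset values
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        kaggle_username=env.get("KAGGLE_USERNAME") or None,
        kaggle_key=env.get("KAGGLE_KEY") or None,
    )
```

Accepting the environment as a parameter makes the function easy to test with a plain dictionary instead of real credentials.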

🚧 What's Next?

Future enhancements planned:

Chat Visualizations: Add interactive charts and graphs directly in the chat interface to visualize procurement data insights alongside text responses
Advanced Analytics: Machine learning models for spending prediction

📝 License

This project is licensed under the MIT License. For more details, see the LICENSE file.

❤️ Contact Me

Made by Romani – an AI Engineer and Backend Developer. Feel free to reach out for collaborations, questions, or new projects! You can contact me via email: contact@romaninasrat.com