Commit 6e15fc9

Merge branch 'foss42:main' into main
2 parents c4bb335 + 8b12a72 commit 6e15fc9

File tree: 14 files changed (+704 −106 lines changed)

doc/dev_guide/packaging.md

Lines changed: 78 additions & 1 deletion
@@ -78,7 +78,84 @@ git push

## FlatHub (Flatpak)

Steps to generate a .flatpak package of API Dash:

1. Clone and build API Dash:

Follow the [How to run API Dash locally](setup_run.md) guide.

Stay in the root folder of the project directory.

2. Install the required packages (Debian/Ubuntu):

```bash
sudo apt install flatpak
flatpak install -y flathub org.flatpak.Builder
flatpak remote-add --if-not-exists --user flathub https://dl.flathub.org/repo/flathub.flatpakrepo
```

If you are using another Linux distro, install Flatpak with your distro's package manager, then follow the rest of the steps.
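For example, on Fedora- or Arch-based systems Flatpak can typically be installed as shown below (a rough sketch; the exact package name and command may differ on your distro):

```bash
# Fedora / RHEL-based distros
sudo dnf install flatpak

# Arch-based distros
sudo pacman -S flatpak

# Then add the Flathub remote and install the builder, as above
flatpak remote-add --if-not-exists --user flathub https://dl.flathub.org/repo/flathub.flatpakrepo
flatpak install -y flathub org.flatpak.Builder
```
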
3. Build the API Dash project:

```bash
flutter build linux --release
```

4. Create the Flatpak manifest file:

```bash
touch apidash-flatpak.yaml
```

In this file, add:

```yaml
app-id: io.github.foss42.apidash
runtime: org.freedesktop.Platform
runtime-version: "23.08"
sdk: org.freedesktop.Sdk

command: /app/bundle/apidash
finish-args:
  - --share=ipc
  - --socket=fallback-x11
  - --socket=wayland
  - --device=dri
  - --socket=pulseaudio
  - --share=network
  - --filesystem=home
modules:
  - name: apidash
    buildsystem: simple
    build-commands:
      - cp -a build/linux/x64/release/bundle /app/bundle
    sources:
      - type: dir
        path: .
```

5. Create the .flatpak file:

```bash
flatpak run org.flatpak.Builder --force-clean --sandbox --user --install --install-deps-from=flathub --ccache --mirror-screenshots-url=https://dl.flathub.org/media/ --repo=repo builddir apidash-flatpak.yaml

flatpak build-bundle repo apidash.flatpak io.github.foss42.apidash
```

The apidash.flatpak file should now be in the project root folder.

To test it:

```bash
flatpak install --user apidash.flatpak

flatpak run io.github.foss42.apidash
```

To uninstall it:

```bash
flatpak uninstall io.github.foss42.apidash
```

## Homebrew

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@

# AI API Eval Framework For Multimodal Generative AI

## Personal Information
- **Full Name:** Nideesh Bharath Kumar
- **University Name:** Rutgers University–New Brunswick
- **Program Enrolled In:** B.S. Computer Science, Artificial Intelligence Track
- **Year:** Junior Year (Third Year)
- **Expected Graduation Date:** May 2026

## About Me
I’m **Nideesh Bharath Kumar**, a junior (third year) at Rutgers University–New Brunswick pursuing a **B.S. in Computer Science on the Artificial Intelligence Track**. I have a strong foundation in full-stack development and AI engineering, with project and internship experience in **Dart/Flutter, LangChain, RAG, vector databases, AWS, Docker, Kubernetes, PostgreSQL, FastAPI, OAuth,** and other technologies for building scalable, AI-powered systems. I have interned at **Manomay Tech, IDEA, and Newark Science and Sustainability**, developing scalable systems and managing AI systems, and completed fellowships with **Google** and **Codepath** that strengthened my technical skills. I have also won hackathon awards, including **Overall Best Project at the CS Base Climate Hackathon for a Flutter-based project** and **Best Use of Terraform at the HackRU Hackathon for a Computer Vision Smart Shopping Cart**. I’m passionate about building distributed, scalable systems and AI technologies, and API Dash is an amazing tool that facilitates building such solutions through easy visualization and testing of APIs; I believe my skills in **AI development** and experience with **Dart/Flutter** and **APIs** put me in a position to contribute effectively to this project.

## Project Details
**Project Title:** AI API Eval Framework For Multimodal Generative AI

**Description:**
This project will develop a **Dart-centered evaluation framework** that simplifies testing generative AI models across **multiple modalities (text, image, code)** by integrating existing evaluation toolkits: **llm-harness** for text, **torch-fidelity** and **CLIP** for images, and **HumanEval/MBPP** with **CodeBLEU** for code. The framework will provide a unified config layer supporting both standard and custom benchmark datasets and evaluation metrics, exposed through a **user-friendly interface in API Dash** that lets the user select the model type, manage datasets (local or downloadable), and choose evaluation metrics (standard toolkit or custom script). On top of this, **real-time visual analytics** will visualize metric progress, and evaluations will run with **parallelized batch processing**.

**Related Issue:** [#618](https://github.com/foss42/apidash/issues/618)

**Key Features:**
1) Unified Evaluation Configuration:
- A config file in YAML will serve as the abstraction layer, generated from the user's selection of model type, dataset, and evaluation metrics. Based on this config, the job is routed to llm-harness, to torch-fidelity and CLIP, or to HumanEval and MBPP with CodeBLEU. Custom evaluation scripts and datasets can also be attached to this config file and interpreted by the system (a minimal config sketch is shown after this feature list).
- This abstraction layer ensures that, however the specifications of an eval job differ, everything is routed to the correct resources while still providing a centralized layer for creating the job. Furthermore, these config files can be stored in history for re-running the same jobs later.

2) Intuitive User Interface:
- When starting an evaluation, users can select the model type (text, image, or code) through a drop-down menu. The system will provide a list of standard datasets and use cases. The user can select these datasets or attach a custom one; if the dataset is not available locally in the workspace, it can be attached via the file explorer or downloaded from the web. Furthermore, the user can select standard evaluation metrics from a list or attach a custom script.

3) Standard Evaluation Pipelines:
- The standard evaluation pipelines cover text, image, and code generation.
- For text generation, llm-harness will be used with custom datasets and tasks to measure Precision, Recall, F1 Score, BLEU, ROUGE, and Perplexity. Custom datasets and evaluation scores can be integrated through the llm-harness custom task config file.
- For image generation, torch-fidelity can be used to calculate Fréchet Inception Distance and Inception Score by comparing against a reference image database. For text-to-image generation, CLIP scores can be used to measure alignment between the prompt and the generated image. Custom datasets and evaluation scores can be integrated through a custom interface written in Dart.
- For code generation, tests like HumanEval and MBPP can be used for functional correctness, and CodeBLEU can be used for code quality checking. Custom integration works the same way as for image generation, with a custom Dart interface for functional test databases and evaluation metrics.

4) Batch Evaluations:
- Parallel processing will be supported via async runs of the tests, with a progress bar in API Dash tracking the number of processed rows (see the Dart sketch after the Architecture section).

5) Visualizations of Results:
- Visualizations will be provided while the tests are running, giving live feedback on model performance, along with a summary set of visualizations after all evals have finished.
- Bar Graphs: displayed on a 0–100% accuracy scale for a quick performance comparison across all tested models.
- Line Charts: show performance trends over time, comparing model performance across batches as well as between models.
- Tables: provide detailed summary statistics of scores for each model across different benchmarks and datasets.
- Box Plots: show the distribution of scores per batch, highlighting outliers and variance, with side-by-side comparisons between models.

6) Offline and Online Support:
- Offline: offline models will be supported by pointing to the script used to run the model and to locally stored datasets.
- Online: these models can be connected for evaluation through an API endpoint, and datasets can be downloaded from a provided link.

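To make the unified configuration idea concrete, here is a minimal sketch of what such a YAML eval config could look like; every field name below is an illustrative assumption, not a finalized schema.

```yaml
# Illustrative eval job config; field names are assumptions, not a final schema.
model_type: text                  # text | image | code
model:
  endpoint: https://api.example.com/v1/generate   # or a local script path for offline models
dataset:
  source: local                   # local | download
  path: datasets/qa_benchmark.jsonl
metrics:
  standard: [bleu, rouge, perplexity]
  custom_script: scripts/my_metric.py             # optional custom evaluation script
batch:
  size: 32
  parallel_workers: 4
output:
  save_to_history: true           # keep the config so the same job can be re-run later
```
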
**Architecture:**
1) UI Interface: Built with Dart/Flutter
2) Configuration Manager: Built with Dart, uses YAML for the config file
3) Dataset Manager: Built with Dart, REST APIs for accessing endpoints
4) Evaluation Manager: Built with a Dart–Python layer to manage connections between evaluators and API Dash
5) Batch Processing: Built with Dart async requests
6) Visualization and Results: Built with Dart/Flutter, using packages like fl_chart and syncfusion_flutter_charts
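
As a rough illustration of how the Evaluation Manager (Dart–Python layer) and Batch Processing pieces could fit together, here is a hedged Dart sketch that fans evaluation batches out to a Python evaluator process and gathers JSON results; the script name `run_eval.py` and the `EvalResult` class are hypothetical, not existing API Dash code.

```dart
import 'dart:convert';
import 'dart:io';

// Hypothetical result holder for one evaluation batch.
class EvalResult {
  EvalResult(this.batchId, this.metrics);
  final int batchId;
  final Map<String, dynamic> metrics;
}

// Each batch shells out to the (hypothetical) Python evaluation layer.
Future<EvalResult> runBatch(int batchId, String configPath) async {
  final result = await Process.run(
    'python3',
    ['run_eval.py', '--config', configPath, '--batch', '$batchId'],
  );
  if (result.exitCode != 0) {
    throw Exception('Batch $batchId failed: ${result.stderr}');
  }
  final metrics = jsonDecode(result.stdout as String) as Map<String, dynamic>;
  return EvalResult(batchId, metrics);
}

Future<void> main() async {
  // Run batches concurrently; completion counts could drive the progress bar.
  final results = await Future.wait(
    List.generate(4, (i) => runBatch(i, 'eval_config.yaml')),
  );
  for (final r in results) {
    print('batch ${r.batchId}: ${r.metrics}');
  }
}
```
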
Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@

### Initial Idea Submission

**Full Name:** April Lin
**University Name:** University of Illinois Urbana-Champaign
**Program (Degree & Major/Minor):** Master's in Electrical and Computer Engineering
**Year:** First year
**Expected Graduation Date:** 2026

**Project Title:** API Explorer
**Relevant Issues:** [https://github.com/foss42/apidash/issues/619](https://github.com/foss42/apidash/issues/619)

**Idea Description:**

I have divided the design of the API Explorer into three major steps:

1. **Designing the UI**
2. **Designing the API template model**
3. **Using AI tools to automatically extract API information from a given website**

---

## 1. UI Design (User Journey)

In this step, I primarily designed two interfaces for the API Explorer: the first is the main API Explorer interface, and the second is a detailed interface for each API template.

### API Explorer
![api explorer](images/API_Explorer_Main.png)
1. **Accessing the API Explorer**
   - In the left-hand sidebar, users will find an “API Explorer” icon.
   - Clicking this icon reveals the main API template search interface on the right side of the screen.

2. **Browsing API Templates**
   - At the top of the main area, there is a search bar that supports fuzzy matching by API name.
   - Directly beneath the search bar are category filters (e.g., AI, Finance, Web3, Social Media).
   - Users can click “More” to view an expanded list of all available categories.
   - The page displays each template in a **card layout**, showing the API’s name, a short description, and (optionally) an image or icon (a rough Flutter sketch of such a card follows this list).

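A minimal Flutter sketch of one explorer card is shown below; the widget name and fields are hypothetical illustrations, not existing API Dash code.

```dart
import 'package:flutter/material.dart';

// Hypothetical card widget for one API template in the explorer grid.
class ApiTemplateCard extends StatelessWidget {
  const ApiTemplateCard({
    super.key,
    required this.name,
    required this.description,
    this.iconUrl,
    required this.onTap,
  });

  final String name;
  final String description;
  final String? iconUrl;
  final VoidCallback onTap; // navigates to the template's detail page

  @override
  Widget build(BuildContext context) {
    return Card(
      child: InkWell(
        onTap: onTap,
        child: Padding(
          padding: const EdgeInsets.all(12),
          child: Column(
            crossAxisAlignment: CrossAxisAlignment.start,
            children: [
              if (iconUrl != null) Image.network(iconUrl!, height: 32),
              const SizedBox(height: 8),
              Text(name, style: Theme.of(context).textTheme.titleMedium),
              const SizedBox(height: 4),
              Text(description, maxLines: 2, overflow: TextOverflow.ellipsis),
            ],
          ),
        ),
      ),
    );
  }
}
```
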
### API Templates
![api explorer](images/API_Explorer_Template.png)
1. **Selecting a Template**
   - When a user clicks on a card (for example, **OpenAI**), they navigate to a dedicated page for that API template.
   - This page lists all the available API endpoints or methods in a collapsible/expandable format (e.g., “API name 2,” “API name 3,” etc.).
   - Each listed endpoint describes what it does; users can select which methods they want to explore or import into their workspace.

2. **Exploring an API Method**
   - Within this detailed view, users see request details such as **HTTP method**, **path**, **headers**, **body**, and **sample response**.
   - If the user wants to try out an endpoint, they can import it into their API collections by clicking **import**.
   - Each method will include all the fields parsed through the automated process. For the detailed API field design, please refer to **Step Two**.

---

## 2. Updated Table Design

Below is the model design for the API Explorer.

### **Base Table: `api_templates`**
- **Purpose:**
  Stores the common properties for all API templates, regardless of their type.

- **Key Fields:**
  - **id**:
    - Primary key (integer or UUID) for unique identification.
  - **name**:
    - The API name (e.g., “OpenAI”).
  - **api_type**:
    - Enumerated string indicating the API type (e.g., `restful`, `graphql`, `soap`, `grpc`, `sse`, `websocket`).
  - **base_url**:
    - The base URL or service address (applicable for HTTP-based APIs and used as host:port for gRPC).
  - **image**:
    - A text or string field that references an image (URL or path) representing the API’s logo or icon.
  - **category**:
    - A field (array or string) used for search and classification (e.g., "finance", "ai", "devtool").
  - **description**:
    - Textual description of the API’s purpose and functionality.

### **RESTful & GraphQL Methods Table: `api_methods`**
- **Purpose:**
  Manages detailed configurations for individual API requests/methods, specifically tailored for RESTful and GraphQL APIs.

- **Key Fields:**
  - **id**:
    - Primary key (UUID).
  - **template_id**:
    - Foreign key linking back to `api_templates`.
  - **method_type**:
    - The HTTP method (e.g., `GET`, `POST`, `PUT`, `DELETE`) or the operation type (`query`, `mutation` for GraphQL).
  - **method_name**:
    - A human-readable name for the method (e.g., “Get User List,” “Create Order”).
  - **url_path**:
    - The relative path appended to the `base_url` (for RESTful APIs).
  - **description**:
    - Detailed explanation of the method’s functionality.
  - **headers**:
    - A JSON field storing default header configurations (e.g., `Content-Type`, `Authorization`).
  - **authentication**:
    - A JSON field for storing default authentication details (e.g., Bearer Token, Basic Auth).
  - **query_params**:
    - A JSON field for any default query parameters (optional, typically for RESTful requests).
  - **body**:
    - A JSON field containing the default request payload, including required fields and default values.
  - **sample_response**:
    - A JSON field providing an example of the expected response for testing/validation.

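A possible relational sketch of these two tables (PostgreSQL-flavored; the column types and constraints are assumptions pending the database decision raised in the Questions section below):

```sql
-- Illustrative schema sketch; types and constraints are assumptions, not final.
CREATE TABLE api_templates (
    id          UUID PRIMARY KEY,
    name        TEXT NOT NULL,          -- e.g. "OpenAI"
    api_type    TEXT NOT NULL,          -- restful | graphql | soap | grpc | sse | websocket
    base_url    TEXT,
    image       TEXT,                   -- URL or path to the logo/icon
    category    TEXT[],                 -- e.g. {"ai", "finance"}
    description TEXT
);

CREATE TABLE api_methods (
    id              UUID PRIMARY KEY,
    template_id     UUID NOT NULL REFERENCES api_templates(id),
    method_type     TEXT NOT NULL,      -- GET/POST/... or query/mutation
    method_name     TEXT NOT NULL,
    url_path        TEXT,
    description     TEXT,
    headers         JSONB,
    authentication  JSONB,
    query_params    JSONB,
    body            JSONB,
    sample_response JSONB
);
```
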
---
## 3. Automated Extraction (Parser Design)

I think there are two ways to design the automated pipeline: the first is to use AI tools for automated parsing, and the second is to employ a rule-based approach.

### **AI-Based Parser**
- For each parser type (OpenAPI, HTML, Markdown), design a dedicated prompt agent to parse the API methods.
- The prompt includes model fields (matching the data structures from [Step Two](#2-updated-table-design)) and the required API category, along with the API URL to be parsed.
- The AI model is instructed to output the parsed result in **JSON format**, aligned with the schema defined in `api_templates` and `api_methods` (an illustrative example follows).

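For instance, the agent could be instructed to emit output shaped like the following (a hedged illustration of the target JSON, not a finalized contract; the OpenAI values are just an example):

```json
{
  "api_template": {
    "name": "OpenAI",
    "api_type": "restful",
    "base_url": "https://api.openai.com/v1",
    "category": ["ai"],
    "description": "REST API for OpenAI models"
  },
  "api_methods": [
    {
      "method_type": "POST",
      "method_name": "Create chat completion",
      "url_path": "/chat/completions",
      "headers": { "Content-Type": "application/json" },
      "authentication": { "type": "bearer" },
      "body": { "model": "gpt-4o", "messages": [] },
      "sample_response": { "choices": [] }
    }
  ]
}
```
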
### **Non-AI (Rule-Based) Parser**
- **OpenAPI**: Use existing libraries (e.g., Swagger/OpenAPI parser libraries) to read and interpret JSON or YAML specs (a rough Dart sketch follows this list).
- **HTML**: Perform DOM-based parsing or use regex patterns to identify endpoints, parameter names, and descriptions.
- **Markdown**: Utilize Markdown parsers (e.g., remark, markdown-it) to convert the text into a syntax tree and extract relevant sections.

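As a rough sketch of the rule-based path, the Dart snippet below walks an already-decoded OpenAPI JSON spec and extracts `api_methods`-style rows; it is a simplified assumption of how such a parser might start (no `$ref` resolution, no YAML handling), not a complete implementation.

```dart
import 'dart:convert';

// Extract api_methods-style rows from an OpenAPI spec given as a JSON string.
List<Map<String, dynamic>> extractMethods(String openApiJson) {
  const httpMethods = {'get', 'post', 'put', 'delete', 'patch', 'head', 'options'};
  final spec = jsonDecode(openApiJson) as Map<String, dynamic>;
  final paths = spec['paths'] as Map<String, dynamic>? ?? {};
  final rows = <Map<String, dynamic>>[];
  paths.forEach((path, opsRaw) {
    final ops = opsRaw as Map<String, dynamic>;
    ops.forEach((method, opRaw) {
      if (!httpMethods.contains(method)) return; // skip path-level keys like "summary"
      final op = opRaw as Map<String, dynamic>;
      rows.add({
        'method_type': method.toUpperCase(), // GET, POST, ...
        'method_name': op['summary'] ?? path,
        'url_path': path,
        'description': op['description'] ?? '',
        'query_params': (op['parameters'] as List<dynamic>? ?? [])
            .where((p) => (p as Map)['in'] == 'query')
            .toList(),
      });
    });
  });
  return rows;
}
```
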
## Questions

1. **Database Selection**
   - Which type of database should be used for storing API templates and methods? Are there any specific constraints or preferences (e.g., relational vs. NoSQL, performance requirements, ease of integration) we should consider?

2. **Priority of Automated Parsing**
   - What is the preferred approach for automated parsing of OpenAPI/HTML files? Would an AI-based parsing solution be acceptable, or should we prioritize rule-based methods for reliability and simplicity?

3. **UI Interaction Flow**
   - Can I add a dedicated “API Explorer” menu item to the left navigation bar for the API Explorer?

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@

# Initial Idea Submission

**Full Name:** Harsh Panchal

[**GitHub**](https://github.com/GANGSTER0910)
[**Website**](https://harshpanchal0910.netlify.app/)
[**LinkedIn**](https://www.linkedin.com/in/harsh-panchal-902636255)

**University Name:** Ahmedabad University, Ahmedabad
**Program:** BTech in Computer Science and Engineering
**Year:** Junior, 3rd Year
**Expected Graduation Date:** May 2026
**Location:** Gujarat, India
**Timezone:** Asia/Kolkata (UTC+5:30)

## **Project Title: AI API Eval Framework**

## **Relevant Issues: [#618](https://github.com/foss42/apidash/issues/618)**

## **Idea Description**
The goal of this project is to create an AI API Evaluation Framework that provides an end-to-end solution for comparing AI models on different kinds of data, i.e., text, images, and videos. The overall strategy is to use benchmark models to compare AI outputs with benchmark predictions. Metrics like BLEU, ROUGE, FID, and SSIM can also be used to perform an objective performance evaluation of models.

For the best user experience in both offline and online mode, the platform will provide an adaptive assessment framework where users can specify their own assessment criteria, giving flexibility across various use cases. There will be a model version control feature that lets users compare different versions of a model and monitor performance over time. For offline mode, evaluations will be supported using LoRA models, which reduce resource consumption and produce outputs without compromising accuracy. The system will use explainability integration with SHAP and LIME to show how input features influence model decisions.

The visualization dashboard, built using Flutter, will include real-time charts, error analysis, and result summarization, making it easy to analyze model performance. Whether offline with cached models or online with API endpoints, the framework will offer end-to-end testing.

With its ranking-based framework, model explainability, and configurable evaluation, this effort will be a powerful resource for researchers, developers, and organizations to make data-driven decisions on AI model selection and deployment.

## Unique Features
1) Benchmark-Based Ranking:
   - Compare and rank model results against pre-trained benchmark models.
   - Determine how well outputs resemble ideal predictions.
2) Advanced Evaluation Metrics:
   - Support metrics such as BLEU, ROUGE, FID, SSIM, and PSNR for extensive analysis.
   - Allow users to define custom metrics (see the sketch after this list).
3) Model Version Control:
   - Compare various versions of AI models.
   - Monitor improvement in performance over time with side-by-side comparison.
4) Explainability Integration:
   - Employ SHAP and LIME to explain model decisions.
   - Provide clear explanations of why some outputs rank higher.
5) Custom Evaluation Criteria:
   - Allow users to input custom evaluation criteria for domain-specific tasks.
6) Offline Mode with LoRA Models:
   - Achieve storage and execution efficiency with low-rank adaptation models.
   - Conduct offline evaluations with minimal hardware demands.
7) Real-Time Visualization:
   - Visualize evaluation results using interactive charts via Flutter.
   - Monitor performance trends and detect weak spots visually.

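One way the custom-metric hook from feature 5 could look on the Dart side is sketched below; all names (`EvalMetric`, `ExactMatch`) are hypothetical and only illustrate the plug-in idea.

```dart
// Hypothetical plug-in point for user-defined evaluation metrics.
abstract class EvalMetric {
  String get name;

  /// Scores one (reference, candidate) pair, e.g. in the range [0, 1].
  double score(String reference, String candidate);
}

// Toy example: exact-match accuracy implemented as a custom metric.
class ExactMatch implements EvalMetric {
  @override
  String get name => 'exact_match';

  @override
  double score(String reference, String candidate) =>
      reference.trim() == candidate.trim() ? 1.0 : 0.0;
}

// Average a metric over a list of [reference, candidate] pairs.
double averageScore(EvalMetric metric, List<List<String>> pairs) {
  if (pairs.isEmpty) return 0.0;
  final total = pairs
      .map((p) => metric.score(p[0], p[1]))
      .reduce((a, b) => a + b);
  return total / pairs.length;
}
```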
