PublicProviderConf

An automated tool that fetches AI model information from various providers (PPInfra, OpenRouter, OpenAI, Google, etc.) and generates standardized JSON files for easy consumption by chatbots (such as DeepChat) and other applications.

✨ Features

  • 🤖 Standardized Format: Unified JSON output format for easy chatbot parsing
  • 🔄 Auto Detection: Intelligent detection of model capabilities (vision, function calling, reasoning)
  • 🌐 Multi-Provider Support: Extensible architecture supporting multiple AI model providers
  • ⚡ Concurrent Fetching: Efficient concurrent data retrieval from multiple providers
  • 🎯 Aggregated Output: Generate both individual provider files and complete aggregated files
  • 🚀 GitHub Actions: Automated scheduled updates for model information

📄 Available Model Data

Access the latest AI model information in JSON format:

  • All Providers Combined: all.json - Complete aggregated data from all providers
  • OpenAI: openai.json - OpenAI models with comprehensive template matching (65+ models including GPT-5, o1/o3/o4 series)
  • Anthropic: anthropic.json - Anthropic Claude models (8 models including Opus 4.1)
  • PPInfra: ppinfra.json - PPInfra provider models
  • OpenRouter: openrouter.json - OpenRouter provider models
  • Google Gemini: gemini.json - Google Gemini API models with web-scraped details
  • Vercel AI Gateway: vercel.json - Vercel AI Gateway hosted models
  • GitHub AI Models: github_ai.json - GitHub AI Models marketplace
  • Tokenflux: tokenflux.json - Tokenflux marketplace models
  • Groq: groq.json - Groq hosted models (API key required)
  • DeepSeek: deepseek.json - DeepSeek models with documentation parsing

📦 Installation

Prerequisites

  • Node.js 18+
  • pnpm (recommended) or npm

Install Dependencies

git clone https://github.com/ThinkInAIXYZ/PublicProviderConf.git
cd PublicProviderConf
pnpm install

Build (Vite)

pnpm build

This runs two Vite builds: the library bundle (build/index.mjs and build/index.cjs) and the CLI (build/cli.js).

🚀 Usage

Basic Usage

Fetch model information from all providers:

pnpm start

Specify output directory:

pnpm start -o ./output

Fetch from specific providers:

node build/cli.js fetch-providers -p openai,anthropic,ppinfra,openrouter

Development Mode

pnpm run dev
# or run specific commands directly
ts-node src/cli.ts fetch-providers -p openai,anthropic

CLI Options

# Fetch from all providers
pnpm start fetch-all [OPTIONS]

# Fetch from specific providers
pnpm start fetch-providers -p <PROVIDERS> [OPTIONS]

Options:
  -o, --output <OUTPUT>    Output directory [default: dist]
  -c, --config <CONFIG>    Config file path [default: config/providers.toml]
  -h, --help              Show help information

📋 Output Format

Individual Provider JSON

{
  "provider": "ppinfra",
  "providerName": "PPInfra", 
  "lastUpdated": "2025-01-15T10:30:00Z",
  "models": [
    {
      "id": "deepseek/deepseek-v3.1",
      "name": "Deepseek V3.1",
      "contextLength": 163840,
      "maxTokens": 163840,
      "vision": false,
      "functionCall": true,
      "reasoning": true,
      "type": "chat",
      "description": "DeepSeek-V3.1 latest model with mixed reasoning modes..."
    }
  ]
}

Aggregated JSON (all.json)

{
  "version": "1.0.0",
  "generatedAt": "2025-01-15T10:30:00Z",
  "totalModels": 38,
  "providers": {
    "ppinfra": {
      "providerId": "ppinfra",
      "providerName": "PPInfra",
      "models": [
        {
          "id": "deepseek/deepseek-v3.1",
          "name": "Deepseek V3.1",
          "contextLength": 163840,
          "maxTokens": 163840,
          "vision": false,
          "functionCall": true,
          "reasoning": true,
          "type": "chat",
          "description": "DeepSeek-V3.1 latest model..."
        }
      ]
    }
  }
}
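
For reference, here is a minimal TypeScript sketch of these two shapes. The interface names are illustrative, not part of a published API; only the field names come from the examples above.

interface ModelInfo {
  id: string;
  name: string;
  contextLength: number;
  maxTokens: number;
  vision: boolean;
  functionCall: boolean;
  reasoning: boolean;
  type: string;          // e.g. "chat"
  description: string;
}

interface ProviderFile {
  provider: string;      // provider id, e.g. "ppinfra"
  providerName: string;
  lastUpdated: string;   // ISO 8601 timestamp
  models: ModelInfo[];
}

interface AggregatedFile {
  version: string;
  generatedAt: string;   // ISO 8601 timestamp
  totalModels: number;
  providers: Record<string, {
    providerId: string;
    providerName: string;
    models: ModelInfo[];
  }>;
}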

🔧 Configuration

Provider Configuration (Optional)

Step 1: Create your configuration file

# Copy the example configuration file
cp config/providers.toml.example config/providers.toml

# Edit with your settings
nano config/providers.toml  # or use your preferred editor

Step 2: Configuration file format

# config/providers.toml
[providers.ppinfra]
api_url = "https://api.ppinfra.com/openai/v1/models"
rate_limit = 10
timeout = 30

[providers.openrouter]
api_url = "https://openrouter.ai/api/v1/models"
rate_limit = 5

[providers.gemini]
api_url = "https://generativelanguage.googleapis.com/v1beta/openai/models"
api_key_env = "GEMINI_API_KEY"  # or use api_key = "your-key"
rate_limit = 10
timeout = 60

[providers.groq]
api_url = "https://api.groq.com/openai/v1/models"
api_key_env = "GROQ_API_KEY"
rate_limit = 10
timeout = 30

🔒 Security Note: The actual config/providers.toml file is ignored by git to prevent accidental API key commits. Always use the example file as a template.

API Key Configuration

The tool supports flexible API key configuration with multiple methods and clear priority ordering:

Configuration Methods

Method 1: Environment Variables (Recommended)

# Only for providers that require API keys
export GEMINI_API_KEY="your-key-here"    # Optional for Gemini (enhances model list)
export GROQ_API_KEY="your-key-here"      # Required for Groq

Method 2: Configuration File

# First, copy the example configuration
cp config/providers.toml.example config/providers.toml
# config/providers.toml (ignored by git for security)
[providers.gemini]
api_url = "https://generativelanguage.googleapis.com/v1beta/openai/models"
# Option A: Use default environment variable
api_key_env = "GEMINI_API_KEY"
# Option B: Use custom environment variable name
# api_key_env = "MY_CUSTOM_GEMINI_KEY" 
# Option C: Direct API key (not recommended for production)
# api_key = "your-gemini-key-here"

[providers.groq]
api_url = "https://api.groq.com/openai/v1/models"
api_key_env = "GROQ_API_KEY"
# Or use direct API key (not recommended)
# api_key = "your-groq-key-here"

API Key Priority (High to Low)

  1. Direct API key in config file (api_key field)
  2. Environment variable specified in config (api_key_env field)
  3. Default environment variable (e.g., GEMINI_API_KEY)

This allows you to:

  • Use environment variables for security (recommended)
  • Override per-environment using config files
  • Mix different approaches for different providers
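
A minimal sketch of that resolution order (the function is hypothetical; the api_key and api_key_env fields mirror the TOML shown above):

interface ProviderKeyConfig {
  api_key?: string;      // 1. direct key in the config file
  api_key_env?: string;  // 2. env var named in the config
}

// Returns the first key found, following the priority order above.
function resolveApiKey(config: ProviderKeyConfig, defaultEnvVar: string): string | undefined {
  if (config.api_key) return config.api_key;
  if (config.api_key_env && process.env[config.api_key_env]) {
    return process.env[config.api_key_env];
  }
  return process.env[defaultEnvVar]; // 3. default, e.g. GEMINI_API_KEY
}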

Provider-Specific Notes

  • PPInfra: ✅ No API key required - uses public API
  • OpenRouter: ✅ No API key required - uses public model listing API
  • Vercel AI Gateway: ✅ No API key required - uses public AI Gateway API
  • GitHub AI Models: ✅ No API key required - uses public model listing API
  • Tokenflux: ✅ No API key required - uses public marketplace API
  • DeepSeek: ✅ No API key required - uses web scraping from documentation
  • Gemini: ⚠️ Optional API key - uses hybrid web scraping + API approach
  • Groq: ❌ API key required - private API access only
  • OpenAI: ❌ API key required - private API access only
  • Anthropic: ❌ API key required - private API access only

Gemini Provider Details

The Gemini provider implements a unique hybrid approach:

How It Works:

  1. API Call: Fetches model list from Gemini API (model names only)
  2. Web Scraping: Scrapes Google's documentation for detailed capabilities
  3. Data Merging: Combines API data with scraped metadata

Behavior by API Key Status:

  • With API Key: Complete model list from API + enriched capabilities from scraping
  • Without API Key: Model list and capabilities from web scraping + fallback known models

Why Hybrid? The official Gemini API only provides model names, so web scraping is always required to get comprehensive capability information (vision, function calling, reasoning, context lengths, etc.).
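
A simplified sketch of the merge logic under both modes. All helper functions are hypothetical stand-ins for the real implementation, and ModelInfo is the interface sketched in the Output Format section:

declare function fetchApiModelNames(apiKey: string): Promise<string[]>;
declare function scrapeDocumentedModels(): Promise<Map<string, ModelInfo>>;
declare function fallbackModelInfo(id: string): ModelInfo;

async function fetchGeminiModels(apiKey?: string): Promise<ModelInfo[]> {
  // Scraping always runs: it supplies the capability metadata.
  const scraped = await scrapeDocumentedModels();

  if (!apiKey) {
    // Without a key, the scraped list (plus known fallbacks) is all we have.
    return [...scraped.values()];
  }

  // With a key, the API gives the authoritative list of model names...
  const names = await fetchApiModelNames(apiKey);
  // ...which we enrich with scraped capabilities where available.
  return names.map((id) => scraped.get(id) ?? fallbackModelInfo(id));
}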

DeepSeek Provider Details

The DeepSeek provider uses pure web scraping from the official DeepSeek API documentation:

How It Works:

  1. Documentation Scraping: Parses model tables from the pricing/models page
  2. Fallback Models: Uses known model definitions if scraping fails
  3. Capability Detection: Analyzes model descriptions for feature detection

Supported Models:

  • deepseek-chat: DeepSeek-V3.1 (Non-thinking Mode) with function calling support
  • deepseek-reasoner: DeepSeek-V3.1 (Thinking Mode) with advanced reasoning capabilities

Why Web Scraping? DeepSeek doesn't provide a public model listing API, so documentation parsing ensures we capture the latest model information and specifications.
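
A sketch of that scrape-with-fallback flow (the helper and the fallback constant are hypothetical; only the two model ids come from the list above):

declare function scrapeDeepSeekDocs(): Promise<ModelInfo[]>; // parses the pricing/models tables
declare const FALLBACK_MODELS: ModelInfo[];                  // deepseek-chat, deepseek-reasoner

async function fetchDeepSeekModels(): Promise<ModelInfo[]> {
  try {
    const models = await scrapeDeepSeekDocs();
    if (models.length > 0) return models; // scraping succeeded
  } catch {
    // ignore and fall through to the known definitions
  }
  return FALLBACK_MODELS;
}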

🤖 GitHub Actions Automation

The project includes a GitHub Actions workflow with multiple trigger methods:

🕰️ Automated Triggers

  • Code Changes: Pushes to the main branch that touch src/**, package.json, or the workflow file; results are committed directly to main
  • Release Tags: Automatically triggered by release-*.*.* tags

🖱️ Manual Triggers

  • Workflow Dispatch: Manual trigger with optional provider selection - Creates PR for review
  • Tag Release: Create and push a release-x.y.z tag for versioned releases

🔄 Update Mechanism

  • Manual/Workflow Dispatch: Creates a Pull Request for review and manual merge
  • Code Push Events: Direct commit to main branch (to avoid infinite loops)
  • Tag Events: No commits, only creates releases

📦 Release Types

Versioned Releases

# Create a versioned release
git tag release-1.0.0
git push origin release-1.0.0

# This will automatically:
# 1. Fetch latest model data
# 2. Generate JSON files
# 3. Create GitHub release with comprehensive assets
# 4. Upload individual provider archives

📄 Release Content

Each tagged release includes:

  • 📊 Total model count and provider statistics
  • 🕐 Generation timestamp
  • 📦 Complete package (provider-configs-{version}.tar.gz)
  • 🔗 Individual provider archives
  • 📋 Direct JSON file access
  • 💻 Integration examples

πŸ“ Project Structure

├── src/
│   ├── models/          # Data structure definitions
│   ├── providers/       # Provider implementations
│   ├── fetcher/         # Data fetching logic
│   ├── output/          # Output processing
│   └── config/          # Configuration management
├── dist/                # Generated JSON files
├── docs/                # Detailed documentation
└── .claude/             # Claude Code configuration

🔌 Adding New Providers

The system supports two provider implementation patterns:

Template-Based Providers (Recommended for providers with minimal API metadata)

  1. Create a template file in templates/{provider}.json:
[{
  "id": "model-id",
  "name": "Model Name",
  "contextLength": 128000,
  "maxTokens": 8192,
  "vision": true,
  "functionCall": true,
  "reasoning": false,
  "type": "chat",
  "description": "Model description",
  "match": ["model-id", "versioned-model-id", "alias"]
}]
  2. Implement the provider in src/providers/{provider}.ts:
class NewProvider implements Provider {
  async fetchModels(): Promise<ModelInfo[]> {
    // Load the curated templates and match them against the API response
    const templates = await this.loadTemplates();
    const apiModels = await this.fetchApiModels();

    const models: ModelInfo[] = [];
    for (const apiModel of apiModels) {
      const template = templates.get(apiModel.id);
      if (template) {
        // Known model: use the curated template data
        models.push(toModelInfo(template));
      } else {
        // Unknown model: fall back to auto-detected capabilities
        models.push(this.createAutoModel(apiModel.id));
      }
    }
    return models;
  }
}

Direct Conversion Providers (For APIs with rich metadata)

class NewProvider implements Provider {
  async fetchModels(): Promise<ModelInfo[]> {
    // The API already returns rich metadata, so convert each entry directly
    const response = await fetch(this.apiUrl);
    const data = await response.json();
    return data.models.map((m) => this.convertModel(m));
  }
}
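
In short, template-based providers trade a hand-maintained JSON file for reliable metadata when the upstream API is sparse, while direct conversion skips that file entirely when the API already returns rich metadata.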

For a detailed implementation guide, see the Provider Implementation Guide and the Architecture Documentation in docs/.

📊 Currently Supported Providers

  • ✅ PPInfra - 38+ models with reasoning, function calling, and vision capability detection
  • ✅ OpenRouter - 600+ models with comprehensive capability detection and metadata
  • ✅ Google Gemini - Gemini models with hybrid API + web scraping approach for complete metadata
  • ✅ Vercel AI Gateway - 200+ hosted models with pricing and capability information
  • ✅ GitHub AI Models - 50+ models from GitHub's AI marketplace
  • ✅ Tokenflux - 274+ marketplace models with detailed specifications
  • ✅ Groq - 22+ high-performance models (API key required)
  • ✅ DeepSeek - 2 models (deepseek-chat, deepseek-reasoner) with documentation parsing
  • ✅ OpenAI - 65+ models including GPT-5 series, o1/o3/o4 reasoning models, DALL-E, Whisper, TTS, embeddings with template matching
  • ✅ Anthropic - 8 Claude models (Opus 4.1, Opus 4, Sonnet 4, 3.7 Sonnet, 3.5 variants, Haiku) with API key support

🛠️ Development

Run Tests

pnpm test

Type Checking

pnpm run typecheck

Linting

pnpm run lint

Debug Mode

DEBUG=true pnpm run dev fetch-all

📄 Documentation

Detailed guides live in the docs/ directory (see the project structure above).

📈 Usage Examples

Chatbot Integration Example

// Fetch all models
const response = await fetch('https://raw.githubusercontent.com/ThinkInAIXYZ/PublicProviderConf/refs/heads/dev/dist/all.json');
const data = await response.json();

// Filter models that support function calling from all providers
const toolModels = [];
Object.values(data.providers).forEach(provider => {
  provider.models.forEach(model => {
    if (model.functionCall) {
      toolModels.push({...model, providerId: provider.providerId});
    }
  });
});

// Get models from specific provider
const ppinfraModels = data.providers.ppinfra?.models || [];

// Find models with reasoning capability across all providers
const reasoningModels = [];
Object.values(data.providers).forEach(provider => {
  provider.models.forEach(model => {
    if (model.reasoning) {
      reasoningModels.push({...model, providerId: provider.providerId});
    }
  });
});

Data Analysis

The generated JSON files can be used for:

  • 📊 Model capability statistical analysis (see the snippet below)
  • 🔍 Model search and filtering
  • 💰 Price comparison analysis
  • 📈 Model trend tracking
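
For instance, a quick capability tally over all.json, reusing the data object from the integration example above:

// Count models per provider that support each capability.
const stats = Object.entries(data.providers).map(([id, provider]) => ({
  provider: id,
  total: provider.models.length,
  vision: provider.models.filter((m) => m.vision).length,
  functionCall: provider.models.filter((m) => m.functionCall).length,
  reasoning: provider.models.filter((m) => m.reasoning).length,
}));
console.table(stats);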

🤝 Contributing

Issues and Pull Requests are welcome!

  1. Fork the project
  2. Create a feature branch
  3. Implement new features or fixes
  4. Submit a Pull Request

📝 License

MIT License

🙏 Acknowledgments

Thanks to all AI model providers for offering open API interfaces, making this project possible.
