Add model metadata fields to support pydantic-ai Popular Models page generation

## Context

This issue originates from a discussion about creating a "Popular Models" documentation page in pydantic-ai:

https://github.com/pydantic/pydantic-ai/pull/3941/files/e4a2d6d62b9a93857ce16f043b396ef3a158fd3d#diff-1ded77cd29f9a9a64cb328ffe489182b010aa1734d22bf62262efd1bfdac6995

The goal is to auto-generate this page from `genai-prices` data rather than maintaining it manually.

**Key feedback from the discussion:**
- Samuel suggested adding fields to `genai-prices` so the document can be generated from that data
- For maintainability, each entry should include the date it was last updated
- The page needs to list every full `provider:model` string and link entries for the same model across providers

## Research: Industry Standards for LLM Model Metadata

We researched how other companies/libraries handle LLM model metadata. **There is no universal industry standard**, but several de facto patterns have emerged.

### Sources Investigated
- [OpenRouter API](https://openrouter.ai/docs/api/reference/overview) - Most comprehensive aggregator schema
- [LiteLLM model_prices_and_context_window.json](https://github.com/BerriAI/litellm/blob/main/model_prices_and_context_window.json) - Widely adopted pricing database
- [LiteLLM Add Model Pricing Docs](https://docs.litellm.ai/docs/provider_registration/add_model_pricing)
- [AWS Bedrock GetFoundationModel API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_GetFoundationModel.html)
- [Hugging Face Model Cards](https://huggingface.co/docs/hub/model-cards)
- [Anthropic List Models API](https://platform.claude.com/docs/en/api/models/list)
- [OpenAI Models API](https://platform.openai.com/docs/api-reference/models/list)

### Findings: Common Field Names Across Sources

| Concept | OpenRouter | LiteLLM | Anthropic | Bedrock | HuggingFace |
|---------|------------|---------|-----------|---------|-------------|
| Identifier | `id` | key name | `id` | `modelId` | repo name |
| Display name | `name` | - | `display_name` | `modelName` | - |
| Context length | `context_length` | `max_input_tokens` | - | - | - |
| Input modalities | `input_modalities` | `supports_vision` etc. | - | `inputModalities` | - |
| Output modalities | `output_modalities` | - | - | `outputModalities` | - |
| Provider | in ID | `litellm_provider` | - | `providerName` | - |
| Deprecation | - | `deprecation_date` | - | `modelLifecycle` | `new_version` |
| Mode/Task | `modality` | `mode` | - | - | `pipeline_tag` |

### Key Gaps in Existing Standards
1. **No universal model identity** - The same model has different IDs across providers
2. **Pricing not in APIs** - Most providers don't expose pricing via API
3. **Capabilities vary** - No standard for declaring what features a model supports

## Decision: Align with LiteLLM Conventions

LiteLLM has the most widely adopted schema and already covers most edge cases. We'll align field naming where possible.
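As a rough illustration of what aligning with LiteLLM could look like, the sketch below converts one LiteLLM-style entry into the nested `modalities` / `capabilities` shape proposed in this issue. The function `from_litellm_entry` and the two mapping tables are hypothetical, and the flag names are only the ones cited in this issue; they have not been checked against the current LiteLLM JSON file.

```python
from typing import Any

# Flag names as cited in this issue (assumption: not verified against LiteLLM).
INPUT_MODALITY_FLAGS = {
    "supports_vision": "image",
    "supports_audio_input": "audio",
}
CAPABILITY_FLAGS = {
    "supports_function_calling": "function_calling",
    "supports_streaming": "streaming",
}


def from_litellm_entry(entry: dict[str, Any]) -> dict[str, Any]:
    """Convert one LiteLLM-style entry into the nested shape proposed here."""
    inputs = ["text"]  # assumption: every listed model accepts text input
    inputs += [name for flag, name in INPUT_MODALITY_FLAGS.items() if entry.get(flag)]
    # Only copy capability flags that are actually present, so "unknown"
    # stays distinguishable from an explicit False.
    capabilities = {
        name: bool(entry[flag])
        for flag, name in CAPABILITY_FLAGS.items()
        if flag in entry
    }
    return {
        "max_output_tokens": entry.get("max_output_tokens"),
        "modalities": {"input": inputs},
        "capabilities": capabilities,
        "deprecation_date": entry.get("deprecation_date"),
    }


example = {
    "max_output_tokens": 32000,
    "supports_vision": True,
    "supports_function_calling": True,
}
converted = from_litellm_entry(example)
```

Keeping the mapping in two small tables means a renamed or newly added LiteLLM flag only needs a one-line change.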

### Fields to Add to `ModelInfo`

| Field | Type | Description | LiteLLM Equivalent |
|-------|------|-------------|-------------------|
| `max_output_tokens` | `int \| None` | Maximum output tokens (separate from context) | `max_output_tokens` |
| `modalities` | `Modalities \| None` | Input/output modality support | `supports_vision`, `supports_audio_input`, etc. |
| `capabilities` | `Capabilities \| None` | Feature flags (function calling, streaming, etc.) | `supports_function_calling`, `supports_streaming`, etc. |
| `release_date` | `date \| str \| None` | Model release date (`YYYY-MM` or `YYYY-MM-DD`) | None (new field) |
| `deprecation_date` | `date \| None` | Date on which the model is deprecated | `deprecation_date` |
| `canonical_model` | `str \| None` | Reference to canonical model for cross-provider entries | Similar to HF `base_model` |
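The table above could translate into types roughly like the following. This is a minimal sketch using plain dataclasses so it runs without dependencies; the real `ModelInfo` in `prices_types.py` may be structured differently. `ModelMetadata`, the field layout of `Modalities` and `Capabilities`, and the helper `parse_release_date` are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import Optional, Union


@dataclass
class Modalities:
    # Assumption: text-only is the default when nothing is declared.
    input: list[str] = field(default_factory=lambda: ["text"])
    output: list[str] = field(default_factory=lambda: ["text"])


@dataclass
class Capabilities:
    # None means "unknown", which is different from an explicit False.
    function_calling: Optional[bool] = None
    streaming: Optional[bool] = None
    prompt_caching: Optional[bool] = None


def parse_release_date(value: str) -> Union[date, str]:
    """Return a date for YYYY-MM-DD and the original string for YYYY-MM."""
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return date.fromisoformat(value)
    if re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", value):
        return value
    raise ValueError(f"expected YYYY-MM or YYYY-MM-DD, got {value!r}")


@dataclass
class ModelMetadata:
    """Hypothetical stand-in holding only the fields proposed in this issue."""

    max_output_tokens: Optional[int] = None
    modalities: Optional[Modalities] = None
    capabilities: Optional[Capabilities] = None
    release_date: Union[date, str, None] = None
    deprecation_date: Optional[date] = None
    canonical_model: Optional[str] = None


opus = ModelMetadata(
    max_output_tokens=32000,
    modalities=Modalities(input=["text", "image"], output=["text"]),
    capabilities=Capabilities(function_calling=True, streaming=True, prompt_caching=True),
    release_date=parse_release_date("2025-11"),
)
```

Accepting both `date` and `str` for `release_date` keeps month-only values such as `2025-11` representable without inventing a day.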

### Example Schema

```yaml
# In anthropic.yml (canonical source)
- id: claude-opus-4-5
  name: Claude Opus 4.5
  description: Premium model combining maximum intelligence with practical performance
  context_window: 200000
  max_output_tokens: 32000
  modalities:
    input: [text, image]
    output: [text]
  capabilities:
    function_calling: true
    streaming: true
    prompt_caching: true
  release_date: 2025-11
  prices:
    input_mtok: 5
    output_mtok: 25

# In aws.yml (cross-provider reference)
- id: us.anthropic.claude-opus-4-5-20251101-v1:0
  canonical_model: anthropic:claude-opus-4-5
  match:
    contains: claude-opus-4-5
  prices:
    input_mtok: 5
    output_mtok: 25
```
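To show how `canonical_model` could drive the cross-provider links on the generated page, here is a small sketch that groups every full `provider:model` string under its canonical entry. The `providers` dict is a hypothetical in-memory view of the YAML files (the provider key `aws` is assumed from the filename), and `group_by_canonical` is not an existing function in `genai-prices`.

```python
from collections import defaultdict

# Hypothetical in-memory view of the provider YAML files, keyed by provider id.
providers = {
    "anthropic": [{"id": "claude-opus-4-5", "name": "Claude Opus 4.5"}],
    "aws": [
        {
            "id": "us.anthropic.claude-opus-4-5-20251101-v1:0",
            "canonical_model": "anthropic:claude-opus-4-5",
        }
    ],
}


def group_by_canonical(providers):
    """Map each canonical `provider:model` string to every string that serves it."""
    groups = defaultdict(list)
    for provider_id, models in providers.items():
        for model in models:
            full_id = f"{provider_id}:{model['id']}"
            # Entries without canonical_model are treated as their own canonical source.
            canonical = model.get("canonical_model") or full_id
            groups[canonical].append(full_id)
    return dict(groups)


groups = group_by_canonical(providers)
```

Each key in `groups` would correspond to one row on the page, with the listed strings rendered as the available `provider:model` identifiers.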

### What NOT to Add
- **Performance ratings** (Cost/Speed/Intelligence) - Subjective, belongs in pydantic-ai docs as editorial content
- **Tokenizer info** - Too technical for this use case
- **Supported regions** - Provider-specific, not model metadata

## Next Steps

- [ ] Add new fields to `ModelInfo` in `prices/src/prices/prices_types.py`
- [ ] Update JSON schema generation
- [ ] Add `modalities` and `capabilities` to popular models in provider YAML files (start with Anthropic, OpenAI, Google, xAI)
- [ ] Add `release_date` to popular models
- [ ] Add `canonical_model` references to cross-provider entries (AWS Bedrock, Google Vertex AI)
- [ ] Update `data.json` output to include new fields
- [ ] Document the new fields in CLAUDE.md or a schema doc
