Commit ff01ea0

feat: Add metadata models package with dynamic schema download
- Add airbyte_cdk.metadata_models package with auto-generated Pydantic models
- Models are generated from JSON schemas in airbytehq/airbyte repository
- Schemas are downloaded on-demand during build process (no submodules)
- Uses pydantic.v1 compatibility layer for consistency with declarative models
- Includes comprehensive README with usage examples
- Adds py.typed marker for type hint compliance

The metadata models enable validation of connector metadata.yaml files using the same schemas maintained in the main Airbyte repository.

Co-Authored-By: AJ Steers <[email protected]>
1 parent 6b747fe commit ff01ea0

35 files changed (+3400, -41 lines)
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
# Airbyte Metadata Models

This package contains Pydantic models for validating Airbyte connector `metadata.yaml` files.

## Overview

The models are automatically generated from JSON Schema YAML files maintained in the [airbytehq/airbyte](https://github.com/airbytehq/airbyte) repository at:

```
airbyte-ci/connectors/metadata_service/lib/metadata_service/models/src/
```

During the CDK build process (`poetry run poe build`), these schemas are downloaded from GitHub and used to generate Pydantic models via `datamodel-code-generator`.

## Usage

### Validating a metadata.yaml file

```python
from pathlib import Path

import yaml

from airbyte_cdk.metadata_models import ConnectorMetadataDefinitionV0

# Load metadata.yaml
metadata_path = Path("path/to/metadata.yaml")
metadata_dict = yaml.safe_load(metadata_path.read_text())

# Validate using Pydantic
try:
    metadata = ConnectorMetadataDefinitionV0(**metadata_dict)
    print("✓ Metadata is valid!")
except Exception as e:
    print(f"✗ Validation failed: {e}")
```

### Accessing metadata fields

```python
from airbyte_cdk.metadata_models import ConnectorMetadataDefinitionV0

# metadata_dict is loaded as in the previous example
metadata = ConnectorMetadataDefinitionV0(**metadata_dict)

# Access fields with full type safety
print(f"Connector: {metadata.data.name}")
print(f"Docker repository: {metadata.data.dockerRepository}")
print(f"Docker image tag: {metadata.data.dockerImageTag}")
print(f"Support level: {metadata.data.supportLevel}")
```

### Available models

The main model is `ConnectorMetadataDefinitionV0`, which includes nested models for:

- `ConnectorType` - Source or destination
- `ConnectorSubtype` - API, database, file, etc.
- `SupportLevel` - Community, certified, etc.
- `ReleaseStage` - Alpha, beta, generally_available
- `ConnectorBreakingChanges` - Breaking change definitions
- `ConnectorReleases` - Release information
- `AllowedHosts` - Network access configuration
- And many more...

## Regenerating Models

Models are regenerated automatically when you run:

```bash
poetry run poe build
```

This command:

1. Downloads the latest schema YAML files from the airbyte repository
2. Generates Pydantic models using `datamodel-code-generator`
3. Outputs models to `airbyte_cdk/metadata_models/generated/`

## Schema Source

The authoritative schemas are maintained in the [airbyte monorepo](https://github.com/airbytehq/airbyte/tree/master/airbyte-ci/connectors/metadata_service/lib/metadata_service/models/src).

Any changes to metadata validation should be made there, and will be automatically picked up by the CDK build process.
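For contexts where the generated models are unavailable (for example, a quick script without the CDK installed), the field access shown in the README can be approximated with a plain-dict presence check. This is a minimal, hypothetical sketch and not part of the package; the field names come from the README examples above.

```python
from typing import List

# Hypothetical fallback: check a metadata dict for the `data` fields the
# README examples access, without the generated Pydantic models.
REQUIRED_DATA_FIELDS = ("name", "dockerRepository", "dockerImageTag", "supportLevel")


def missing_metadata_fields(metadata_dict: dict) -> List[str]:
    """Return the names of expected `data` fields absent from the dict."""
    data = metadata_dict.get("data") or {}
    return [field for field in REQUIRED_DATA_FIELDS if field not in data]


metadata_dict = {
    "data": {
        "name": "Example Source",
        "dockerRepository": "airbyte/source-example",
        "dockerImageTag": "0.1.0",
    }
}
print(missing_metadata_fields(metadata_dict))  # ['supportLevel']
```

Unlike the Pydantic models, this checks only key presence, not types or nested structure, so it is a smoke test at best.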
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
from .generated import *
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
# generated by datamodel-codegen:
# filename: ActorDefinitionResourceRequirements.yaml

from __future__ import annotations

from enum import Enum
from typing import List, Optional

from pydantic.v1 import BaseModel, Extra, Field


class ResourceRequirements(BaseModel):
    class Config:
        extra = Extra.forbid

    cpu_request: Optional[str] = None
    cpu_limit: Optional[str] = None
    memory_request: Optional[str] = None
    memory_limit: Optional[str] = None


class JobType(Enum):
    get_spec = "get_spec"
    check_connection = "check_connection"
    discover_schema = "discover_schema"
    sync = "sync"
    reset_connection = "reset_connection"
    connection_updater = "connection_updater"
    replicate = "replicate"


class JobTypeResourceLimit(BaseModel):
    class Config:
        extra = Extra.forbid

    jobType: JobType
    resourceRequirements: ResourceRequirements


class ActorDefinitionResourceRequirements(BaseModel):
    class Config:
        extra = Extra.forbid

    default: Optional[ResourceRequirements] = Field(
        None,
        description="if set, these are the requirements that should be set for ALL jobs run for this actor definition.",
    )
    jobSpecific: Optional[List[JobTypeResourceLimit]] = None
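The `default`/`jobSpecific` split above suggests an overlay: a consumer would start from the default requirements and let a matching job-specific entry override them. The merge logic below is a hypothetical sketch (the actual platform behavior is not defined in this package), using plain dicts in place of the Pydantic models.

```python
from typing import Optional

# Hypothetical consumer logic: resolve effective resource requirements for a
# job type by overlaying a matching jobSpecific entry on the `default` block.
def resolve_requirements(actor_reqs: dict, job_type: str) -> Optional[dict]:
    """Merge default requirements with any jobSpecific override."""
    effective = dict(actor_reqs.get("default") or {})
    for limit in actor_reqs.get("jobSpecific") or []:
        if limit["jobType"] == job_type:
            # Non-null job-specific values win over the defaults.
            effective.update(
                {k: v for k, v in limit["resourceRequirements"].items() if v is not None}
            )
    return effective or None


actor_reqs = {
    "default": {"cpu_request": "0.5", "memory_limit": "1Gi"},
    "jobSpecific": [
        {"jobType": "sync", "resourceRequirements": {"memory_limit": "2Gi"}}
    ],
}
print(resolve_requirements(actor_reqs, "sync"))
# {'cpu_request': '0.5', 'memory_limit': '2Gi'}
```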
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# generated by datamodel-codegen:
# filename: AirbyteInternal.yaml

from __future__ import annotations

from enum import Enum
from typing import Optional

from pydantic.v1 import BaseModel, Extra, Field


class Sl(Enum):
    integer_0 = 0
    integer_100 = 100
    integer_200 = 200
    integer_300 = 300


class Ql(Enum):
    integer_0 = 0
    integer_100 = 100
    integer_200 = 200
    integer_300 = 300
    integer_400 = 400
    integer_500 = 500
    integer_600 = 600


class AirbyteInternal(BaseModel):
    class Config:
        extra = Extra.allow

    sl: Optional[Sl] = None
    ql: Optional[Ql] = None
    isEnterprise: Optional[bool] = False
    requireVersionIncrementsInPullRequests: Optional[bool] = Field(
        True,
        description="When false, version increment checks will be skipped for this connector",
    )
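The `Sl` and `Ql` enums above are int-valued: `datamodel-code-generator` names each member `integer_<value>`, and a parsed integer from `metadata.yaml` is resolved by value lookup. A stdlib-only sketch of the same pattern (redefining `Ql` inline so the snippet stands alone):

```python
from enum import Enum


class Ql(Enum):  # mirrors the generated Ql enum above
    integer_0 = 0
    integer_100 = 100
    integer_200 = 200
    integer_300 = 300
    integer_400 = 400
    integer_500 = 500
    integer_600 = 600


# Value lookup, e.g. for an integer parsed from metadata.yaml
ql = Ql(400)
print(ql.name, ql.value)  # integer_400 400
```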
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
# generated by datamodel-codegen:
# filename: AllowedHosts.yaml

from __future__ import annotations

from typing import List, Optional

from pydantic.v1 import BaseModel, Extra, Field


class AllowedHosts(BaseModel):
    class Config:
        extra = Extra.allow

    hosts: Optional[List[str]] = Field(
        None,
        description="An array of hosts that this connector can connect to. AllowedHosts not being present for the source or destination means that access to all hosts is allowed. An empty list here means that no network access is granted.",
    )
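The field description above encodes a tri-state: `hosts` absent means all hosts are allowed, an empty list means none, and otherwise only the listed hosts. A minimal sketch of that rule, assuming exact string matching (the real platform may support wildcards or other matching rules):

```python
from typing import List, Optional


def is_host_allowed(hosts: Optional[List[str]], host: str) -> bool:
    """Tri-state check from the AllowedHosts field description."""
    if hosts is None:  # AllowedHosts not present: access to all hosts
        return True
    return host in hosts  # empty list grants no access


print(is_host_allowed(None, "api.example.com"))                 # True
print(is_host_allowed([], "api.example.com"))                   # False
print(is_host_allowed(["api.example.com"], "api.example.com"))  # True
```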
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# generated by datamodel-codegen:
# filename: ConnectorBreakingChanges.yaml

from __future__ import annotations

from datetime import date
from enum import Enum
from typing import Any, Dict, List, Optional

from pydantic.v1 import AnyUrl, BaseModel, Extra, Field, constr


class DeadlineAction(Enum):
    auto_upgrade = "auto_upgrade"
    disable = "disable"


class StreamBreakingChangeScope(BaseModel):
    class Config:
        extra = Extra.forbid

    scopeType: Any = Field("stream", const=True)
    impactedScopes: List[str] = Field(
        ...,
        description="List of streams that are impacted by the breaking change.",
        min_items=1,
    )


class BreakingChangeScope(BaseModel):
    __root__: StreamBreakingChangeScope = Field(
        ...,
        description="A scope that can be used to limit the impact of a breaking change.",
    )


class VersionBreakingChange(BaseModel):
    class Config:
        extra = Extra.forbid

    upgradeDeadline: date = Field(
        ...,
        description="The deadline by which to upgrade before the breaking change takes effect.",
    )
    message: str = Field(
        ..., description="Descriptive message detailing the breaking change."
    )
    deadlineAction: Optional[DeadlineAction] = Field(
        None, description="Action to do when the deadline is reached."
    )
    migrationDocumentationUrl: Optional[AnyUrl] = Field(
        None,
        description="URL to documentation on how to migrate to the current version. Defaults to ${documentationUrl}-migrations#${version}",
    )
    scopedImpact: Optional[List[BreakingChangeScope]] = Field(
        None,
        description="List of scopes that are impacted by the breaking change. If not specified, the breaking change cannot be scoped to reduce impact via the supported scope types.",
        min_items=1,
    )


class ConnectorBreakingChanges(BaseModel):
    class Config:
        extra = Extra.forbid

    __root__: Dict[constr(regex=r"^\d+\.\d+\.\d+$"), VersionBreakingChange] = Field(
        ...,
        description="Each entry denotes a breaking change in a specific version of a connector that requires user action to upgrade.",
        title="ConnectorBreakingChanges",
    )
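The `__root__` mapping above constrains its keys with `constr(regex=r"^\d+\.\d+\.\d+$")`: plain three-part version strings, with no `v` prefix or pre-release suffix. The same key check with the stdlib `re` module (the sample breaking-change entry is illustrative, not from the source):

```python
import re

# Same pattern the generated model applies to __root__ keys.
SEMVER_KEY = re.compile(r"^\d+\.\d+\.\d+$")

breaking_changes = {
    "2.0.0": {
        "upgradeDeadline": "2024-01-01",
        "message": "Example breaking change entry (hypothetical).",
    }
}

for version in breaking_changes:
    assert SEMVER_KEY.match(version), f"invalid version key: {version}"

print(bool(SEMVER_KEY.match("1.2.3")), bool(SEMVER_KEY.match("1.2")))  # True False
```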
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# generated by datamodel-codegen:
# filename: ConnectorBuildOptions.yaml

from __future__ import annotations

from typing import Optional

from pydantic.v1 import BaseModel, Extra


class ConnectorBuildOptions(BaseModel):
    class Config:
        extra = Extra.forbid

    baseImage: Optional[str] = None
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
# generated by datamodel-codegen:
# filename: ConnectorIPCOptions.yaml

from __future__ import annotations

from enum import Enum
from typing import List

from pydantic.v1 import BaseModel, Extra


class SupportedSerializationEnum(Enum):
    JSONL = "JSONL"
    PROTOBUF = "PROTOBUF"
    FLATBUFFERS = "FLATBUFFERS"


class SupportedTransportEnum(Enum):
    STDIO = "STDIO"
    SOCKET = "SOCKET"


class DataChannel(BaseModel):
    class Config:
        extra = Extra.forbid

    version: str
    supportedSerialization: List[SupportedSerializationEnum]
    supportedTransport: List[SupportedTransportEnum]


class ConnectorIPCOptions(BaseModel):
    class Config:
        extra = Extra.forbid

    dataChannel: DataChannel
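A data channel declares the serializations and transports a connector supports, which implies some negotiation against what a platform can consume. The preference-ordered intersection below is a hypothetical sketch of such a negotiation, not documented CDK or platform behavior; the enum is redefined inline so the snippet stands alone.

```python
from enum import Enum
from typing import List, Optional


class SupportedSerializationEnum(Enum):  # mirrors the generated enum above
    JSONL = "JSONL"
    PROTOBUF = "PROTOBUF"
    FLATBUFFERS = "FLATBUFFERS"


def pick_serialization(
    platform_prefs: List[SupportedSerializationEnum],
    connector_supported: List[SupportedSerializationEnum],
) -> Optional[SupportedSerializationEnum]:
    """Return the first platform-preferred serialization the connector supports."""
    for candidate in platform_prefs:
        if candidate in connector_supported:
            return candidate
    return None


chosen = pick_serialization(
    [SupportedSerializationEnum.PROTOBUF, SupportedSerializationEnum.JSONL],
    [SupportedSerializationEnum.JSONL],
)
print(chosen)  # SupportedSerializationEnum.JSONL
```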
