📝 How Should the Relationship Between Models and Schemas Be Managed?

When building APIs, managing the relationship between models and schemas can get messy fast. This post breaks down the main approaches you can take. By the end, you'll know which pattern keeps things clean, consistent, and scalable.

📢 Attention! This post is co-authored by GPT-4o from OpenAI.

🔰 Definitions

Model: Classes designed to interact directly with the database (e.g., ORM/ODM objects).
Schema: Classes that define the structure of data for API communication (e.g., request/response bodies in FastAPI).

⚔️ The Dilemma

There’s a subtle but significant design question in API-based applications:

How should you relate your database models and your data schemas?

There are three primary strategies to choose from:

👹 1. Defining Schemas and Models Independently

Some developers prefer to write models and schemas completely independently. While this provides maximum flexibility, it introduces high maintenance costs:

Anytime a field is added, removed or changed in one, the other must be updated manually.
It creates potential unwanted inconsistencies between internal representation (model) and external representation (schema).

🦕 2. Conventional Approach: Models Inherit → Schemas

In this approach, you consider the model as the "source of truth", and build your schemas by inheriting from it (e.g., serializers depend on models in Django REST Framework).

Pros:

Less duplication
Schemas auto-inherit all model fields

Cons:

You inherit more than you want:
- Fields like _id (MongoDB/Beanie), created_at, updated_at, password_hash, etc., are included unless explicitly removed.
Schemas may inherit behavioral methods or internal database config unrelated to API communication.
You risk exposing sensitive fields unless you're meticulous with exclude and include flags.

Examples of such fields:

_id: Beanie or MongoDB document identifier
created_at, updated_at: Timestamps meant for internal use
user_id, password_hash: Sensitive or relational internals

🧪 Code Example

# models.py
class Log(Document):
    uid: str = Field(default_factory=lambda: str(uuid4()))
    user: str
    tenant: str | None = None
    log: dict | str = Field(default_factory=dict)
    metadata: dict = Field(default_factory=dict)
    tag: str | None = None
    level: Level = Level.NOTSET
    created_at: datetime = Field(default_factory=datetime.now)

    class Settings:
        name = "logs"

# schemas.py
class LogRetrieveSchema(Log):
    class Config:
        from_attributes = True
        fields = {
            "_id": {"exclude": True},
            "created_at": {"exclude": False},
        }

👎 Issues:

You have to exclude _id manually
Schema includes insert, save, and other Document methods — unnecessary for API payloads

⏪ 3. Reverse Approach: Schemas as the Base

This flips the direction: schemas are the "core" representations, and models inherit from them.

Pros:

Schemas stay clean, lean, and API-focused
Models extend schemas to add database logic

Cons:

Multiple schemas exist for different purposes (e.g., Create, Update, Retrieve) — which one becomes the base?
Models are tightly coupled to schema structure — a schema change might force unexpected model changes.
It conflicts with some ORM/ODM expectations, which want models to define structure top-down.

🧪 Code Example

# schemas.py
class BaseSchema(BaseModel):
    tenant: str | None = None
    log: dict | str = Field(default_factory=dict)
    metadata: dict = Field(default_factory=dict)
    tag: str | None = None
    level: Level = Level.NOTSET


class LogCreateSchema(BaseSchema):
    pass


class LogRetrieveSchema(BaseSchema):
    uid: str
    created_at: datetime

# models.py
class Log(Document, LogCreateSchema):
    uid: str = Field(default_factory=lambda: str(uuid4()))
    user: str
    created_at: datetime = Field(default_factory=datetime.now)

    class Settings:
        name = "logs"

👎 Problems:

LogCreateSchema was never meant to be reused like this
Changes to schemas for API purposes may unintentionally affect your model’s DB logic

🧩 My Preferred Approach: Shared Abstract Base

🌟 Solution: Define a Shared Base Model

Create an abstract BaseLog with common fields (business logic).
Inherit from BaseLog in both your Model and Schemas.
Add model-specific fields (like _id, created_at) in your Document class.
Add schema-specific fields (e.g., computed fields, serialization tweaks) in schema subclasses.

🧪 Code Example

# models.py
from beanie import Document
from pydantic import BaseModel, Field
from enum import Enum
from datetime import datetime
from uuid import uuid4


class Level(Enum):
    INFO = "INFO"
    TRACE = "TRACE"
    DEBUG = "DEBUG"
    WARNING = "WARNING"
    ERROR = "ERROR"
    CRITICAL = "CRITICAL"
    FATAL = "FATAL"
    NOTSET = "NOTSET"


class BaseLog(BaseModel):
    tenant: str | None = None
    log: dict | str = Field(default_factory=dict)
    metadata: dict = Field(default_factory=dict)
    tag: str | None = None
    level: Level = Level.NOTSET


class Log(Document, BaseLog):
    uid: str = Field(default_factory=lambda: str(uuid4()))
    user: str
    created_at: datetime = Field(default_factory=datetime.now)

    class Settings:
        name = "logs"

# schemas.py
from datetime import datetime
from models import BaseLog


class LogCreateSchema(BaseLog):
    pass


class LogRetrieveSchema(BaseLog):
    uid: str
    created_at: datetime

✅ Benefits:

Single source of shared field definitions
No schema bloating from internal model fields
Clear separation of concerns

🧠 Final Thoughts — The Final Takeaway

While each approach has pros and cons, I strongly recommend the “Shared Abstract Base” approach as the cleanest and most maintainable solution. This method balances:

Reuse of shared field definitions without duplications
Separation of concerns between internal database logic and external API schema
Avoidance of unintended exposure of sensitive fields or unwanted methods

If you want a single, practical pattern that scales well in real projects — this is the one I use and advocate for.

In summary: define a common abstract base with shared fields and have both your models and schemas inherit from it, adding their own specifics separately.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

📝 How Should the Relationship Between Models and Schemas Be Managed?

📢 Attention! This post is co-authored by GPT-4o from OpenAI.

🔰 Definitions

⚔️ The Dilemma

👹 1. Defining Schemas and Models Independently

🦕 2. Conventional Approach: Models Inherit → Schemas

🧪 Code Example

⏪ 3. Reverse Approach: Schemas as the Base

🧪 Code Example

🧩 My Preferred Approach: Shared Abstract Base

🌟 Solution: Define a Shared Base Model

🧪 Code Example

🧠 Final Thoughts — The Final Takeaway

FilesExpand file tree

model-schema-design.md

Latest commit

History

model-schema-design.md

File metadata and controls

📝 How Should the Relationship Between Models and Schemas Be Managed?

📢 Attention! This post is co-authored by GPT-4o from OpenAI.

🔰 Definitions

⚔️ The Dilemma

👹 1. Defining Schemas and Models Independently

🦕 2. Conventional Approach: Models Inherit → Schemas

🧪 Code Example

⏪ 3. Reverse Approach: Schemas as the Base

🧪 Code Example

🧩 My Preferred Approach: Shared Abstract Base

🌟 Solution: Define a Shared Base Model

🧪 Code Example

🧠 Final Thoughts — The Final Takeaway