Advanced Pydantic — Custom Types, Discriminated Unions and Settings

Pydantic’s advanced features solve real production problems: custom types that reuse validation logic across models, discriminated unions for APIs that accept multiple payload shapes, pydantic-settings for type-safe environment configuration, and TypeAdapter for validating values that are not full Pydantic models. These patterns appear throughout production FastAPI applications and in FastAPI’s own source code. Mastering them prepares you to handle the edge cases and complex requirements that real-world APIs always produce — and completes your Python foundation before the PostgreSQL, FastAPI, and React parts of this series.

Annotated Types — Reusable Validation

from typing import Annotated
from pydantic import BaseModel, Field, BeforeValidator, AfterValidator

# ── Reusable type aliases with constraints ────────────────────────────────────
PostTitle    = Annotated[str, Field(min_length=3, max_length=200)]
Email        = Annotated[str, Field(pattern=r"^[\w.+-]+@[\w.-]+\.[a-zA-Z]{2,}$")]  # domain part allows subdomains, e.g. mail.example.com
PositiveInt  = Annotated[int, Field(ge=1)]
Probability  = Annotated[float, Field(ge=0.0, le=1.0)]
NonEmptyStr  = Annotated[str, Field(min_length=1)]

class PostCreate(BaseModel):
    title:   PostTitle    # reuse the constraint anywhere
    author:  NonEmptyStr
    score:   Probability = 0.5

class UserCreate(BaseModel):
    email:   Email        # same Email type reused
    name:    NonEmptyStr
    post_limit: PositiveInt = 10

# ── BeforeValidator — transform input before type validation ──────────────────
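def normalise_email(v: object) -> object:
    # BeforeValidator receives the raw input, which may not be a str yet;
    # pass non-strings through so the str check reports a normal validation error
    return v.strip().lower() if isinstance(v, str) else v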

NormalisedEmail = Annotated[str, BeforeValidator(normalise_email)]

class SignupForm(BaseModel):
    email: NormalisedEmail   # "  Alice@EXAMPLE.COM  " → "alice@example.com"
    name:  str

# ── AfterValidator — validate after type coercion ────────────────────────────
def validate_slug(v: str) -> str:
    # Checks the character set only; uniqueness would need a database lookup
    if not v.replace("-", "").isalnum():
        raise ValueError("Slug must contain only letters, numbers and hyphens")
    return v.lower()

Slug = Annotated[str, AfterValidator(validate_slug)]

class PostCreate2(BaseModel):
    title: str
    slug:  Slug
Note: Annotated[Type, metadata...] is a standard Python construct from the typing module that attaches metadata to a type without changing the type itself. Pydantic reads that metadata to build validators and constraints. In Pydantic v2, title: Annotated[str, Field(min_length=3)] applies the same constraint as title: str = Field(min_length=3). The difference is that the Annotated form can be bound to a name, so you can define a validation rule once and reuse it consistently across dozens of models without duplication.
Tip: Define your reusable annotated types in a shared app/schemas/types.py module that all your Pydantic models import from. Common candidates: email addresses, slugs, URLs, phone numbers, UUIDs, positive integers, percentages, and content ratings. This ensures consistent validation across all endpoints — a change to the Email type affects every model that uses it simultaneously, preventing the divergence that happens when each model has its own validator.
Warning: Pydantic's EmailStr type validates addresses with the email-validator package, which checks far more of the address syntax than a short regex pattern can. For production applications that send emails, use from pydantic import EmailStr (requires pip install "pydantic[email]") rather than a custom pattern. A hand-written email regex tends either to reject valid addresses or to accept malformed ones, and both cause bugs that are hard to spot.
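To see how these pieces combine, here is a minimal sketch that chains a BeforeValidator and an AfterValidator on one alias and reuses it in two models. The names CleanSlug, Article, Category, strip_whitespace and check_slug_chars are hypothetical and exist only for illustration.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, BeforeValidator, ValidationError


def strip_whitespace(v: object) -> object:
    # Runs on the raw input, before Pydantic checks that it is a str
    return v.strip() if isinstance(v, str) else v


def check_slug_chars(v: str) -> str:
    # Runs after the str check, so v is guaranteed to be a str here
    if not v.replace("-", "").isalnum():
        raise ValueError("Slug must contain only letters, numbers and hyphens")
    return v.lower()


# Hypothetical alias: one definition, reused by every model below
CleanSlug = Annotated[
    str,
    BeforeValidator(strip_whitespace),
    AfterValidator(check_slug_chars),
]


class Article(BaseModel):
    slug: CleanSlug


class Category(BaseModel):
    slug: CleanSlug


article = Article(slug="  My-First-Post ")
category = Category(slug="Python")
print(article.slug, category.slug)

try:
    Article(slug="not a slug!")
except ValidationError as exc:
    slug_error = exc.errors()[0]
    print(slug_error["loc"], slug_error["type"])
```

Both models pick up the same trimming and character check, so changing either function changes the behaviour everywhere the alias is used.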

Discriminated Unions — Polymorphic Payloads

from pydantic import BaseModel, Field
from typing import Literal, Annotated, Union

# ── Problem: an API accepts different payload shapes based on type ──────────────
class TextContent(BaseModel):
    type:  Literal["text"]
    body:  str
    format: Literal["markdown", "html"] = "markdown"

class ImageContent(BaseModel):
    type:  Literal["image"]
    url:   str
    alt:   str = ""
    width: int | None = None

class VideoContent(BaseModel):
    type:     Literal["video"]
    url:      str
    duration: int   # seconds

# ── Discriminated Union — Pydantic uses 'type' field to pick the right model ──
Content = Annotated[
    Union[TextContent, ImageContent, VideoContent],
    Field(discriminator="type")
]

class Post(BaseModel):
    title:   str
    content: Content   # accepts TextContent, ImageContent, or VideoContent

# Pydantic reads the 'type' field and validates against the matching model
post1 = Post.model_validate({
    "title": "Hello",
    "content": {"type": "text", "body": "# Hello World"}
})
print(type(post1.content))   # TextContent

post2 = Post.model_validate({
    "title": "Photo",
    "content": {"type": "image", "url": "https://cdn.example.com/img.jpg", "alt": "A photo"}
})
print(type(post2.content))   # ImageContent

# Without a discriminator, Pydantic's default "smart" union mode attempts the
# member models to find the best match, which is slower, and when none match
# the error lists failures for every member
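The effect of the discriminator on error reporting can be sketched as follows. This is an illustrative, trimmed-down copy of the models above, and first_error is a hypothetical helper introduced only for this example.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, ValidationError


class TextContent(BaseModel):
    type: Literal["text"]
    body: str


class ImageContent(BaseModel):
    type: Literal["image"]
    url: str


Content = Annotated[Union[TextContent, ImageContent], Field(discriminator="type")]


class Post(BaseModel):
    title: str
    content: Content


def first_error(payload: dict) -> dict:
    """Validate a payload that is expected to fail and return its first error."""
    try:
        Post.model_validate(payload)
    except ValidationError as exc:
        return exc.errors()[0]
    raise AssertionError("payload unexpectedly validated")


# Unknown tag: rejected by the discriminator before any member model runs
bad_tag = first_error({"title": "x", "content": {"type": "audio", "url": "a.mp3"}})
print(bad_tag["type"])

# Known tag, missing field: only ImageContent is validated
missing_url = first_error({"title": "x", "content": {"type": "image"}})
print(missing_url["type"], missing_url["loc"])
```

In the second case the error location names the matched member and the missing field, so the client learns exactly which part of the image payload is wrong.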

pydantic-settings — Type-Safe Configuration

from pydantic_settings import BaseSettings
from pydantic import AnyHttpUrl, Field, PostgresDsn
from typing import Literal

class Settings(BaseSettings):
    # ── Application ───────────────────────────────────────────────────────────
    app_name:    str = "Blog API"
    environment: Literal["development", "staging", "production"] = "development"
    debug:       bool = False
    secret_key:  str   # required — must be in env or .env file

    # ── Database ──────────────────────────────────────────────────────────────
    database_url:      PostgresDsn   # validated as a PostgreSQL DSN
    db_pool_size:      int  = Field(default=10, ge=1, le=50)
    db_max_overflow:   int  = Field(default=5,  ge=0)

    # ── Auth ──────────────────────────────────────────────────────────────────
    algorithm:               str = "HS256"
    access_token_expire_min: int = Field(default=30, ge=1)
    refresh_token_expire_days: int = Field(default=7, ge=1)

    # ── CORS ──────────────────────────────────────────────────────────────────
    allowed_origins: list[AnyHttpUrl] = ["http://localhost:5173"]

    # ── File uploads ──────────────────────────────────────────────────────────
    upload_dir:       str  = "uploads"
    max_upload_mb:    int  = Field(default=10, ge=1)

    model_config = {
        "env_file":          ".env",
        "env_file_encoding": "utf-8",
        "case_sensitive":    False,   # DATABASE_URL and database_url both work
    }

# Singleton — import everywhere
settings = Settings()

# In production: set env vars on the server
# DATABASE_URL=postgresql://user:pass@host/db
# SECRET_KEY=your-256-bit-secret-key
# ENVIRONMENT=production

TypeAdapter — Validating Non-Model Types

from typing import Annotated

from pydantic import Field, TypeAdapter, ValidationError

# Validate simple types without defining a full BaseModel
int_adapter   = TypeAdapter(int)
list_adapter  = TypeAdapter(list[int])
email_adapter = TypeAdapter(Annotated[str, Field(pattern=r".+@.+\..+")])

int_adapter.validate_python("42")          # 42  (coerced from string)
list_adapter.validate_python(["1", "2"])   # [1, 2]  (each item coerced to int)
int_adapter.validate_json("42")            # 42  (parsed from JSON text)
# Pydantic v2 does not coerce numbers to str by default, so
# TypeAdapter(list[str]).validate_python([1, 2]) raises ValidationError

# Validate and serialise lists of models
from pydantic import BaseModel

class Post(BaseModel):
    id: int
    title: str

adapter = TypeAdapter(list[Post])
posts = adapter.validate_python([
    {"id": 1, "title": "First"},
    {"id": 2, "title": "Second"},
])
print(adapter.dump_json(posts))   # b'[{"id":1,"title":"First"},...]'

# Practical use: validate a list of tags from request
TagList = TypeAdapter(Annotated[list[str], Field(max_length=10)])
try:
    tags = TagList.validate_python(["python", "fastapi"])   # ✓
except ValidationError as e:
    print("Invalid tags:", e)
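The failure paths are worth seeing as well. The sketch below uses a hypothetical three-tag limit to trigger the length check, and contrasts the default lax mode with strict=True on validate_python.

```python
from typing import Annotated

from pydantic import Field, TypeAdapter, ValidationError

# Hypothetical adapter: at most three tags per request
tag_list = TypeAdapter(Annotated[list[str], Field(max_length=3)])

ok_tags = tag_list.validate_python(["python", "fastapi"])

try:
    tag_list.validate_python(["a", "b", "c", "d"])
    too_many_error = None
except ValidationError as exc:
    too_many_error = exc.errors()[0]["type"]

# Lax mode (the default) coerces "42" to 42; strict mode refuses to
int_adapter = TypeAdapter(int)
lax_value = int_adapter.validate_python("42")
try:
    int_adapter.validate_python("42", strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True

print(ok_tags, too_many_error, lax_value, strict_rejected)
```

Strict mode is a reasonable choice when the input should already have the right Python type, for example values passed between internal functions rather than parsed from a request.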

Common Mistakes

Mistake 1 — Using Union without discriminator — slow and confusing errors

❌ Wrong — Pydantic tries each model sequentially:

Content = Union[TextContent, ImageContent, VideoContent]   # no discriminator
# On invalid data: Pydantic attempts TextContent, ImageContent and VideoContent
# The error then lists failures for every member, burying the one you care about

✅ Correct — add a discriminator field:

Content = Annotated[Union[TextContent, ImageContent, VideoContent],
                    Field(discriminator="type")]   # ✓ fast, clear errors

Mistake 2 — Putting secrets in .env.example with real values

❌ Wrong — real secret in the example file committed to git:

# .env.example
SECRET_KEY=abc123supersecret   # real secret exposed in git history!

✅ Correct — use placeholder values in .env.example:

# .env.example
SECRET_KEY=your-256-bit-secret-key-here   # placeholder only ✓
DATABASE_URL=postgresql://user:pass@localhost/dbname

Mistake 3 — Creating a new Settings() instance per request

❌ Wrong — re-reads .env on every request:

@app.get("/")
def route():
    settings = Settings()   # reads .env file on every request — expensive!

✅ Correct — use a module-level singleton or lru_cache:

from functools import lru_cache

from fastapi import Depends

@lru_cache
def get_settings() -> Settings:
    return Settings()   # created once, cached forever ✓

# In FastAPI:
@app.get("/")
def route(settings: Settings = Depends(get_settings)):
    ...
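The caching behaviour can be checked with the standard library alone. In this sketch FakeSettings is a hypothetical stand-in for the real Settings class, and load_count exists only to show how often the factory body runs.

```python
from dataclasses import dataclass
from functools import lru_cache

load_count = 0  # counts how many times the stand-in settings object is built


@dataclass(frozen=True)
class FakeSettings:
    # Hypothetical stand-in for Settings(); a real one would read .env here
    app_name: str = "Blog API"


@lru_cache
def get_settings() -> FakeSettings:
    global load_count
    load_count += 1
    return FakeSettings()


first = get_settings()
second = get_settings()
print(first is second, load_count)

# Tests that change environment variables can force a reload
get_settings.cache_clear()
third = get_settings()
print(third is first, load_count)
```

Calling cache_clear() is the usual escape hatch in test suites: without it, a settings object built under one set of environment variables would leak into later tests.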

Quick Reference

Feature              Code
-------------------  ----------------------------------------------------
Reusable type alias  MyType = Annotated[str, Field(...)]
Pre-process input    Annotated[str, BeforeValidator(fn)]
Post-process value   Annotated[str, AfterValidator(fn)]
Discriminated union  Annotated[Union[A, B], Field(discriminator="type")]
Settings from env    class Settings(BaseSettings): field: type
Validate non-model   TypeAdapter(list[str]).validate_python(data)
Cached settings      @lru_cache above def get_settings(): return Settings()

🧠 Test Yourself

You define Email = Annotated[str, Field(pattern=r".+@.+\..+")] and use it in five different Pydantic models. Later you need to change the email validation rule. How many places do you need to update?