Dataclasses and OOP Design Patterns

Python’s @dataclass decorator (Python 3.7+) automatically generates __init__, __repr__, __eq__, and other boilerplate methods from field declarations, which significantly reduces the code needed to define data-holding classes. Pydantic’s BaseModel, the foundation of FastAPI request and response schemas, is built on similar ideas but adds runtime validation, serialisation, and JSON schema generation. Understanding dataclasses helps you understand Pydantic, and understanding OOP design patterns, particularly the repository and factory patterns, prepares you to structure FastAPI applications cleanly.

@dataclass Basics

from dataclasses import dataclass, field
from typing import Optional
from datetime import datetime

# Without @dataclass: lots of boilerplate
class PostManual:
    def __init__(self, title: str, body: str, published: bool = False):
        self.title     = title
        self.body      = body
        self.published = published
    def __repr__(self):
        return f"Post(title={self.title!r}, published={self.published!r})"
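    def __eq__(self, other):
        # Return NotImplemented for unrelated types instead of raising AttributeError
        if not isinstance(other, PostManual):
            return NotImplemented
        return (self.title, self.body, self.published) == (other.title, other.body, other.published)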

# With @dataclass: auto-generated __init__, __repr__, __eq__
@dataclass
class Post:
    title:     str
    body:      str
    published: bool          = False
    view_count: int          = 0
    tags:      list          = field(default_factory=list)   # mutable default needs a factory
    created_at: datetime     = field(default_factory=datetime.utcnow)   # note: utcnow() is deprecated since Python 3.12

# Auto-generated __init__ takes the fields in declaration order
p = Post(title="Hello", body="World", published=True)
print(p)   # Post(title='Hello', body='World', published=True, ...)
# The generated __eq__ compares every field, so a different published flag
# (or created_at timestamp) makes two posts unequal
print(p == Post(title="Hello", body="World"))   # False

# field(default_factory=list): fresh list for each instance (safe)
p1 = Post("A", "B")
p2 = Post("C", "D")
p1.tags.append("python")
print(p2.tags)   # [] (independent) ✓
Note: Use field(default_factory=list) (or field(default_factory=dict)) for mutable default values in dataclasses; never write tags: list = [] directly. Python raises a ValueError when the class is defined if a field uses a list, dict, or set as its default, because that single object would be shared across all instances. The default_factory callable runs once per instance creation, so each instance gets its own independent list.
Tip: Use @dataclass(frozen=True) to make the dataclass immutable: fields cannot be reassigned after creation, and instances become hashable (usable in sets and as dict keys). Frozen dataclasses are excellent for configuration objects, cache keys, and value objects in domain-driven design. Compare to Pydantic’s model_config = ConfigDict(frozen=True), which achieves the same for Pydantic models.
Warning: Dataclasses do not provide runtime type validation; the type annotations are documentation and tooling hints only. Post(title=42, body=True) will not raise an error. For runtime validation (which is what FastAPI needs), use Pydantic’s BaseModel instead of dataclasses. The key question is: does this class need to validate data coming from an untrusted source (such as an HTTP request)? If yes, use Pydantic. If no (internal data transfer), a dataclass is fine.
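To make the warning concrete, here is a minimal sketch showing that annotations are not enforced, followed by a hand-written check in __post_init__. CheckedPost is a hypothetical name used only for illustration, and the check handles plain classes such as str only; it is not a substitute for Pydantic.

```python
from dataclasses import dataclass, fields


@dataclass
class Post:
    title: str
    body: str


# Annotations are not enforced: this constructs without any error.
loose = Post(title=42, body=True)


@dataclass
class CheckedPost:
    """Hypothetical variant that checks simple field types by hand."""
    title: str
    body: str

    def __post_init__(self):
        # Works only for plain classes (str, int, ...), not for generics
        # such as list[str] or Optional[int].
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )
```

Calling CheckedPost(title=42, body="x") raises TypeError, while Post(title=42, body=True) is silently accepted. Checks like this grow quickly once nested or optional types are involved, which is the gap Pydantic fills.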

@dataclass Features

from dataclasses import dataclass, field, asdict, astuple

@dataclass(order=True)   # generates __lt__, __le__, __gt__, __ge__
class Post:
    # sort_index is used for ordering when order=True
    sort_index: int = field(init=False, repr=False)
    title:      str = ""
    body:       str = ""
    view_count: int = 0

    def __post_init__(self):
        """Called after __init__; used for derived fields or validation."""
        self.sort_index = -self.view_count  # sort by views descending

posts = [Post(title="A", view_count=10), Post(title="B", view_count=50)]
sorted(posts)   # [Post(B, 50), Post(A, 10)]: sorted by view_count descending

# Convert to dict or tuple
p = Post(title="Hello", body="World", view_count=5)
asdict(p)    # {"sort_index": -5, "title": "Hello", "body": "World", ...}
astuple(p)   # (-5, "Hello", "World", 5)

# Frozen dataclass: immutable and hashable
@dataclass(frozen=True)
class Coordinate:
    lat: float
    lng: float

loc = Coordinate(40.7128, -74.0060)
# loc.lat = 0   # FrozenInstanceError: cannot assign to field 'lat'
coords_set = {loc, Coordinate(40.7128, -74.0060)}
print(len(coords_set))   # 1, because equal frozen instances deduplicate ✓
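Because a frozen instance cannot be modified, a changed copy has to be built instead. The sketch below uses dataclasses.replace() from the standard library for that, and also shows a frozen instance used as a dict key; the variable names are illustrative.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lng: float


nyc = Coordinate(40.7128, -74.0060)

# replace() builds a new instance; the original is left untouched.
moved = replace(nyc, lat=41.0)

# Frozen instances are hashable, so they can serve as dict keys.
labels = {nyc: "New York"}
lookup = labels[Coordinate(40.7128, -74.0060)]

# Direct assignment is rejected.
try:
    nyc.lat = 0.0
except FrozenInstanceError as exc:
    error_message = str(exc)
```

Note that frozen=True only blocks attribute assignment. A mutable object stored in a field (for example a list) can still be changed in place, and it would also make the instance unhashable.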

Dataclass vs Pydantic vs Plain Dict

Feature | Plain dict | dataclass | Pydantic BaseModel
Syntax | {"title": "x"} | @dataclass class Post | class Post(BaseModel)
Type hints | No | Yes (unenforced) | Yes (enforced at runtime)
Runtime validation | No | No | Yes (raises ValidationError)
Auto __init__ | N/A | Yes | Yes
Auto __repr__ | N/A | Yes | Yes
JSON serialise | json.dumps() | json.dumps(asdict()) | model.model_dump_json()
JSON deserialise | json.loads() | Post(**json.loads()) | Post.model_validate_json()
IDE support | Poor | Good | Excellent
FastAPI use | Not for schemas | Supported, but not the usual choice | Standard for schemas
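The JSON rows for the dataclass column can be sketched as a round trip using only the standard library. The Post class here is a reduced, illustrative version of the one above, with JSON-friendly field types only.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Post:
    title: str
    body: str
    tags: list = field(default_factory=list)


original = Post(title="Hello", body="World", tags=["python"])

# Serialise: dataclass -> dict -> JSON string.
payload = json.dumps(asdict(original))

# Deserialise: JSON string -> dict -> dataclass.
# Nothing is validated here: unexpected keys raise TypeError from __init__,
# and values of the wrong type pass through silently.
restored = Post(**json.loads(payload))
```

Fields such as datetime are not JSON-serialisable by default, so json.dumps would need a custom default= handler for them. Pydantic handles that conversion itself.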

Repository Pattern: OOP for Database Access

from abc import ABC, abstractmethod
from typing import Optional

# Abstract repository: defines the interface
class PostRepositoryBase(ABC):
    @abstractmethod
    def get_by_id(self, id: int) -> Optional[dict]: ...
    @abstractmethod
    def list_published(self, page: int = 1, limit: int = 10) -> list: ...
    @abstractmethod
    def create(self, data: dict) -> dict: ...
    @abstractmethod
    def update(self, id: int, data: dict) -> Optional[dict]: ...
    @abstractmethod
    def delete(self, id: int) -> bool: ...

# In-memory implementation (for testing)
class InMemoryPostRepository(PostRepositoryBase):
    def __init__(self):
        self._store: dict = {}
        self._next_id: int = 1

    def get_by_id(self, id: int) -> Optional[dict]:
        return self._store.get(id)

    def list_published(self, page: int = 1, limit: int = 10) -> list:
        published = [p for p in self._store.values() if p.get("published")]
        start     = (page - 1) * limit
        return published[start:start + limit]

    def create(self, data: dict) -> dict:
        post = {**data, "id": self._next_id}
        self._store[self._next_id] = post
        self._next_id += 1
        return post

    def update(self, id: int, data: dict) -> Optional[dict]:
        if id not in self._store:
            return None
        self._store[id].update(data)
        return self._store[id]

    def delete(self, id: int) -> bool:
        return self._store.pop(id, None) is not None

# The route handler depends on the abstract type, not the concrete one
def get_post(post_id: int, repo: PostRepositoryBase):
    post = repo.get_by_id(post_id)
    if post is None:
        raise ValueError(f"Post {post_id} not found")
    return post

# Easy to swap in tests:
test_repo = InMemoryPostRepository()
test_repo.create({"title": "Test", "published": True})
post = get_post(1, test_repo)   # works with in-memory repo ✓
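The list_published method above pages its results with start = (page - 1) * limit. The standalone sketch below isolates that arithmetic; paginate is a hypothetical helper, and the guard against page or limit values below 1 is an addition the repository code does not have.

```python
def paginate(items: list, page: int = 1, limit: int = 10) -> list:
    """Return one page of items, using 1-based page numbers."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be >= 1")
    start = (page - 1) * limit          # page 1 -> 0, page 2 -> limit, ...
    return items[start:start + limit]   # slicing past the end is safe


items = list(range(1, 26))         # 25 items: 1..25
first = paginate(items, page=1)    # items 1..10
third = paginate(items, page=3)    # items 21..25 (short final page)
beyond = paginate(items, page=4)   # [] because the slice starts past the end
```

Without the guard, page=0 gives a negative start index and the slice silently returns the wrong items, which is why the check may be worth adding to a real repository.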

Common Mistakes

Mistake 1: Mutable default in dataclass without field(default_factory=…)

❌ Wrong (Python raises an error as soon as the class is defined):

@dataclass
class Post:
    tags: list = []   # ValueError: mutable default not allowed

✅ Correct:

@dataclass
class Post:
    tags: list = field(default_factory=list)   # ✓
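A runnable sketch of both cases follows. It catches the error to show that it is raised at class definition time, before any instance is created; BrokenPost is an illustrative name.

```python
from dataclasses import dataclass, field

# The decorator inspects defaults when the class is created, so the error
# appears at definition time, before any instance exists.
try:
    @dataclass
    class BrokenPost:
        tags: list = []
except ValueError as exc:
    definition_error = str(exc)


@dataclass
class Post:
    tags: list = field(default_factory=list)


# Each instance receives its own list from the factory.
a, b = Post(), Post()
a.tags.append("python")
```

After this runs, a.tags holds one item and b.tags is still empty, and definition_error contains the message naming the offending field.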

Mistake 2: Using a dataclass where Pydantic is needed

❌ Wrong: a plain dataclass performs no validation of its own:

@dataclass
class PostCreate:
    title: str
    body:  str

# PostCreate(title=42, body=None) is accepted without error.
# FastAPI can accept standard dataclasses by wrapping them with Pydantic,
# but BaseModel is the standard, fully featured choice for schemas.

✅ Correct: a Pydantic model for FastAPI request/response schemas:

from pydantic import BaseModel

class PostCreate(BaseModel):
    title: str
    body:  str   # FastAPI validates and auto-documents this ✓

Mistake 3: Concrete dependency in route handlers

❌ Wrong: the route handler is tightly coupled to a specific repository:

def get_post(post_id: int):
    repo = SQLAlchemyPostRepository()   # hardcoded, so tests cannot easily swap in a fake
    return repo.get_by_id(post_id)

✅ Correct: inject the repository through FastAPI’s Depends():

from fastapi import Depends

# get_repo is a provider function, defined elsewhere, that returns a PostRepositoryBase
def get_post(post_id: int, repo: PostRepositoryBase = Depends(get_repo)):
    return repo.get_by_id(post_id)   # ✓ testable, replaceable

Quick Reference

Task | Code
Basic dataclass | @dataclass class Post: title: str
Mutable default | tags: list = field(default_factory=list)
Post-init hook | def __post_init__(self):
Immutable dataclass | @dataclass(frozen=True)
Sortable dataclass | @dataclass(order=True)
Convert to dict | asdict(instance)
Convert to tuple | astuple(instance)

🧠 Test Yourself

You are building a FastAPI endpoint that creates a new blog post. The request body has fields title (required string), body (required string), and tags (optional list of strings). Should you use a plain @dataclass, a Pydantic BaseModel, or a plain dict for the request schema? Why?