Python’s @dataclass decorator (Python 3.7+) automatically generates __init__, __repr__, __eq__, and other boilerplate methods from field declarations, significantly reducing the amount of code needed to define data-holding classes. Pydantic’s BaseModel, which is the foundation of FastAPI request and response schemas, is built on similar ideas but adds runtime validation, serialisation, and JSON schema generation. Understanding dataclasses helps you understand Pydantic, and understanding OOP design patterns, particularly the repository and factory patterns, prepares you to structure FastAPI applications cleanly.
@dataclass Basics
from dataclasses import dataclass, field
from typing import Optional
from datetime import datetime
# Without @dataclass: lots of boilerplate
class PostManual:
    def __init__(self, title: str, body: str, published: bool = False):
        self.title = title
        self.body = body
        self.published = published

    def __repr__(self):
        return f"Post(title={self.title!r}, published={self.published!r})"

    def __eq__(self, other):
        # Return NotImplemented for unrelated types instead of raising AttributeError
        if not isinstance(other, PostManual):
            return NotImplemented
        return (self.title, self.body) == (other.title, other.body)
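# With @dataclass: auto-generated __init__, __repr__, __eq__
@dataclass
class Post:
    title: str
    body: str
    published: bool = False
    view_count: int = 0
    tags: list = field(default_factory=list)  # mutable default needs a factory
    # Note: datetime.utcnow is deprecated since Python 3.12; timezone-aware
    # datetime.now(timezone.utc) is the suggested replacement
    created_at: datetime = field(default_factory=datetime.utcnow)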
# Auto-generated __init__ accepts every declared field as a parameter
p = Post(title="Hello", body="World", published=True)
print(p)  # Post(title='Hello', body='World', published=True, ...)
# Generated __eq__ compares all fields, so a differing published value makes this False
print(p == Post(title="Hello", body="World"))  # False
print(p == p)  # True
# field(default_factory=list): fresh list for each instance (safe)
p1 = Post("A", "B")
p2 = Post("C", "D")
p1.tags.append("python")
print(p2.tags)  # [] (independent of p1.tags)
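To make the reason for default_factory concrete, here is a minimal sketch contrasting three cases: a plain class that shares one list default, a dataclass declaration that Python rejects, and the default_factory form. The class names PlainPost, BadPost, and GoodPost are illustrative only and do not appear elsewhere in this section.

```python
from dataclasses import dataclass, field


class PlainPost:
    """Plain class with a shared mutable default (the classic pitfall)."""

    def __init__(self, title: str, tags: list = []):  # one list shared by every call
        self.title = title
        self.tags = tags


a = PlainPost("A")
b = PlainPost("B")
a.tags.append("python")
shared = b.tags  # b sees a's tag because both instances hold the same list

# @dataclass refuses the equivalent declaration when the class is defined.
try:
    @dataclass
    class BadPost:
        title: str
        tags: list = []
except ValueError as exc:
    error_message = str(exc)
else:
    error_message = None


@dataclass
class GoodPost:
    title: str
    tags: list = field(default_factory=list)  # new list per instance


g1, g2 = GoodPost("A"), GoodPost("B")
g1.tags.append("python")

print(shared)         # ['python']
print(error_message)  # the ValueError text raised by @dataclass
print(g2.tags)        # []
```

The plain class fails silently, which is why the dataclass machinery turns the same mistake into an immediate error.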
Use field(default_factory=list) (or field(default_factory=dict)) for mutable default values in dataclasses; never write tags: list = [] directly. Python raises a ValueError if you use a mutable default such as a list, dict, or set directly in a dataclass, because the same object would be shared across instances. The default_factory callable is called once per instance creation, giving each instance its own independent list.
Use @dataclass(frozen=True) to make the dataclass immutable: all fields become read-only after creation, and instances become hashable (usable in sets and as dict keys). Frozen dataclasses are excellent for configuration objects, cache keys, and value objects in domain-driven design. Compare to Pydantic’s model_config = ConfigDict(frozen=True), which achieves the same for Pydantic models.
Dataclasses do not enforce type hints at runtime: Post(title=42, body=True) will not raise an error. For runtime validation (which is what FastAPI needs), use Pydantic’s BaseModel instead of dataclasses. The key question is: does this class need to validate data coming from an untrusted source (HTTP request)? If yes, use Pydantic. If no (internal data transfer), a dataclass is fine.
@dataclass Features
from dataclasses import dataclass, field, asdict, astuple
@dataclass(order=True)  # generates __lt__, __le__, __gt__, __ge__
class Post:
    # With order=True, instances compare field by field in declaration order,
    # so sort_index (declared first) decides the ordering
    sort_index: int = field(init=False, repr=False)
    title: str = ""
    body: str = ""
    view_count: int = 0

    def __post_init__(self):
        """Called after __init__; use it for derived fields or validation."""
        self.sort_index = -self.view_count  # sort by views descending

posts = [Post(title="A", view_count=10), Post(title="B", view_count=50)]
sorted(posts)  # [Post(B, 50), Post(A, 10)]: sorted by view_count descending
# Convert to dict or tuple
p = Post(title="Hello", body="World", view_count=5)
asdict(p) # {"sort_index": -5, "title": "Hello", "body": "World", ...}
astuple(p) # (-5, "Hello", "World", 5)
# Frozen dataclass: immutable and hashable
@dataclass(frozen=True)
class Coordinate:
    lat: float
    lng: float

loc = Coordinate(40.7128, -74.0060)
# loc.lat = 0  # FrozenInstanceError: cannot assign to field 'lat'
coords_set = {loc, Coordinate(40.7128, -74.0060)}
print(len(coords_set))  # 1 (equal frozen instances deduplicate)
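Because a frozen instance cannot be changed in place, updates are usually expressed as modified copies. The sketch below repeats the Coordinate class so it runs on its own and uses the standard library's dataclasses.replace for the copy; the variable names (nyc, moved, city_names) are illustrative.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Coordinate:
    lat: float
    lng: float


nyc = Coordinate(40.7128, -74.0060)

# Assignment is blocked on a frozen instance.
try:
    nyc.lat = 0.0
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

# replace() builds a new instance with the selected fields changed;
# the original is left untouched.
moved = replace(nyc, lat=41.0)

# Frozen instances are hashable, so an equal instance finds the same dict entry.
city_names = {nyc: "New York"}
lookup = city_names[Coordinate(40.7128, -74.0060)]

print(assignment_blocked)  # True
print(nyc.lat, moved.lat)  # 40.7128 41.0
print(lookup)              # New York
```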
Dataclass vs Pydantic vs Plain Dict
| Feature | Plain dict | dataclass | Pydantic BaseModel |
|---|---|---|---|
| Syntax | {"title": "x"} | @dataclass class Post | class Post(BaseModel) |
| Type hints | No | Yes (unenforced) | Yes (enforced at runtime) |
| Runtime validation | No | No | Yes (raises ValidationError) |
| Auto __init__ | N/A | Yes | Yes |
| Auto __repr__ | N/A | Yes | Yes |
| JSON serialise | json.dumps(data) | json.dumps(asdict(post)) | post.model_dump_json() |
| JSON deserialise | json.loads(text) | Post(**json.loads(text)) | Post.model_validate_json(text) |
| IDE support | Poor | Good | Excellent |
| FastAPI use | Not for schemas | Supported, but less common | Standard for schemas |
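The JSON rows of the table can be tried out with the standard library alone. This is a small sketch, assuming a reduced Post with only JSON-friendly fields (a datetime field would need a custom encoder); it also shows that the dataclass route stores wrongly typed input without complaint, which is the gap Pydantic fills.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Post:
    title: str
    body: str
    published: bool = False
    tags: list = field(default_factory=list)


original = Post(title="Hello", body="World", tags=["python"])

# Serialise: dataclass -> dict -> JSON text.
payload = json.dumps(asdict(original))

# Deserialise: JSON text -> dict -> dataclass.
restored = Post(**json.loads(payload))
print(restored == original)  # True

# Type hints are not checked: wrong types are stored exactly as given.
unchecked = Post(**json.loads('{"title": 42, "body": true}'))
print(type(unchecked.title).__name__)  # int
```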
Repository Pattern: OOP for Database Access
from abc import ABC, abstractmethod
from typing import Optional

# Abstract repository: defines the interface
class PostRepositoryBase(ABC):
    @abstractmethod
    def get_by_id(self, id: int) -> Optional[dict]: ...

    @abstractmethod
    def list_published(self, page: int = 1, limit: int = 10) -> list: ...

    @abstractmethod
    def create(self, data: dict) -> dict: ...

    @abstractmethod
    def update(self, id: int, data: dict) -> Optional[dict]: ...

    @abstractmethod
    def delete(self, id: int) -> bool: ...
# In-memory implementation (for testing)
class InMemoryPostRepository(PostRepositoryBase):
    def __init__(self):
        self._store: dict = {}
        self._next_id: int = 1

    def get_by_id(self, id: int) -> Optional[dict]:
        return self._store.get(id)

    def list_published(self, page: int = 1, limit: int = 10) -> list:
        published = [p for p in self._store.values() if p.get("published")]
        start = (page - 1) * limit
        return published[start:start + limit]

    def create(self, data: dict) -> dict:
        post = {**data, "id": self._next_id}
        self._store[self._next_id] = post
        self._next_id += 1
        return post

    def update(self, id: int, data: dict) -> Optional[dict]:
        if id not in self._store:
            return None
        self._store[id].update(data)
        return self._store[id]

    def delete(self, id: int) -> bool:
        return self._store.pop(id, None) is not None
# The handler depends on the abstract type, not the concrete one
def get_post(post_id: int, repo: PostRepositoryBase):
    post = repo.get_by_id(post_id)
    if post is None:
        raise ValueError(f"Post {post_id} not found")
    return post

# Easy to swap in tests:
test_repo = InMemoryPostRepository()
test_repo.create({"title": "Test", "published": True})
post = get_post(1, test_repo)  # works with the in-memory repo
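One practical benefit of declaring the interface with ABC is that an incomplete implementation fails when it is instantiated, not later when a missing method is called. The sketch below uses a trimmed-down two-method version of the interface; IncompleteRepository and DictRepository are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Optional


class PostRepositoryBase(ABC):
    """Trimmed-down version of the interface above (two methods only)."""

    @abstractmethod
    def get_by_id(self, id: int) -> Optional[dict]: ...

    @abstractmethod
    def create(self, data: dict) -> dict: ...


class IncompleteRepository(PostRepositoryBase):
    """Hypothetical subclass that forgets to implement create()."""

    def get_by_id(self, id: int) -> Optional[dict]:
        return None


class DictRepository(PostRepositoryBase):
    """Hypothetical subclass that implements every abstract method."""

    def __init__(self) -> None:
        self._store: dict = {}

    def get_by_id(self, id: int) -> Optional[dict]:
        return self._store.get(id)

    def create(self, data: dict) -> dict:
        post = {**data, "id": len(self._store) + 1}
        self._store[post["id"]] = post
        return post


# Instantiating a class with unimplemented abstract methods raises TypeError.
try:
    IncompleteRepository()
    instantiated = True
except TypeError:
    instantiated = False

repo = DictRepository()
created = repo.create({"title": "Test"})

print(instantiated)       # False
print(repo.get_by_id(1))  # {'title': 'Test', 'id': 1}
```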
Common Mistakes
Mistake 1: Mutable default in dataclass without field(default_factory=…)
Wrong (Python raises an error as soon as the class is defined):
@dataclass
class Post:
    tags: list = []  # ValueError: mutable default not allowed
Correct:
@dataclass
class Post:
    tags: list = field(default_factory=list)  # fresh list per instance
Mistake 2: Using a dataclass where Pydantic is needed
Wrong (a plain dataclass does not validate incoming API data on its own):
@dataclass
class PostCreate:
    title: str
    body: str
# PostCreate(title=42, body=None) is accepted silently. FastAPI can accept
# standard dataclasses by handling them through Pydantic, but a Pydantic
# model is the conventional choice for request and response schemas.
Correct (Pydantic model for FastAPI request/response schemas):
from pydantic import BaseModel
class PostCreate(BaseModel):
    title: str
    body: str  # FastAPI validates and auto-documents this
Mistake 3: Concrete dependency in route handlers
Wrong (route handler tightly coupled to a specific repository):
def get_post(post_id: int):
    repo = SQLAlchemyPostRepository()  # hardcoded, so tests cannot easily substitute a fake
    return repo.get_by_id(post_id)
Correct (inject the repository through FastAPI’s Depends()):
# Requires "from fastapi import Depends"; get_repo is a provider function
# (defined elsewhere in the application) that returns a repository instance
def get_post(post_id: int, repo: PostRepositoryBase = Depends(get_repo)):
    return repo.get_by_id(post_id)  # testable, replaceable
Quick Reference
| Task | Code |
|---|---|
| Basic dataclass | @dataclass class Post: title: str |
| Mutable default | tags: list = field(default_factory=list) |
| Post-init hook | def __post_init__(self): |
| Immutable dataclass | @dataclass(frozen=True) |
| Sortable dataclass | @dataclass(order=True) |
| Convert to dict | asdict(instance) |
| Convert to tuple | astuple(instance) |