Raw SQL vs ORM — When to Use Each

Python developers building FastAPI applications have three levels of database abstraction to choose from: raw SQL with psycopg2 or asyncpg, SQLAlchemy Core (a query builder that generates SQL), and SQLAlchemy ORM (full object-relational mapping). Each level involves tradeoffs between control, abstraction, boilerplate, and performance. Most production FastAPI applications use SQLAlchemy ORM for standard CRUD operations and drop to raw SQL for complex analytics queries that the ORM cannot express efficiently. Understanding when to use each level — and how to mix them — is a key architectural skill.

Three Levels of Abstraction

from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, DeclarativeBase, Mapped, mapped_column

# ── Level 1: Raw SQL with psycopg2 ────────────────────────────────────────────
import psycopg2

conn = psycopg2.connect(dsn)   # dsn: your database connection string
with conn.cursor() as cur:
    cur.execute(
        "SELECT id, title, author_id FROM posts WHERE status = %s AND view_count > %s",
        ("published", 100)
    )
    posts = cur.fetchall()
# Full control, maximum performance, verbose, no type safety

# ── Level 2: SQLAlchemy Core (query builder) ──────────────────────────────────
from sqlalchemy import select, Table, Column, Integer, String, MetaData

meta = MetaData()
posts_table = Table(
    "posts", meta,
    Column("id", Integer), Column("title", String), Column("author_id", Integer),
    Column("status", String), Column("view_count", Integer),
)

engine = create_engine(dsn)
with engine.connect() as conn:
    stmt = select(posts_table.c.id, posts_table.c.title).where(
        posts_table.c.status == "published",
        posts_table.c.view_count > 100
    )
    result = conn.execute(stmt)
    posts = result.fetchall()
# SQL generated programmatically, composable, DB-agnostic, no ORM overhead

# ── Level 3: SQLAlchemy ORM ────────────────────────────────────────────────────
class Base(DeclarativeBase):
    pass

class Post(Base):
    __tablename__ = "posts"
    id:         Mapped[int]   = mapped_column(primary_key=True)
    title:      Mapped[str]
    author_id:  Mapped[int]
    status:     Mapped[str]
    view_count: Mapped[int]

with Session(engine) as session:
    # 2.0-style select(); the legacy session.query() API still works but is discouraged
    posts = session.scalars(
        select(Post).where(
            Post.status == "published",
            Post.view_count > 100
        )
    ).all()
# Python objects, relationships, change tracking, less SQL knowledge needed

Note: The SQLAlchemy ORM adds a layer called the identity map: it tracks every object loaded in a session and guarantees one Python object per database row. When you modify a Post object and call session.commit(), SQLAlchemy automatically generates and executes the UPDATE SQL. This tracking is powerful but has overhead: every ORM object in the session consumes memory, and the session checks the identity map on every query. For read-heavy endpoints that load thousands of rows, this overhead can be significant.

Tip: The most pragmatic approach for FastAPI applications is to define SQLAlchemy ORM models (for the schema, migrations, and simple CRUD) while using session.execute(text(sql)) for complex queries. text() lets you write raw SQL while still using the session’s connection and transaction management. You get the best of both worlds: ORM for the simple 80% of queries, raw SQL for the complex 20%.

Warning: The SQLAlchemy ORM’s lazy loading feature loads related objects on first attribute access: post.author triggers a new SQL query if the author was not loaded with the post. In FastAPI, this means accessing post.author.name inside a route handler triggers one extra database query per post (the N+1 problem). Always use joinedload() or selectinload() to eagerly load relationships you know you will need in the response.

Mixing ORM and Raw SQL

from sqlalchemy.orm import Session
from sqlalchemy import text
from fastapi import FastAPI, Depends

app = FastAPI()

# ── Standard ORM for simple CRUD ─────────────────────────────────────────────
def get_post(session: Session, post_id: int) -> Post | None:
    return session.get(Post, post_id)   # uses PK lookup, most efficient

def create_post(session: Session, data: dict) -> Post:
    post = Post(**data)
    session.add(post)
    session.flush()    # writes to DB without committing, so id is available
    return post

# ── Raw SQL for complex analytics ─────────────────────────────────────────────
def get_author_stats(session: Session, limit: int = 20) -> list[dict]:
    """Complex aggregation query — easier as raw SQL than ORM."""
    result = session.execute(text("""
        SELECT
            u.id,
            u.name,
            COUNT(p.id)                   AS post_count,
            COALESCE(SUM(p.view_count), 0) AS total_views,
            MAX(p.created_at)              AS last_post_date
        FROM users u
        LEFT JOIN posts p ON u.id = p.author_id AND p.status = 'published'
        GROUP BY u.id, u.name
        HAVING COUNT(p.id) > 0
        ORDER BY total_views DESC
        LIMIT :limit
    """), {"limit": limit})
    return [dict(row) for row in result.mappings()]

# ── Both in the same request ──────────────────────────────────────────────────
@app.get("/dashboard")
def dashboard(session: Session = Depends(get_db)):  # get_db: per-request session dependency
    # ORM for the current user (User model assumed defined like Post)
    current_user = session.get(User, 1)

    # Raw SQL for stats
    stats = get_author_stats(session)

    return {"user": current_user.name, "stats": stats}
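The route above depends on a get_db function that is not shown. A typical sketch looks like the following; the SessionLocal name, the SQLite URL, and the commit-on-success policy are all assumptions, adapt them to your setup:

```python
from collections.abc import Iterator

from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

engine = create_engine("sqlite:///:memory:")   # placeholder DSN for the sketch
SessionLocal = sessionmaker(bind=engine)

def get_db() -> Iterator[Session]:
    """FastAPI dependency: one Session per request, always closed afterwards."""
    session = SessionLocal()
    try:
        yield session
        session.commit()        # commit only if the handler raised no exception
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()
```

FastAPI drives the generator for you: it calls next() to obtain the session before the handler runs, then resumes the generator afterwards so the commit/rollback/close logic executes.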

When to Use Each Level

Use Case                               Best Level             Why
CRUD endpoints (GET/POST/PUT/DELETE)   ORM                    Less code, automatic change tracking, migration sync
Load by primary key                    ORM (session.get)      Identity map cache, one line
Simple filtered list                   ORM                    Readable, composable, Pydantic integration
Aggregates, GROUP BY reports           Raw SQL via text()     Complex SQL reads cleaner than ORM expressions
Window functions, CTEs                 Raw SQL via text()     Expressible in the ORM (over(), cte()) but usually clearer as SQL
Bulk insert/update (100k+ rows)        Raw SQL / executemany  ORM unit-of-work overhead too high for bulk ops
Full-text search                       Raw SQL via text()     tsvector/tsquery operations are simpler in raw SQL
Schema definitions                     ORM models             Alembic generates migrations from models
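To make the window-function row concrete, here is a sketch of a "top post per author" query written as raw SQL with text(). It assumes a posts(author_id, title, view_count) table and runs against SQLite purely for illustration; the same SQL works on PostgreSQL:

```python
from sqlalchemy import create_engine, text

engine = create_engine("sqlite:///:memory:")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE posts (author_id INT, title TEXT, view_count INT)"))
    conn.execute(text(
        "INSERT INTO posts VALUES (1,'a',10),(1,'b',30),(2,'c',20)"))

    # ROW_NUMBER() in a CTE: rank each author's posts by views, keep the top one
    rows = [tuple(r) for r in conn.execute(text("""
        WITH ranked AS (
            SELECT author_id, title, view_count,
                   ROW_NUMBER() OVER (
                       PARTITION BY author_id
                       ORDER BY view_count DESC
                   ) AS rn
            FROM posts
        )
        SELECT author_id, title FROM ranked WHERE rn = 1
        ORDER BY author_id
    """)).all()]
    print(rows)  # [(1, 'b'), (2, 'c')]
```

The ORM equivalent (func.row_number().over(...) plus Select.cte()) exists but takes noticeably more code to say the same thing.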

Common Mistakes

Mistake 1 — Lazy loading in FastAPI (N+1 problem)

❌ Wrong — each post.author access triggers a new query:

posts = session.scalars(
    select(Post).where(Post.status == "published").limit(20)
).all()
return [{"title": p.title, "author": p.author.name} for p in posts]
# 20 extra SELECT queries, one per post's author!

✅ Correct — eager load with joinedload:

from sqlalchemy import select
from sqlalchemy.orm import joinedload

posts = session.scalars(
    select(Post).options(joinedload(Post.author)).where(...).limit(20)
).all()
# 1 query with JOIN — no N+1 ✓

Mistake 2 — Using raw SQL strings without text() in SQLAlchemy

❌ Wrong — deprecated in SQLAlchemy 2.x:

result = session.execute("SELECT * FROM posts WHERE id = :id", {"id": 1})
# RemovedIn20Warning or error in SQLAlchemy 2.0

✅ Correct — wrap in text():

from sqlalchemy import text
result = session.execute(text("SELECT * FROM posts WHERE id = :id"), {"id": 1})   # ✓

Mistake 3 — Loading entire ORM objects when only a few columns are needed

❌ Wrong — loads all 20 columns including large body text:

posts = session.scalars(select(Post).limit(20)).all()
return [{"id": p.id, "title": p.title} for p in posts]
# Loads body (possibly MB of text) for every post, then discards it

✅ Correct — select only needed columns:

from sqlalchemy import select
rows = session.execute(select(Post.id, Post.title).limit(20)).all()
return [{"id": r.id, "title": r.title} for r in rows]   # ✓ minimal data transfer

🧠 Test Yourself

You need a dashboard endpoint that shows monthly post counts for the last 12 months. Which approach is most appropriate?