FastAPI is built on ASGI (Asynchronous Server Gateway Interface) and runs on an async event loop. Understanding why async exists and what problem it solves is more important than memorising the syntax. The core insight: most web server time is spent waiting — waiting for the database to respond, waiting for an external API, waiting for a file to be read. While waiting, the CPU is idle. An async event loop uses that idle time to handle other requests. The result: a single FastAPI process can handle thousands of concurrent requests, each waiting for I/O, with far fewer resources than a multi-threaded server.
The Problem: Blocking I/O
```python
# Synchronous (blocking) — one request at a time
import time

def fetch_user(user_id: int) -> dict:
    time.sleep(0.1)  # simulates a 100ms database query — CPU is idle!
    return {"id": user_id, "name": "Alice"}

def handle_request():
    user = fetch_user(1)     # wait 100ms
    profile = fetch_user(2)  # wait another 100ms
    posts = fetch_user(3)    # wait another 100ms
    return user, profile, posts

# Total: 300ms — each call blocks while waiting

# In a synchronous web server:
# Request 1 arrives → fetch_user → 100ms block → respond → Request 2 handled
# 1000 concurrent requests = 1000 threads each blocking for 100ms
# At 100ms each, max throughput ≈ 10 requests/second per thread
```
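For contrast, the same three lookups can be sketched with async/await. This is a hedged illustration, not a real database client: `asyncio.sleep` stands in for a non-blocking driver. Because the three waits overlap, the total is roughly the longest single wait (~100ms), not the sum (~300ms).

```python
import asyncio

async def fetch_user(user_id: int) -> dict:
    await asyncio.sleep(0.1)  # simulates a 100ms query via a non-blocking driver
    return {"id": user_id, "name": "Alice"}

async def handle_request():
    # gather() starts all three coroutines at once; their 100ms waits overlap
    return await asyncio.gather(fetch_user(1), fetch_user(2), fetch_user(3))

users = asyncio.run(handle_request())  # total ≈ 100ms, not 300ms
print(len(users))  # → 3
```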
When a coroutine hits an await (a point where it must wait for I/O), the event loop immediately switches to another coroutine that is ready to run. The key insight: during I/O waits, the CPU is free to work on something else. Async programming is about efficiently scheduling those waits, not about true parallelism; CPU-bound work still needs a process pool, and blocking sync libraries belong in a thread pool via asyncio.to_thread().
Concurrency vs Parallelism
```python
# CONCURRENCY: doing multiple things by interleaving (one CPU)
# PARALLELISM: doing multiple things simultaneously (multiple CPUs)

# Concurrency analogy:
# A chef (single CPU) chops onions for 30 seconds,
# then checks the simmering pot for 5 seconds,
# then goes back to chopping.
# The chef does multiple things "at once" by switching between them.

# Parallelism analogy:
# Two chefs working simultaneously on different dishes.

# Python async = concurrency (not parallelism)
# Python multiprocessing = parallelism

# Timeline comparison:
# Sequential:
# Task A: [====100ms====]
#                        Task B: [====100ms====]
#                                               Task C: [====100ms====]
# Total: 300ms

# Concurrent async (I/O-bound, waiting most of the time):
# Task A: [=10ms=][--waiting--][=5ms=]   total ~15ms active
# Task B: [=10ms=][--waiting--][=5ms=]   overlapping with A
# Task C: [=10ms=][--waiting--][=5ms=]   overlapping with A and B
# Total: ~110ms (limited by longest single wait, not sum of waits)
```
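To see that async is concurrency on a single thread rather than parallelism, here is a small sketch: five coroutines run "at once", yet every one of them reports the same thread name, because the event loop interleaves them on one thread.

```python
import asyncio
import threading

async def which_thread() -> str:
    await asyncio.sleep(0.01)  # yield to the event loop while "waiting"
    return threading.current_thread().name

async def main() -> set:
    # Five coroutines in flight concurrently — but all on the same thread
    return set(await asyncio.gather(*(which_thread() for _ in range(5))))

print(asyncio.run(main()))  # a single thread name — interleaving, not parallelism
```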
The Event Loop — How It Works
```python
import asyncio

# The event loop: a loop that runs coroutines and callbacks
# When a coroutine awaits something, it is suspended and added to a wait queue
# The event loop picks the next ready coroutine and runs it until the next await

# Simplified event loop pseudo-code:
# while True:
#     for coroutine in ready_coroutines:
#         coroutine.send(None)  # run until next await
#         if coroutine.is_done():
#             remove from queue
#     for future in completed_io_operations:
#         wake_up_waiting_coroutines(future)

# Run the event loop:
async def main():
    print("Hello from async")

asyncio.run(main())  # creates event loop, runs main(), closes loop

# In FastAPI, uvicorn manages the event loop — you never call asyncio.run() yourself
# FastAPI just needs you to mark route handlers as async def
```
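The `coroutine.send(None)` step in the pseudo-code is real Python, not just a metaphor: you can drive a coroutine by hand. A small sketch — since this coroutine has no await, a single `send()` runs it to completion, which surfaces as `StopIteration` carrying the return value:

```python
async def greet() -> str:
    return "Hello from async"

coro = greet()
try:
    coro.send(None)  # run until the next await — here there is none
except StopIteration as exc:
    result = exc.value  # the coroutine's return value rides on the exception
print(result)  # → Hello from async
```

This is exactly what the event loop does on your behalf, plus the bookkeeping for I/O readiness.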
I/O-Bound vs CPU-Bound
| Task Type | Examples | Best Solution | Async Benefit |
|---|---|---|---|
| I/O-bound | DB queries, HTTP calls, file reads | async/await | High — frees CPU during waits |
| I/O-bound (sync library) | psycopg2, requests | Thread pool via to_thread() | Medium — threads handle wait |
| CPU-bound (light) | JSON parsing, string ops | sync def in FastAPI | None needed — fast enough |
| CPU-bound (heavy) | Image resize, ML inference | multiprocessing (to_thread() helps only if the C extension releases the GIL) | None — needs true parallelism |
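For the sync-library row, a hedged sketch of offloading a blocking call with `asyncio.to_thread()`; `blocking_query` is a stand-in for something like a psycopg2 call. Each call runs in the default thread pool, so the event loop stays free and the two waits overlap:

```python
import asyncio
import time

def blocking_query(user_id: int) -> dict:
    time.sleep(0.1)  # stand-in for a blocking driver such as psycopg2
    return {"id": user_id}

async def handler() -> list:
    # to_thread() runs the sync function in a worker thread; gather() overlaps them
    return await asyncio.gather(
        asyncio.to_thread(blocking_query, 1),
        asyncio.to_thread(blocking_query, 2),
    )

rows = asyncio.run(handler())
print(rows)  # → [{'id': 1}, {'id': 2}]
```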
Common Mistakes
Mistake 1 — Thinking async means parallel
❌ Wrong — expecting two CPU-bound tasks to run in parallel:
```python
async def cpu_intensive():
    # Pure Python computation — no await, so it never yields to the event loop
    return sum(x ** 2 for x in range(10_000_000))

# Running two of these "concurrently" is no faster than running them sequentially:
# each one blocks the event loop for its full duration
await asyncio.gather(cpu_intensive(), cpu_intensive())  # no benefit
```
✅ Correct — use multiprocessing for CPU-bound work, async for I/O-bound work.
Mistake 2 — Blocking the event loop with synchronous I/O
❌ Wrong — sync database call in async handler:
```python
async def get_post(post_id: int):
    time.sleep(1)  # BLOCKS the entire event loop for 1 second!
                   # All other requests wait too
```
✅ Correct — use async I/O or offload to thread pool:
```python
async def get_post(post_id: int):
    await asyncio.sleep(1)  # yields control to event loop ✓
    # or: await asyncio.to_thread(sync_db_call, post_id)
```
Mistake 3 — Using async for everything including CPU-bound tasks
❌ Wrong — the async keyword on a CPU-bound function adds no value, and here it actively breaks the code:
```python
async def calculate_fibonacci(n):
    # No I/O, no await — and the recursive calls return coroutine objects,
    # so adding them raises TypeError instead of computing anything
    return n if n <= 1 else calculate_fibonacci(n - 1) + calculate_fibonacci(n - 2)
```
✅ Correct — only use async when the function actually awaits I/O:
```python
def calculate_fibonacci(n):  # sync — CPU-bound, no I/O ✓
    return n if n <= 1 else calculate_fibonacci(n - 1) + calculate_fibonacci(n - 2)
```
Quick Reference
| Concept | Key Point |
|---|---|
| Event loop | Single-threaded task scheduler — runs one coroutine at a time |
| Concurrency | Interleaving tasks — one CPU, multiple tasks in progress |
| Parallelism | Simultaneous execution — requires multiple CPU cores |
| Async benefit | I/O-bound tasks — frees CPU during network/disk waits |
| GIL | Limits Python to one thread executing bytecode at a time |
| CPU-bound solution | multiprocessing (to_thread() only if the C code releases the GIL) |
| I/O-bound solution | async def + await |