Python’s collections module provides specialised container types that extend the built-in dict, list, and tuple with additional functionality. These are not niche tools — defaultdict, Counter, deque, and namedtuple appear regularly in production FastAPI code for grouping query results, counting occurrences, implementing queues, and representing lightweight data records. Understanding nested data structure patterns — lists of dicts, dicts of lists, and the transformations between them — is also essential for shaping the complex JSON responses that modern APIs return.
defaultdict — Never Get a KeyError on Missing Keys
from collections import defaultdict
# defaultdict(factory) — creates default value on first access
# The factory is called with no arguments to produce the default
# Default value: list (for grouping)
posts_by_tag = defaultdict(list)
posts = [
    {"id": 1, "title": "Intro", "tags": ["python", "fastapi"]},
    {"id": 2, "title": "Models", "tags": ["python", "sqlalchemy"]},
    {"id": 3, "title": "Routing", "tags": ["fastapi"]},
]
for post in posts:
    for tag in post["tags"]:
        posts_by_tag[tag].append(post["id"])
# First time a tag is seen: defaultdict creates [] automatically
# No KeyError, no manual "if tag not in d: d[tag] = []"
print(dict(posts_by_tag))
# {"python": [1, 2], "fastapi": [1, 3], "sqlalchemy": [2]}
# Default value: int (for counting — though Counter is better)
word_count = defaultdict(int) # default is 0
for word in "the cat sat on the mat".split():
    word_count[word] += 1 # first access initialises to 0
print(dict(word_count))
# {"the": 2, "cat": 1, "sat": 1, "on": 1, "mat": 1}
# Default value: set (for unique grouping)
user_roles = defaultdict(set)
user_roles["alice"].add("editor")
user_roles["alice"].add("admin")
user_roles["bob"].add("user")
print(dict(user_roles))
# {"alice": {"editor", "admin"}, "bob": {"user"}}
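As a further illustration of the grouping pattern, the sketch below groups a flat list of comment rows by their post ID. The row shape and the field names (`post_id`, `body`) are assumptions made for illustration, not taken from a real schema.

```python
from collections import defaultdict

# Hypothetical comment rows, shaped like a flat database query result.
comment_rows = [
    {"id": 101, "post_id": 1, "body": "Great intro"},
    {"id": 102, "post_id": 2, "body": "Very clear"},
    {"id": 103, "post_id": 1, "body": "Thanks"},
]


def group_comments_by_post(rows):
    """Return a plain dict mapping post_id to the list of its comment rows."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["post_id"]].append(row)
    # Convert to a regular dict so later lookups of unknown IDs raise KeyError
    # instead of silently creating empty entries.
    return dict(grouped)


comments_by_post = group_comments_by_post(comment_rows)
print([c["id"] for c in comments_by_post[1]])  # [101, 103]
print(comments_by_post.get(99, []))            # [] and no key 99 is created
```

Returning a plain dict is a design choice here: the grouping step benefits from the automatic defaults, while callers get ordinary dict behaviour.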
defaultdict is a subclass of dict, so all the usual methods work (get, update, items, and so on). The only difference is what happens when you index a key that does not exist: a regular dict raises KeyError, while a defaultdict calls its factory function, stores the result under that key, and returns it. You can convert a defaultdict back to a regular dict with dict(my_defaultdict) for clean serialisation.

defaultdict eliminates the “check then insert” pattern that clutters grouping code. Instead of testing if key not in d, assigning d[key] = [], and only then calling d[key].append(item), you write just d[key].append(item). This is particularly clean in FastAPI when you need to group database results by a foreign key, for example grouping comments by their post ID from a flat list of comment rows.

defaultdict creates an entry for any key you index, even if you only meant to read it. A membership test such as key in my_defaultdict is safe, but my_defaultdict[nonexistent_key] inserts that key with the default value. Use my_defaultdict.get(key) for read-only access that does not create entries, or check with in first.

Counter — Count Occurrences
from collections import Counter
# Count items in an iterable
tags = ["python", "fastapi", "python", "react", "fastapi", "python"]
tag_counts = Counter(tags)
print(tag_counts)
# Counter({"python": 3, "fastapi": 2, "react": 1})
# Most common
tag_counts.most_common(2) # [("python", 3), ("fastapi", 2)]
tag_counts.most_common(1) # [("python", 3)] — top 1
# Access like a dict — returns 0 for missing (not KeyError)
tag_counts["python"] # 3
tag_counts["unknown"] # 0 — no KeyError
# Arithmetic on counters
a = Counter({"a": 3, "b": 2, "c": 1})
b = Counter({"a": 1, "b": 4, "d": 2})
print(a + b) # Counter({"b": 6, "a": 4, "d": 2, "c": 1}), counts added
print(a - b) # Counter({"a": 2, "c": 1}), subtract and drop results <= 0
print(a & b) # Counter({"b": 2, "a": 1}), min of each count
print(a | b) # Counter({"b": 4, "a": 3, "d": 2, "c": 1}), max of each count
# FastAPI use: trending tags over last 7 days
def get_trending_tags(recent_posts: list, top_n: int = 10) -> list:
    all_tags = [tag for post in recent_posts for tag in post["tags"]]
    return [tag for tag, _ in Counter(all_tags).most_common(top_n)]
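A small usage sketch for the trending-tags idea: when posts arrive in batches, `Counter.update` adds to existing counts rather than replacing them. The batch data here is made up for illustration.

```python
from collections import Counter

# Hypothetical batches of posts, e.g. one batch per day.
batches = [
    [{"tags": ["python", "fastapi"]}, {"tags": ["python"]}],
    [{"tags": ["react", "python"]}, {"tags": ["fastapi"]}],
]

tag_counts = Counter()
for batch in batches:
    for post in batch:
        # Counter.update() ADDS counts from the iterable;
        # dict.update() would overwrite existing values instead.
        tag_counts.update(post["tags"])

top_two = [tag for tag, _ in tag_counts.most_common(2)]
print(tag_counts["python"])  # 3
print(top_two)               # ['python', 'fastapi']
```

Counting incrementally like this avoids building one large list of all tags before counting.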
deque — Efficient Double-Ended Queue
from collections import deque
# deque supports O(1) append/pop from BOTH ends
# Lists: O(1) append/pop at end, O(n) at front
queue = deque()
# Add to right (standard queue behaviour)
queue.append("task_1")
queue.append("task_2")
queue.append("task_3")
# Remove from left (FIFO — first in, first out)
first = queue.popleft() # "task_1"
# Add to left
queue.appendleft("priority_task")
# deque as a fixed-size sliding window (maxlen)
recent = deque(maxlen=5) # automatically drops oldest when full
for i in range(10):
    recent.append(i)
print(list(recent)) # [5, 6, 7, 8, 9] — keeps last 5
# FastAPI use: rate limiting (last N request timestamps)
import time
# Note: this single deque is shared by all clients, and maxlen must be
# larger than max_per_minute or the limit can never be exceeded
request_times = deque(maxlen=100) # keep last 100 requests
def is_rate_limited(max_per_minute: int = 60) -> bool:
    now = time.time()
    request_times.append(now)
    one_minute_ago = now - 60
    recent_count = sum(1 for t in request_times if t > one_minute_ago)
    return recent_count > max_per_minute
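The limiter above keeps one global deque, so it counts requests from every client together and also records requests it rejects. A per-client sliding window might look like the sketch below. The class name `SlidingWindowLimiter`, the `client_id` key, and the injectable `clock` parameter are all hypothetical names chosen for illustration; this is an in-memory, single-process sketch, not a production limiter.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # client_id -> timestamps, oldest first

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have left the window; each popleft is O(1).
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; rejected requests are not recorded
        hits.append(now)
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60.0,
                               clock=lambda: fake_now[0])
results = [limiter.allow("alice") for _ in range(3)]  # third call is rejected
fake_now[0] = 61.0                                    # the window has passed
after_wait = limiter.allow("alice")
print(results, after_wait)  # [True, True, False] True
```

Because timestamps are appended in order, expired entries are always at the left end, which is where a deque removes items cheaply. `time.monotonic` is used as the default clock so that system clock adjustments do not distort the window.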
Nested Data Structure Patterns
# ── List of dicts — the standard DB query result shape ────────────────────────
posts = [
    {"id": 1, "title": "A", "author_id": 10, "tags": ["python"]},
    {"id": 2, "title": "B", "author_id": 20, "tags": ["fastapi", "python"]},
    {"id": 3, "title": "C", "author_id": 10, "tags": ["react"]},
]
# Group by author_id (needs: from collections import defaultdict)
by_author = defaultdict(list)
for post in posts:
    by_author[post["author_id"]].append(post)
# Filter and transform
published_titles = [p["title"] for p in posts if p.get("published", True)]
# Build an ID → post lookup dict
post_by_id = {p["id"]: p for p in posts}
print(post_by_id[2]["title"]) # "B"
# ── Dict of lists — configuration and grouping ─────────────────────────────────
ROLE_PERMISSIONS = {
    "admin": ["read", "write", "delete", "manage_users"],
    "editor": ["read", "write"],
    "user": ["read"],
}
def get_permissions(role: str) -> list:
    return ROLE_PERMISSIONS.get(role, [])
# ── Merge nested structures ────────────────────────────────────────────────────
base_config = {"db": {"host": "localhost", "port": 5432}}
env_config = {"db": {"host": "prod-db.example.com"}}
# Naive merge loses nested keys:
wrong = {**base_config, **env_config}
# {"db": {"host": "prod-db.example.com"}} — port is lost!
# Deep merge (recursive, so it handles any nesting depth):
def merge_config(base: dict, override: dict) -> dict:
    result = base.copy() # shallow copy: nested dicts not overridden are still shared with base
    for key, value in override.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = merge_config(result[key], value)
        else:
            result[key] = value
    return result
merged = merge_config(base_config, env_config)
# {"db": {"host": "prod-db.example.com", "port": 5432}} ✓
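The introduction mentions transformations between lists of dicts and dicts of lists. A minimal sketch of both directions follows, assuming every row has the same keys; the helper names `rows_to_columns` and `columns_to_rows` are hypothetical.

```python
from collections import defaultdict


def rows_to_columns(rows):
    """List of dicts -> dict of lists (one list per field)."""
    columns = defaultdict(list)
    for row in rows:
        for key, value in row.items():
            columns[key].append(value)
    return dict(columns)


def columns_to_rows(columns):
    """Dict of lists -> list of dicts; assumes all lists have equal length."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


rows = [
    {"id": 1, "title": "A"},
    {"id": 2, "title": "B"},
]
columns = rows_to_columns(rows)
print(columns)                           # {'id': [1, 2], 'title': ['A', 'B']}
print(columns_to_rows(columns) == rows)  # True
```

If rows can have different keys, the column lists end up with different lengths and `zip` silently truncates to the shortest, so that case would need explicit handling.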
Common Mistakes
Mistake 1 — Accessing defaultdict for reads creates entries
❌ Wrong — checking a value creates an empty entry:
d = defaultdict(list)
print(d["missing"]) # [] — but now "missing" key EXISTS in d!
print("missing" in d) # True — unexpected!
✅ Correct — use .get() for reads that should not create entries:
d = defaultdict(list) # fresh dict, nothing indexed yet
print(d.get("missing")) # None; key NOT created ✓
print("missing" in d) # False ✓
Mistake 2 — Using list.insert(0, x) for a queue instead of deque
❌ Wrong — O(n) insertion at the front of a list:
queue = []
queue.insert(0, item) # O(n) — shifts all elements
✅ Correct — use deque for O(1) both-end operations:
queue = deque()
queue.appendleft(item) # O(1) ✓
Mistake 3 — Shallow merge losing nested dict keys
❌ Wrong — second dict overwrites entire nested dict:
config = {**base, **override} # nested dicts are replaced, not merged
✅ Correct — use a recursive merge function for nested dicts (shown above).
Quick Reference
| Tool | Import | Best For |
|---|---|---|
| defaultdict(list) | from collections import defaultdict | Grouping items by key |
| defaultdict(int) | same | Counting (simple) |
| Counter(iterable) | from collections import Counter | Frequency counting, most common |
| deque(maxlen=N) | from collections import deque | Sliding window, queues |
| namedtuple | from collections import namedtuple | Readable lightweight records |
| {item["id"]: item for item in items} | built-in | ID → object lookup dict |
| defaultdict(list) + loop | from collections import defaultdict | Group-by query result |
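The table lists namedtuple, which the sections above do not demonstrate. As a brief sketch, a namedtuple gives tuple rows readable field names; the `PostSummary` type and its fields are hypothetical, and the printed dict form assumes Python 3.8 or later.

```python
from collections import namedtuple

# Hypothetical lightweight record for a post listing.
PostSummary = namedtuple("PostSummary", ["id", "title", "author_id"])

row = (1, "Intro", 10)           # e.g. a raw tuple from a database cursor
post = PostSummary._make(row)    # build from any iterable

print(post.title)                # Intro
print(post._asdict())            # {'id': 1, 'title': 'Intro', 'author_id': 10}

# namedtuples are immutable; _replace returns a modified copy.
renamed = post._replace(title="Introduction")
print(renamed.title, post.title) # Introduction Intro
```

Since a namedtuple is still a tuple, it unpacks and indexes like one, and `_asdict()` gives a dict that can be serialised to JSON.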