Expert Python Interview Questions and Answers


This lesson targets senior engineers and architects. Topics include CPython internals, the import system, Python 3.10-3.13 features, performance profiling, Cython, free-threaded (no-GIL) Python, __new__, structural pattern matching, custom iterators, cyclic garbage collection, async context managers, exception groups, and production architecture patterns.

Questions & Answers

01 How does CPython execute Python code internally?

Internals

CPython executes Python through three stages: source parsing into an AST, AST compilation into bytecode, and bytecode interpretation by a stack-based VM.

Parsing — lexer tokenises source; PEG parser (Python 3.9+) builds an Abstract Syntax Tree
Compilation — AST compiled to bytecode; cached in __pycache__/*.pyc
Execution — CPython’s eval loop interprets bytecode instructions one at a time

import dis, ast

def add(a, b): return a + b

dis.dis(add)
# Abridged output from CPython 3.11/3.12 (opcode names vary between versions):
# RESUME      0
# LOAD_FAST   0 (a)
# LOAD_FAST   1 (b)
# BINARY_OP   0 (+)
# RETURN_VALUE

tree = ast.parse("x = 1 + 2")
print(ast.dump(tree, indent=2))

# Python 3.11+ -- specialising adaptive interpreter (PEP 659):
# Hot bytecodes replaced with type-specialised versions (CALL_PY_EXACT_ARGS etc.)
# Python 3.13 -- experimental JIT compiler; it is a build-time option
# (./configure --enable-experimental-jit), not a flag of the python executable
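
The three stages can also be driven by hand from Python. The following is a minimal sketch: the source string and the names `source`, `opnames`, and `namespace` are illustrative, and the opcode names it prints will differ between CPython versions.

```python
import ast
import dis

source = "result = (a + b) * 2"

# Stage 1: parse the source text into an AST
tree = ast.parse(source, mode="exec")

# Stage 2: compile the AST into a code object that holds the bytecode
code = compile(tree, filename="<demo>", mode="exec")
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)                 # exact opcodes depend on the CPython version

# Stage 3: the eval loop executes the bytecode inside a namespace
namespace = {"a": 3, "b": 4}
exec(code, namespace)
print(namespace["result"])     # 14
```

The raw bytecode is available as `code.co_code`, and this code object is what gets marshalled into the `.pyc` cache for imported modules.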

02 What is Python’s import system? How do finders and loaders work?

Import System

import sys

# When Python sees `import mymodule`:
# 1. Check sys.modules (cache) -- return immediately if found
# 2. Iterate sys.meta_path finders -- ask each to find the module
# 3. Winning finder returns a ModuleSpec + Loader
# 4. Loader executes module code, populates sys.modules

print(sys.meta_path)
# [BuiltinImporter, FrozenImporter, PathFinder]

# Custom import audit hook
class AuditFinder:
    def find_spec(self, fullname, path, target=None):
        print(f"Importing: {fullname}")
        return None   # let next finder try

sys.meta_path.insert(0, AuditFinder())

# Custom loader -- import from a database or network
import importlib.abc, importlib.util

class DBLoader(importlib.abc.Loader):
    def __init__(self, source): self.source = source
    def exec_module(self, module):
        exec(compile(self.source, "<db>", "exec"), module.__dict__)

# Import hooks used by: pytest, coverage.py, import guards, lazy loaders
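
To see a finder and a loader cooperate end to end, here is a small sketch that imports a module from an in-memory dictionary instead of the filesystem. The names `SOURCES`, `DictFinder`, `DictLoader`, and the module name `virtual_greeting` are made up for illustration; a database-backed loader would follow the same shape.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical "storage": module name -> source code
SOURCES = {
    "virtual_greeting": "def hello(name):\n    return f'Hello, {name}!'\n",
}

class DictLoader(importlib.abc.Loader):
    def __init__(self, source):
        self.source = source

    def create_module(self, spec):
        return None                      # use the default module object

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace
        code = compile(self.source, f"<dict:{module.__name__}>", "exec")
        exec(code, module.__dict__)

class DictFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        source = SOURCES.get(fullname)
        if source is None:
            return None                  # not ours: let the next finder try
        return importlib.util.spec_from_loader(fullname, DictLoader(source))

sys.meta_path.insert(0, DictFinder())

import virtual_greeting                  # resolved by DictFinder

greeting = virtual_greeting.hello("world")
print(greeting)                          # Hello, world!
```

After the first import the module sits in `sys.modules`, so later imports never reach the finder again.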

03 What are the key new features in Python 3.10, 3.11, 3.12, and 3.13?

New Features

3.10: Structural pattern matching (match/case), X | Y union syntax, better error messages, parenthesised context managers

3.11: 10-60% performance improvement, tomllib, asyncio.TaskGroup, asyncio.timeout(), exception groups and except*

3.12: Type parameter syntax def func[T](...) (PEP 695), inlined comprehensions, sys.monitoring, @override decorator, f-string improvements

3.13: Free-threaded mode (python3.13t, no GIL), experimental JIT, new interactive REPL, copy.replace()

# 3.10 -- structural pattern matching
def handle(event):
    match event:
        case {"type": "click", "x": x, "y": y}:
            return f"Click at ({x}, {y})"
        case {"type": "keypress", "key": key}:
            return f"Key: {key}"
        case {"type": str(t)}:
            return f"Unknown: {t}"
        case _:
            return "Not an event"

# 3.11 -- asyncio.TaskGroup
import asyncio

async def main():
    async with asyncio.TaskGroup() as tg:
        t1 = tg.create_task(might_fail("task1"))   # might_fail: placeholder coroutine
        t2 = tg.create_task(might_fail("task2"))
    # If any task fails, the others are cancelled and the failures are
    # raised together as an ExceptionGroup

04 How do you profile Python code and find performance bottlenecks?

Performance

import timeit, cProfile, pstats, io, tracemalloc

# 1. timeit -- micro-benchmarks
timeit.timeit("'-'.join(str(n) for n in range(100))", number=10000)

# 2. cProfile -- deterministic profiler
pr = cProfile.Profile()
pr.enable()
# ... code to profile ...
pr.disable()
s  = io.StringIO()
ps = pstats.Stats(pr, stream=s).sort_stats(pstats.SortKey.CUMULATIVE)
ps.print_stats(20)
print(s.getvalue())   # the report was written to the StringIO buffer

# Run from command line:
# python -m cProfile -s cumulative myscript.py

# 3. line_profiler -- per-line timing (pip install line_profiler)
# kernprof -l -v myscript.py

# 4. memory_profiler -- per-line memory (pip install memory_profiler)
# python -m memory_profiler myscript.py

# 5. py-spy -- sampling profiler, very low overhead, attaches to a running process (pip install py-spy)
# py-spy top --pid 12345
# py-spy record -o profile.svg --pid 12345   # flame graph

# 6. tracemalloc -- built-in memory tracing
tracemalloc.start()
# ... run code ...
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)
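
As a concrete run of the cProfile recipe, the sketch below profiles a small made-up workload and captures the report as a string. The functions `slow_sum` and `workload` are placeholders for real application code.

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately CPU-heavy so it shows up in the profile
    return sum(i * i for i in range(n))

def workload():
    return [slow_sum(20_000) for _ in range(20)]

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(5)   # top 5 by cumulative time
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time points at the call chain that dominates; switching to `pstats.SortKey.TIME` highlights the functions that are expensive in their own right.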

05 How do you write Python C extensions? When should you use Cython or ctypes?

Performance

Cython — Python-like syntax compiled to C. Add static C type declarations for speed. Often the best productivity/performance balance. Used in NumPy, pandas, and SciPy.
ctypes — call functions in existing .so/.dll without writing C. Good for wrapping C libraries.
cffi — modern, Pythonic alternative to ctypes.
pybind11 — C++ bindings, header-only, modern C++11.

# Cython example -- compute.pyx
# cython: language_level=3
def compute_sum(int n):        # typed parameter
    cdef long total = 0        # C stack variable
    cdef int i
    for i in range(n): total += i
    return total

# setup.py
from setuptools import setup
from Cython.Build import cythonize
setup(ext_modules=cythonize("compute.pyx",
      compiler_directives={"boundscheck": False}))
# python setup.py build_ext --inplace

# ctypes -- call existing C library
import ctypes
lib = ctypes.CDLL("./libmath.so")
lib.add.argtypes = [ctypes.c_int, ctypes.c_int]
lib.add.restype  = ctypes.c_int
result = lib.add(3, 4)   # calls C function directly

# When to write a C extension: hot numerical loops, wrapping C/C++ libraries
# First: rewrite algorithm (O(n^2) to O(n log n)) -- often more impactful
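
The ctypes listing above assumes a `libmath.so` that you built yourself. For a variant that runs without compiling anything, the sketch below calls a C function that every CPython process already contains, through `ctypes.pythonapi`. It illustrates why declaring `restype` matters; it is not a pattern for wrapping third-party libraries.

```python
import ctypes
import sys

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.argtypes = []                  # C signature: const char *Py_GetVersion(void)
get_version.restype = ctypes.c_char_p      # without this, ctypes assumes the result is a C int

version = get_version().decode()
print(version.split()[0])                  # version number of the running interpreter
```

The same two declarations (`argtypes`, `restype`) are what keep calls into any shared library from silently truncating pointers or misreading arguments.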

06 How does Python 3.13 free-threaded mode (PEP 703) work?

GIL / Threading

PEP 703 makes the GIL optional. The free-threaded build (python3.13t) lets threads execute Python bytecode in parallel on multiple cores.

# Check if GIL is enabled
import sys
sys._is_gil_enabled()   # new in 3.13; False in the free-threaded build unless the GIL was re-enabled (e.g. PYTHON_GIL=1)

# Free-threaded -- CPU work truly parallelises
import threading

def compute(n, results, idx):
    results[idx] = sum(i**2 for i in range(n))

results = [0] * 4
threads = [threading.Thread(target=compute, args=(10_000_000, results, i))
           for i in range(4)]
for t in threads: t.start()
for t in threads: t.join()
# GIL build (e.g. Python 3.12): threads take turns -- no speed-up for CPU-bound work
# Python 3.13t (no GIL): up to ~4x faster given 4 free cores -- threads run in parallel

# How it replaces the GIL:
# - Immortalisation: True/False/None/small ints never change refcount
# - Per-object locking: list, dict use fine-grained locks
# - Biased reference counting: local refcount per owning thread

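# Status: experimental in 3.13; officially supported (still optional) from 3.14 (PEP 779)
# Single-threaded overhead vs the GIL build: roughly 40% in 3.13 (specialising
# interpreter disabled), reduced to roughly 5-10% in 3.14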
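
Removing the GIL does not make compound operations atomic, so shared state still needs explicit locking. The sketch below is written to run on any build: `gil_status` is a hypothetical helper that reports what kind of interpreter is running, and the counter shows the locking pattern that stays correct with or without a GIL.

```python
import sys
import sysconfig
import threading

def gil_status():
    """Return (free_threaded_build, gil_enabled_now) for this interpreter."""
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    checker = getattr(sys, "_is_gil_enabled", None)    # only exists on 3.13+
    gil_enabled = checker() if checker is not None else True
    return free_threaded_build, gil_enabled

state = {"count": 0}
lock = threading.Lock()

def bump(times):
    for _ in range(times):
        with lock:                 # += is a read-modify-write, so guard it
            state["count"] += 1

threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(gil_status(), state["count"])
```

Individual operations on built-in containers are protected by the per-object locks, but a sequence such as "read, add one, write back" is several operations and can interleave between threads.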

07 What are Python’s __new__ vs __init__? How do they work together?

OOP Internals

__new__ creates and returns the new instance (it is the actual constructor). __init__ then receives the already-created instance and initialises it.

# Singleton using __new__
class Singleton:
    _instance = None
    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

Singleton() is Singleton()   # True

# Custom __new__ for immutable types (can't use __init__)
class NonZeroInt(int):
    def __new__(cls, value):
        if value == 0: raise ValueError("Zero not allowed")
        return super().__new__(cls, value)

NonZeroInt(5)    # 5
NonZeroInt(0)    # ValueError

# If __new__ returns an object that is not an instance of cls, __init__ is NOT called
class Factory:
    def __new__(cls, make_string=False):
        if make_string:
            return "I'm a string!"   # __init__ NOT called
        return super().__new__(cls)  # __init__ IS called

type(Factory(make_string=True))   # str
type(Factory(make_string=False))  # Factory
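
One consequence of this protocol is easy to miss: whenever `__new__` returns an instance of the class, `__init__` runs again, even if that instance already existed. The sketch below uses two illustrative classes, `Tracked` and `Config`, to show the call order and the resulting singleton pitfall.

```python
calls = []

class Tracked:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        return super().__new__(cls)      # object.__new__ takes only cls here

    def __init__(self, value):
        calls.append("__init__")
        self.value = value

Tracked(1)
print(calls)                             # ['__new__', '__init__']

class Config:
    _instance = None

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        self.name = name                 # runs on EVERY Config(...) call

first = Config("first")
second = Config("second")
print(first is second, first.name)       # True second
```

If re-initialisation is unwanted, guard `__init__` with a flag or move the setup into `__new__` where the instance is first created.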

08 What is structural pattern matching (Python 3.10)?

Python 3.10+

# Sequence patterns combining literals and captures
def classify(point):
    match point:
        case (0, 0):        return "Origin"
        case (x, 0):        return f"X-axis at {x}"
        case (0, y):        return f"Y-axis at {y}"
        case (x, y):        return f"Point at ({x}, {y})"
        case _:             return "Not a point"

# Mapping patterns
def handle_event(event):
    match event:
        case {"action": "buy", "item": item, "qty": int(qty)}:
            return f"Buying {qty} x {item}"
        case {"action": "sell", **rest}:
            return f"Selling: {rest}"

# Sequence patterns with *rest
def parse_cmd(cmd):
    match cmd.split():
        case ["quit"]:                       return "quit"
        case ["go", direction]:              return f"go {direction}"
        case ["get", obj, *rest]:            return f"get {obj}"
        case _:                              return "unknown"

# Class patterns with __match_args__
class Point:
    __match_args__ = ("x", "y")
    def __init__(self, x, y): self.x, self.y = x, y

match Point(3, 4):
    case Point(x, y): print(f"at {x}, {y}")   # positional match

09 How do you implement a custom iterator and the iterable protocol?

Data Model

Iterable: has __iter__ returning an iterator. Iterator: has __iter__ (returns self) and __next__ (raises StopIteration when done).

class Fibonacci:
    """Fibonacci sequence (infinite unless limit is given) -- its own iterator."""
    def __init__(self, limit=None):
        self.a, self.b, self.count, self.limit = 0, 1, 0, limit

    def __iter__(self): return self

    def __next__(self):
        if self.limit is not None and self.count >= self.limit:
            raise StopIteration
        result = self.a
        self.a, self.b = self.b, self.a + self.b
        self.count += 1
        return result

list(Fibonacci(10))   # [0,1,1,2,3,5,8,13,21,34]

# Reusable iterable -- creates a fresh iterator each time
class Range:
    def __init__(self, start, stop, step=1):
        self.start, self.stop, self.step = start, stop, step
    def __iter__(self):
        cur = self.start
        while cur < self.stop:
            yield cur
            cur += self.step

r = Range(0, 10, 2)
list(r)   # [0,2,4,6,8]
list(r)   # [0,2,4,6,8] -- reusable, new iterator each time
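
The practical difference between the two roles is exhaustion. The sketch below separates them into two illustrative classes, `Countdown` (the iterable) and `CountdownIterator` (the iterator), without relying on a generator.

```python
class Countdown:
    """Iterable: every __iter__ call returns a fresh iterator."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)

class CountdownIterator:
    """Iterator: holds the position and is exhausted after one pass."""
    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self                      # iterators return themselves

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

countdown = Countdown(3)
first_pass = list(countdown)             # [3, 2, 1]
second_pass = list(countdown)            # [3, 2, 1] again: new iterator

it = iter(countdown)
once = list(it)                          # [3, 2, 1]
again = list(it)                         # []: the same iterator is exhausted
print(first_pass, second_pass, once, again)
```

This is why a class like Fibonacci above, which returns `self` from `__iter__`, can only be looped over once per instance.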

10 How does Python’s cyclic garbage collector work?

Memory

import gc, weakref

# Reference cycles -- refcounting cannot free these
class Node:
    def __init__(self, data): self.data = data; self.other = None

a = Node("A"); b = Node("B")
a.other = b; b.other = a    # a and b reference each other
del a, b   # refcounts drop to 1 (not 0) -- leak without cyclic GC
gc.collect()               # cyclic GC detects and frees the cycle

# Generational GC -- three generations (young to old)
gc.get_threshold()   # (700, 10, 10) by default on CPython up to 3.12; newer releases may differ
# Gen 0: new objects, collected most often
# Gen 1: survived one Gen 0 collection
# Gen 2: long-lived objects, collected least often

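# WeakValueDictionary -- track objects without preventing GC
import io

class ExpensiveObject:                          # placeholder for illustration
    def __init__(self): self._resource = io.BytesIO()

cache = weakref.WeakValueDictionary()
obj = ExpensiveObject()
cache["key"] = obj    # won't prevent obj from being GC'd
del obj               # last strong reference gone; CPython frees obj immediately
cache.get("key")      # None -- the entry disappeared (cache["key"] raises KeyError)

# weakref.finalize -- cleaner than __del__
def cleanup(resource): resource.close()
obj = ExpensiveObject()
# Pass the resource, not obj itself: a reference to obj would keep it alive
fin = weakref.finalize(obj, cleanup, obj._resource)
del obj               # cleanup(resource) runs once obj is collected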
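
The effect of the cycle collector can be observed directly with a weak reference. In this sketch automatic collection is switched off for the duration of the demo so the result does not depend on when a collection happens to trigger; the `probe` name is illustrative.

```python
import gc
import weakref

class Node:
    def __init__(self, data):
        self.data = data
        self.other = None

gc.disable()                           # keep automatic collections out of the demo
a, b = Node("A"), Node("B")
a.other, b.other = b, a                # reference cycle
probe = weakref.ref(a)                 # observes a without keeping it alive

del a, b
alive_before = probe() is not None     # the cycle still holds both nodes
unreachable = gc.collect()             # the cyclic GC finds and frees them
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after, unreachable)
```

`gc.collect()` returns the number of unreachable objects it found, which makes it a handy check when hunting for leaks caused by cycles.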

11 What are async context managers and async iterators?

Async

import asyncio
from contextlib import asynccontextmanager

# Async context manager -- __aenter__ / __aexit__
class AsyncDB:
    async def __aenter__(self):
        self.conn = await open_connection()
        return self.conn
    async def __aexit__(self, *args):
        await self.conn.close()
        return False

async def use():
    async with AsyncDB() as conn:
        await conn.execute("SELECT 1")

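# @asynccontextmanager -- simpler
@asynccontextmanager
async def db_transaction(conn):       # conn.transaction()/rollback(): hypothetical driver API
    async with conn.transaction():
        try:
            yield conn
        except BaseException:         # includes cancellation; always re-raised
            await conn.rollback()
            raise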

# Async iterator -- __aiter__ / __anext__
class AsyncCounter:
    def __init__(self, limit): self.limit=limit; self.current=0
    def __aiter__(self): return self
    async def __anext__(self):
        if self.current >= self.limit: raise StopAsyncIteration
        await asyncio.sleep(0)
        self.current += 1
        return self.current

# Async generator
async def async_range(n):
    for i in range(n):
        await asyncio.sleep(0)
        yield i

async def use_range():   # distinct name so it does not shadow use() above
    result = [i async for i in async_range(5)]  # async list comprehension
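
The listings above depend on an external database driver, so here is a self-contained sketch that can be run as-is. `fake_connection` and `ticker` are stand-ins invented for the example; the `events` list records the order in which setup, body, and cleanup run.

```python
import asyncio
from contextlib import asynccontextmanager

events = []

@asynccontextmanager
async def fake_connection(name):
    events.append(f"open {name}")        # runs on entry (__aenter__)
    try:
        yield name.upper()
    finally:
        events.append(f"close {name}")   # runs on exit, even after an error

async def ticker(n):
    for i in range(n):
        await asyncio.sleep(0)           # give other tasks a chance to run
        yield i

async def main():
    async with fake_connection("db") as conn:
        values = [i async for i in ticker(3)]
        events.append(f"{conn} got {values}")
    return values

values = asyncio.run(main())
print(values)    # [0, 1, 2]
print(events)    # ['open db', 'DB got [0, 1, 2]', 'close db']
```

The `try`/`finally` around `yield` is what guarantees cleanup when the body of the `async with` block raises or is cancelled.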

12 What are exception groups and except* syntax (Python 3.11+)?

Python 3.11+

Exception groups (PEP 654) let several unrelated exceptions be raised and handled together, which asyncio.TaskGroup relies on to report failures from concurrent tasks.

import asyncio

# except* -- handle specific exceptions from a group
try:
    raise ExceptionGroup("errors", [
        ValueError("bad value"),
        TypeError("wrong type"),
        RuntimeError("runtime problem")
    ])
except* ValueError as eg:
    print(f"Handled {len(eg.exceptions)} ValueError(s)")
except* (TypeError, RuntimeError) as eg:
    print(f"Handled {len(eg.exceptions)} other errors")

# asyncio.TaskGroup -- structured concurrency (Python 3.11+)
async def main():
    try:
        async with asyncio.TaskGroup() as tg:
            t1 = tg.create_task(might_fail("task1"))
            t2 = tg.create_task(might_fail("task2"))
            t3 = tg.create_task(succeed())
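        # When a task fails, tasks that are still running are cancelled; the
        # failures raised by then are collected into one ExceptionGroup.
        # A task that had already finished keeps its result (t3.result())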
    except* ValueError as eg:
        for exc in eg.exceptions:
            print(f"Task failed: {exc}")

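# Advantages over asyncio.gather():
# - Remaining tasks are cancelled on first failure (gather leaves them running)
# - Failures are raised together as one ExceptionGroup; gather raises only the
#   first, or with return_exceptions=True mixes exceptions into the results
# - No task outlives the async with block (structured concurrency)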

13 What is the PEP 695 generic syntax introduced in Python 3.12?

Python 3.12+

PEP 695 introduces a cleaner, built-in syntax for generic functions, classes, and type aliases, replacing the verbose TypeVar boilerplate.

# Python 3.11 and earlier -- verbose
from typing import TypeVar, Generic
T = TypeVar("T")
def first_old(items: list[T]) -> T: return items[0]
class Stack_old(Generic[T]):
    def push(self, item: T) -> None: ...

# Python 3.12 -- new concise syntax (PEP 695)
def first[T](items: list[T]) -> T:      # T is a type parameter
    return items[0]

class Stack[T]:                           # generic class
    def __init__(self): self._items: list[T] = []
    def push(self, item: T): self._items.append(item)
    def pop(self) -> T: return self._items.pop()

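# Constraints: T must be exactly int or float (an upper bound is written T: SomeType)
def add_numbers[T: (int, float)](a: T, b: T) -> T: return a + b

# Type aliases with 'type' statement
from collections.abc import Callable
type Vector = list[float]
type Matrix = list[Vector]
type Callback[T] = Callable[[T], None]   # generic type alias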

14 How do you architect a large Python application for maintainability?

Architecture

my_project/
  src/
    my_project/
      __init__.py
      main.py              # entry point, wiring
      api/                 # HTTP layer -- thin, delegates to services
        v1/
          routes/
          schemas/         # Pydantic request/response models
      core/
        config.py          # pydantic-settings Settings
        dependencies.py    # DI factories
        exceptions.py      # custom exception hierarchy
      domain/              # pure Python -- no framework dependencies
        models/            # dataclasses or domain entities
        services/          # business logic
        interfaces/        # ABCs (ports)
      infrastructure/      # implement ABCs (adapters)
        repositories/
        cache/
        email/
      workers/             # background tasks (Celery/ARQ)
  tests/
    unit/                  # pure functions, no I/O
    integration/
  pyproject.toml

# Key principles:
# 1. Dependency inversion: domain imports ABCs, not infrastructure
# 2. Type hints everywhere: enables mypy strict analysis
# 3. No global mutable state: wire through DI
# 4. Explicit over implicit

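# Composition root -- Database, Cache, EmailService are the ABCs (ports);
# PostgreSQLDB, RedisCache, SendGridEmail are illustrative adapters
from dataclasses import dataclass

@dataclass
class Container:
    db: Database
    cache: Cache
    email: EmailService

def create_container(s: Settings) -> Container:
    return Container(PostgreSQLDB(s.db_url), RedisCache(s.redis_url), SendGridEmail(s.key))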
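
The dependency-inversion principle is easier to judge with code that actually runs. In this sketch every name (`User`, `UserRepository`, `Mailer`, `WelcomeService`, and the in-memory adapters) is invented for illustration: the service depends only on ABCs, so a unit test can wire in fakes without touching a database or mail provider.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class User:                                   # domain model
    id: int
    email: str

class UserRepository(ABC):                    # port
    @abstractmethod
    def get(self, user_id: int) -> Optional[User]: ...

class Mailer(ABC):                            # port
    @abstractmethod
    def send(self, to: str, subject: str) -> None: ...

class WelcomeService:                         # business logic: knows only the ports
    def __init__(self, users: UserRepository, mailer: Mailer):
        self._users = users
        self._mailer = mailer

    def welcome(self, user_id: int) -> bool:
        user = self._users.get(user_id)
        if user is None:
            return False
        self._mailer.send(user.email, "Welcome!")
        return True

class InMemoryUsers(UserRepository):          # adapter used in tests
    def __init__(self, users):
        self._by_id = {u.id: u for u in users}

    def get(self, user_id):
        return self._by_id.get(user_id)

class RecordingMailer(Mailer):                # adapter used in tests
    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))

mailer = RecordingMailer()
service = WelcomeService(InMemoryUsers([User(1, "a@example.com")]), mailer)
print(service.welcome(1), service.welcome(2), mailer.sent)
```

In production the same `WelcomeService` would receive the real repository and mailer from the container, with no change to the domain code.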

15 What is sys.monitoring (Python 3.12) and why is it better than sys.settrace?

Python 3.12+

sys.monitoring (PEP 669) provides low-overhead code instrumentation: a tool pays only for the events it opts into, whereas a sys.settrace function is invoked for every call, line, and return event in every traced frame.

import sys

TOOL_ID = 3   # tool IDs 0-5 exist; 3 and 4 have no predefined role
sys.monitoring.use_tool_id(TOOL_ID, "my_profiler")

call_count = {}
def call_handler(code, offset, callable_, arg0):    # CALL callbacks receive four arguments
    name = f"{code.co_filename}:{code.co_name}"      # code object containing the call site
    call_count[name] = call_count.get(name, 0) + 1
    return sys.monitoring.DISABLE  # stop reporting this call site after its first hit

# Enable only CALL events (much cheaper than tracing every line)
sys.monitoring.register_callback(TOOL_ID, sys.monitoring.events.CALL, call_handler)
sys.monitoring.set_events(TOOL_ID, sys.monitoring.events.CALL)

def run_code():
    for i in range(1000): [x**2 for x in range(10)]

run_code()
sys.monitoring.set_events(TOOL_ID, 0)    # switch events off before releasing the ID
sys.monitoring.free_tool_id(TOOL_ID)
print(call_count)   # distinct call sites per function, because of DISABLE

# Why better than sys.settrace:
# 1. Only pay for events you need (CALL, LINE, PY_RETURN, RAISE, etc.)
# 2. A callback can return sys.monitoring.DISABLE to switch an event off at that
#    code location after the first hit -- ideal for coverage-style tools
# 3. Multiple tools coexist (coverage + profiler simultaneously)
# 4. coverage.py 7.4+ can use it as an opt-in measurement core (COVERAGE_CORE=sysmon)
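
To count how often each Python function is entered, the PY_START event is a closer fit than CALL, because its callback receives the code object of the function being started. The sketch below wraps this in a hypothetical helper, `count_function_entries`, which returns None on interpreters older than 3.12 and picks whichever tool ID is currently free.

```python
import sys

def count_function_entries(func, *args):
    """Count PY_START events (Python function entries) while func runs."""
    mon = getattr(sys, "monitoring", None)
    if mon is None:                       # Python < 3.12: no sys.monitoring
        return None
    # Pick any tool ID (0-5) that no other tool currently holds.
    tool_id = next((i for i in range(6) if mon.get_tool(i) is None), None)
    if tool_id is None:
        return None
    counts = {}

    def on_py_start(code, instruction_offset):    # PY_START callback signature
        counts[code.co_name] = counts.get(code.co_name, 0) + 1

    mon.use_tool_id(tool_id, "entry_counter")
    mon.register_callback(tool_id, mon.events.PY_START, on_py_start)
    mon.set_events(tool_id, mon.events.PY_START)
    try:
        func(*args)
    finally:
        mon.set_events(tool_id, 0)                                   # stop events
        mon.register_callback(tool_id, mon.events.PY_START, None)    # drop callback
        mon.free_tool_id(tool_id)
    return counts

def square(x):
    return x * x

def workload():
    total = 0
    for i in range(5):
        total += square(i)
    return total

counts = count_function_entries(workload)
print(counts)
```

Because the callback does not return `sys.monitoring.DISABLE`, every entry is counted; returning it would turn the tool into a "was this function ever run" check with lower overhead.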

16 What are contextlib utilities beyond @contextmanager?

Standard Library

from contextlib import (suppress, redirect_stdout, ExitStack,
                        AsyncExitStack, nullcontext, closing, chdir)
import io, os, subprocess, urllib.request

# suppress -- silently ignore specific exceptions
with suppress(FileNotFoundError, PermissionError):
    os.remove("might_not_exist.tmp")

# redirect_stdout -- capture print output
out = io.StringIO()
with redirect_stdout(out):
    print("This goes to StringIO")
captured = out.getvalue()

# ExitStack -- manage a dynamic number of context managers
def process_files(filenames):
    with ExitStack() as stack:
        files = [stack.enter_context(open(f)) for f in filenames]
        return [f.read() for f in files]  # all closed on exit

# nullcontext -- placeholder when no CM is needed
def process(data, ctx=None):
    with (ctx if ctx is not None else nullcontext()):
        return transform(data)   # transform: placeholder for the real work

# chdir -- temporary directory change (Python 3.11+)
with chdir("/tmp"):
    subprocess.run(["./script.sh"])
# original directory restored

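# closing -- call .close() on exit for objects that are not context managers themselves
# (urlopen responses already are; shown only as a familiar example, url is a placeholder)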
with closing(urllib.request.urlopen(url)) as resp:
    data = resp.read()
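
ExitStack is the utility that most often replaces hand-written cleanup code, so a complete run is worth seeing. This sketch creates its own temporary files so it can be executed anywhere; the `log` list and file names are illustrative.

```python
import os
import tempfile
from contextlib import ExitStack

log = []

with ExitStack() as stack:
    # Context managers are entered in order and exited in reverse (LIFO).
    tmpdir = stack.enter_context(tempfile.TemporaryDirectory())
    paths = [os.path.join(tmpdir, f"f{i}.txt") for i in range(3)]
    for i, path in enumerate(paths):
        with open(path, "w") as fh:
            fh.write(f"file {i}")

    files = [stack.enter_context(open(path)) for path in paths]
    stack.callback(log.append, "cleanup ran")     # arbitrary callbacks also work
    contents = [f.read() for f in files]

all_closed = all(f.closed for f in files)
print(contents, all_closed, log, os.path.exists(tmpdir))
```

On exit the callback runs first, then the files are closed, and only then is the temporary directory removed, which is the reverse of the order in which they were registered.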

17 What is Python’s operator overloading? When is it appropriate?

Data Model

from functools import total_ordering

@total_ordering   # only need __eq__ + ONE comparison method
class Money:
    def __init__(self, amount: float, currency: str = "GBP"):
        self.amount   = round(amount, 2)
        self.currency = currency

    def __repr__(self): return f"Money({self.amount!r}, {self.currency!r})"
    def __str__(self):  return f"{self.currency} {self.amount:.2f}"

    def __add__(self, other):
        if isinstance(other, Money):
            if self.currency != other.currency:
                raise TypeError(f"Cannot add {self.currency} + {other.currency}")
            return Money(self.amount + other.amount, self.currency)
        return NotImplemented   # let Python try the reverse

    def __radd__(self, other):          # for sum([Money(1), Money(2)])
        if other == 0: return self
        return NotImplemented

    def __mul__(self, scalar):
        if isinstance(scalar, (int, float)): return Money(self.amount*scalar, self.currency)
        return NotImplemented

    __rmul__ = __mul__

    def __eq__(self, other):
        if isinstance(other, Money): return self.amount==other.amount and self.currency==other.currency
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, Money) and self.currency==other.currency: return self.amount<other.amount
        return NotImplemented

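prices = [Money(30), Money(10), Money(20)]
total  = sum(prices)                   # start defaults to 0, so 0 + Money(30) uses __radd__
total  = sum(prices, start=Money(0))   # explicit Money start: only __add__ is used
max(prices)                            # comparisons derived from __lt__ by @total_ordering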

18 What are common Python performance anti-patterns?

Performance

# 1. String concatenation in a loop -- O(n^2) in the worst case
#    (CPython can sometimes extend the string in place, but do not rely on it)
# BAD
result = ""
for word in words: result += word + " "
# GOOD
result = " ".join(words)

# 2. List for membership testing -- O(n)
# BAD
if item in some_list: ...         # O(n)
# GOOD
if item in some_set: ...          # O(1)

# 3. Repeated pattern lookup in a loop
# BAD
for item in items: re.findall(r"\d+", item)   # re caches compiled patterns, but each call still pays a cache lookup
# GOOD
pattern = re.compile(r"\d+")
for item in items: pattern.findall(item)

# 4. Unnecessary list materialisation
total = sum(list(range(1_000_000)))   # BAD: creates list
total = sum(range(1_000_000))         # GOOD: lazy range

# 5. Repeated attribute lookups in tight loops
for i in range(1_000_000): mylist.append(i)   # attribute lookup each iter
_append = mylist.append
for i in range(1_000_000): _append(i)          # skips the lookup; the gain is often small on 3.11+, so measure first

# 6. Python loops for numerical arrays (timings are illustrative and machine-dependent)
arr = [float(i) for i in range(1, 1_000_001)]
total = sum(x**2 for x in arr)   # Python-level loop: on the order of 100 ms
import numpy as np
nparr = np.array(arr)
total = (nparr**2).sum()          # vectorised C loop: on the order of 1 ms
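
Claims like these are best checked on your own machine with timeit. The sketch below measures anti-pattern 2 (list versus set membership); the container size and repeat count are arbitrary choices, and the absolute numbers will vary between machines.

```python
import timeit

data_list = list(range(10_000))
data_set = set(data_list)
needle = 9_999                      # worst case for the list: the last element

# Each call repeats the membership test 200 times and returns total seconds
list_time = timeit.timeit(lambda: needle in data_list, number=200)
set_time = timeit.timeit(lambda: needle in data_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

The list lookup scans up to 10,000 elements per test while the set lookup hashes once, so the gap widens as the container grows.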

19 What is __class_getitem__ and how does it enable generic aliases?

Data Model

__class_getitem__ is called when you subscript a class, e.g. list[int] or dict[str, int]. It is intended to return a generic alias used by the type system.

# Built-in generics (Python 3.9+)
x: list[int] = [1, 2, 3]         # list.__class_getitem__(int)
y: dict[str, int] = {"a": 1}

list[int].__origin__   # list
list[int].__args__     # (int,)

# Runtime-validated generic
class TypedList:
    def __init__(self, item_type):
        self._type = item_type
        self._data = []

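    def __class_getitem__(cls, item_type):   # implicitly a classmethod
        # Returns an INSTANCE, not a type alias. This works, but it is unconventional:
        # the hook is meant for type hints, and type checkers will not understand it
        return cls(item_type)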

    def append(self, item):
        if not isinstance(item, self._type):
            raise TypeError(f"Expected {self._type.__name__}, got {type(item).__name__}")
        self._data.append(item)

int_list = TypedList[int]    # creates TypedList(int)
int_list.append(5)           # OK
int_list.append("hi")        # TypeError
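
For comparison, the conventional use of the hook returns a `types.GenericAlias`, which is what `list[int]` produces. The `Box` class below is illustrative; the one-line `classmethod(GenericAlias)` idiom is the same one the standard library uses for several of its own generic classes.

```python
from types import GenericAlias

class Box:
    """A container that can be written as Box[int] in annotations."""
    def __init__(self, item):
        self.item = item

    # Subscripting the class now yields a generic alias object
    __class_getitem__ = classmethod(GenericAlias)

alias = Box[int]
print(alias.__origin__ is Box)    # True
print(alias.__args__)             # (int,)

box = alias(42)                   # calling the alias constructs a plain Box
print(type(box) is Box)           # True: the type argument is not enforced at runtime
```

In everyday code, inheriting from `typing.Generic` or using the PEP 695 syntax provides this behaviour automatically, so defining the hook by hand is rarely needed.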

20 How do you build and distribute a Python package with pyproject.toml?

Packaging

# pyproject.toml -- modern single-file configuration (PEP 517/518/621)
[build-system]
requires      = ["hatchling"]
build-backend = "hatchling.build"

[project]
name            = "my-library"
version         = "2.1.0"
description     = "A helpful library"
readme          = "README.md"
requires-python = ">=3.10"
authors         = [{name="Alice", email="alice@example.com"}]
dependencies    = ["httpx>=0.27", "pydantic>=2.0"]

[project.optional-dependencies]
dev = ["pytest", "ruff", "mypy"]

# CLI entry point
[project.scripts]
my-cli = "my_package.cli:main"

# Tool config in the same file
[tool.pytest.ini_options]
testpaths = ["tests"]

[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E","F","I","N","UP"]

[tool.mypy]
strict = true

# Build and publish
# pip install build twine
# python -m build                     # creates dist/*.whl and dist/*.tar.gz
# twine upload dist/*                 # upload to PyPI

21 What are Python’s struct and array modules? When do you use them?

Low Level

import struct, array

# struct -- pack/unpack C binary data
# Format: '<' little-endian, '>' big-endian
# 'i'=int32, 'I'=uint32, 'f'=float32, 'd'=float64, 'H'=uint16, 'B'=uint8

# Pack
data = struct.pack("<IHH", 0x12345678, 0xABCD, 0x0001)

# Unpack
value, code, flags = struct.unpack("<IHH", data)

# Reusable struct object (faster for repeated ops)
header = struct.Struct("<IHH")
buf = header.pack(1234, 5678, 1)

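# Parse binary file format: 8-byte header (uint32 type, uint16 length, 2 unused bytes)
with open("data.bin", "rb") as f:
    while True:
        hdr = f.read(8)
        if len(hdr) < 8: break                          # EOF or truncated header
        msg_type, length = struct.unpack(">IH2x", hdr)  # 2x skips the 2 unused bytes
        payload = f.read(length)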

# array -- typed, contiguous arrays (between list and NumPy)
arr = array.array("d", [1.0, 2.0, 3.0])  # 'd' = C double
arr.append(4.0)
arr.tobytes()    # raw bytes
with open("floats.bin", "wb") as f:   # close the file deterministically
    arr.tofile(f)

# When to use:
# struct:   network protocol parsing, binary file formats
# array:    typed homogeneous sequences (more compact in memory than list)
# NumPy:    mathematics, vectorised operations (most common choice)
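
A typical use of struct is length-prefixed message framing. The sketch below round-trips two messages through an in-memory stream; the 6-byte header layout (uint32 type, uint16 length) and the helper names `write_message` and `read_messages` are assumptions made for the example, not a real protocol.

```python
import io
import struct

HEADER = struct.Struct(">IH")      # big-endian: uint32 message type, uint16 payload length

def write_message(stream, msg_type, payload):
    stream.write(HEADER.pack(msg_type, len(payload)) + payload)

def read_messages(stream):
    while True:
        hdr = stream.read(HEADER.size)
        if len(hdr) < HEADER.size:          # clean EOF or truncated header
            break
        msg_type, length = HEADER.unpack(hdr)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated payload")
        yield msg_type, payload

stream = io.BytesIO()
write_message(stream, 1, b"hello")
write_message(stream, 2, b"")
stream.seek(0)

messages = list(read_messages(stream))
print(HEADER.size, messages)       # 6 [(1, b'hello'), (2, b'')]
```

Using one precompiled `struct.Struct` for both packing and unpacking keeps the header layout defined in a single place.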


Tip: Expert Python interviews reward depth over syntax recall. For the GIL, explain why it exists (reference counting atomicity) before discussing free-threaded mode. For metaclasses, describe the class creation lifecycle before showing code. For CPython internals, trace source to bytecode to execution before discussing optimisations. Context and tradeoffs — not just what but why — separate expert answers.