Python / Core Python Fundamentals Interview Questions
Python is a high-level, interpreted, general-purpose programming language created by Guido van Rossum and first released in 1991. Its defining feature is readability: the syntax is clean and close to plain English, which dramatically lowers the learning curve compared with languages like C++ or Java.
What makes it genuinely popular rather than just beginner-friendly is the breadth of its ecosystem. The same language is used to write a two-line script that renames files and to train large neural networks. CPython (the reference implementation) runs on every major OS, and the standard library ships batteries-included — file I/O, networking, JSON, datetime, and much more without installing anything extra.
Python is dynamically typed and uses automatic memory management through reference counting and garbage collection, so developers spend less time managing types and memory and more time solving problems. The Global Interpreter Lock (GIL) in CPython prevents threads from executing Python bytecode in parallel, which limits CPU-bound multi-threading but has little practical impact on I/O-bound work, which covers most web services and data pipelines.
In interviews the three things worth emphasising are: interpreted execution (no separate compile step; CPython compiles source to bytecode behind the scenes), dynamic typing, and the massive package ecosystem (PyPI hosts over 500,000 packages). Each of those shapes everyday development decisions.
In Python a variable is simply a name that points to an object in memory. You do not declare the type — you just assign a value and Python figures out the type at runtime. That is what dynamic typing means: the type is attached to the object, not to the name.
x = 10 # x points to an int object
x = 'hello' # now x points to a str object — perfectly legal
x = [1, 2, 3] # now x points to a list
print(type(x)) # <class 'list'>
Each assignment rebinds the name to a new object; the old object is garbage-collected when nothing else references it. This is why Python variables behave more like labels than typed containers.
Dynamic typing provides flexibility but can hide bugs that a static-type compiler would catch at build time. Python 3.5+ addresses this with optional type hints (PEP 484) that tools like mypy can check without changing runtime behaviour.
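As a rough illustration of optional type hints, the sketch below annotates a small function. The name double and its arguments are made up for this example. The annotations are stored on the function but not enforced at runtime, so a checker such as mypy would have to be run separately to flag the second call.

```python
def double(value: int) -> int:
    """Return twice the value; the hints describe intent only."""
    return value * 2

print(double(21))              # 42
print(double('ab'))            # abab: the hint is not enforced at runtime
print(double.__annotations__)  # the hints are kept as plain metadata
```

Running the file unchanged shows why hints are described as optional: behaviour is identical with or without them.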
Naming conventions: use lowercase with underscores (snake_case) for variables and functions, ALL_CAPS for module-level constants. Python is case-sensitive, so count and Count are two different names.
Python uses if, elif, and else to branch execution. Unlike many languages, Python relies on indentation (four spaces by convention) rather than braces to delimit blocks — mixing tabs and spaces causes a TabError.
score = 72
if score >= 90:
    grade = 'A'
elif score >= 80:
    grade = 'B'
elif score >= 70:
    grade = 'C'
else:
    grade = 'F'
print(grade) # C
A condition can be any expression that evaluates to a truthy or falsy value. Python considers 0, '', [], {}, None, and False as falsy; everything else is truthy. This means you can write if my_list: instead of if len(my_list) > 0:.
Python also supports a single-line ternary expression: result = 'pass' if score >= 70 else 'fail'. It reads left-to-right: value if condition else alternative. Overusing ternaries in complex conditions hurts readability, so reserve them for simple cases.
There is no switch statement prior to Python 3.10. From 3.10 onward, the match/case structural pattern matching statement fills that role and goes far beyond a simple value switch.
Python's for loop iterates over any iterable — lists, strings, tuples, dictionaries, files, generators, and more. Unlike C-style for loops with an index counter, Python's loop just hands you each item in turn.
fruits = ['apple', 'banana', 'cherry']
for fruit in fruits:
    print(fruit)
# Iterating a string character by character
for ch in 'hello':
    print(ch)
range() generates a lazy sequence of integers and is the standard way to loop a fixed number of times. It takes up to three arguments: range(stop), range(start, stop), or range(start, stop, step). It never stores the full list in memory; each integer is computed on demand, making it memory-efficient even for range(10_000_000).
for i in range(5): # 0 1 2 3 4
    print(i)
for i in range(2, 10, 2): # 2 4 6 8
    print(i)
When you need both the index and the value, use enumerate() instead of manually tracking a counter:
for idx, fruit in enumerate(fruits, start=1):
    print(idx, fruit) # 1 apple, 2 banana, 3 cherry
break exits the loop early; continue skips to the next iteration. A for loop can also have an else clause that runs only if the loop completed without hitting break, which is useful for search patterns.
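The for/else search pattern can be sketched as follows. The function name first_negative and the sample lists are hypothetical; the point is only that the else branch runs when the loop finishes without break.

```python
def first_negative(numbers):
    """Describe the first negative number, or report that none exists."""
    for n in numbers:
        if n < 0:
            result = f'found {n}'
            break              # leaving via break skips the else clause
    else:
        # Reached only when the loop ran to completion without break
        result = 'no negatives'
    return result

print(first_negative([3, -1, 4]))   # found -1
print(first_negative([3, 1, 4]))    # no negatives
```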
Use a while loop when the number of iterations is not known upfront and the loop should continue as long as a condition remains true. A for loop is for iterating over a known sequence; a while loop is for repeating until something changes.
attempts = 0
max_attempts = 3
while attempts < max_attempts:
    password = input('Enter password: ')
    if password == 'secret':
        print('Access granted')
        break
    attempts += 1
    print(f'{max_attempts - attempts} attempt(s) remaining')
else:
    print('Account locked')
The else clause on a while loop runs only if the condition naturally became False; it does not run if the loop exited via break. This is a clean way to distinguish between 'found it and broke out' versus 'exhausted all attempts'.
Common patterns where while shines: polling a queue until it is empty, reading chunks from a socket until EOF, implementing a game loop that runs until the player quits, or processing a linked list node by node without knowing its length in advance.
The most important thing to guard against is an infinite loop. Always ensure the loop variable is modified inside the loop body or use a break as an exit. A while True: loop is fine if it has a clear break condition; without one, the program hangs.
A list is Python's built-in ordered, mutable sequence. It can hold items of any type, including other lists, and grows or shrinks dynamically. In CPython, lists are backed by a dynamic array of object references that is over-allocated as it grows, so appending is O(1) amortised.
# Creation
items = [10, 'hello', 3.14, True]
# Indexing (zero-based, negative counts from end)
print(items[0]) # 10
print(items[-1]) # True
# Slicing [start:stop:step]
print(items[1:3]) # ['hello', 3.14]
# Mutating
items.append('new') # add to end
items.insert(1, 99) # insert at index 1
items.remove('hello') # remove first occurrence
popped = items.pop() # remove and return last element
items.sort() # in-place sort (only works if items are comparable)
items.reverse() # in-place reverse
# Length and membership
print(len(items)) # number of elements
print('new' in items) # False here, because pop() already removed 'new'
List comprehension is the idiomatic way to build a new list from an existing iterable:
squares = [x**2 for x in range(1, 6)] # [1, 4, 9, 16, 25]
evens = [x for x in range(10) if x % 2 == 0] # [0, 2, 4, 6, 8]
Assignment never copies a list: assigning a list to a second variable gives you a second name for the same object, and passing it to a function does the same. Use list.copy() or list[:] for a shallow copy, or copy.deepcopy() when the list contains nested mutable objects.
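To make the difference between an alias, a shallow copy, and a deep copy concrete, here is a small sketch. The variable names are illustrative only.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                  # same object, no copy at all
shallow = original.copy()         # new outer list, shared inner lists
deep = copy.deepcopy(original)    # fully independent structure

original[0].append(99)            # mutate a nested list
original.append([5, 6])           # mutate the outer list

print(alias)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The shallow copy misses the new outer element but still sees the change to the nested list, which is the case copy.deepcopy() exists for.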
A tuple is an ordered, immutable sequence. Once created it cannot be changed — no appending, inserting, or item reassignment. The syntax uses parentheses (optional in many contexts) or just a comma: point = 3, 4 is a tuple.
coords = (40.7128, -74.0060) # latitude, longitude
x, y = coords # tuple unpacking
print(x) # 40.7128
# Single-element tuple needs a trailing comma
single = (42,) # tuple
not_a_tuple = (42) # just the int 42
# Tuples support indexing and slicing like lists
print(coords[0]) # 40.7128
print(coords[-1]) # -74.006
When should you choose a tuple over a list? Several rules of thumb:
- Immutability intent: if the data should not change after creation — RGB colour, (lat, lon), a database record — a tuple signals that clearly.
- Dictionary keys: only hashable (immutable) objects can be dict keys. A tuple of ints or strings is hashable; a list is not.
- Slight performance edge: tuples use less memory and are slightly faster to create than lists because Python can optimise their storage.
- Named tuples: collections.namedtuple adds field names to a tuple for readability without the overhead of a full class: Point = namedtuple('Point', ['x', 'y']).
A common interview trap: tuples are immutable, but a tuple can contain a mutable object like a list. The tuple itself cannot be changed, but the list inside it can.
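A short sketch of that trap, using a made-up record tuple: the list stored inside can be mutated, the tuple slot itself cannot be rebound, and as a side effect such a tuple is no longer hashable.

```python
record = ('alice', [90, 85])

record[1].append(77)      # allowed: the list inside the tuple is mutable
print(record)             # ('alice', [90, 85, 77])

slot_rebound = True
try:
    record[1] = [0]       # not allowed: tuple slots cannot be reassigned
except TypeError:
    slot_rebound = False
print(slot_rebound)       # False

hashable = True
try:
    hash(record)          # a tuple holding a list cannot be hashed
except TypeError:
    hashable = False
print(hashable)           # False
```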
A dictionary is Python's hash map: a collection of key-value pairs that preserves insertion order (guaranteed since Python 3.7). Keys must be hashable (strings, numbers, tuples of hashable items); values can be anything. Lookup, insertion, and deletion are O(1) average-case.
user = {
    'name': 'Alice',
    'age': 30,
    'active': True
}
# Access — raises KeyError if key missing
print(user['name']) # Alice
# Safe access — returns default if key missing
print(user.get('email', 'N/A')) # N/A
# Add / update
user['email'] = 'alice@example.com'
user.update({'age': 31, 'city': 'NYC'})
# Delete
del user['active']
city = user.pop('city', None) # removes and returns; default avoids KeyError
# Iterating
for key, value in user.items():
    print(f'{key}: {value}')
# Keys and values as views
print(list(user.keys())) # ['name', 'age', 'email']
print(list(user.values())) # ['Alice', 31, 'alice@example.com']
Dict comprehension builds dictionaries from iterables in one line:
squares = {x: x**2 for x in range(1, 6)} # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
Membership tests check keys only: 'name' in user is O(1). To check values you must iterate, which is O(n). For counting occurrences, collections.Counter is a dict subclass that returns zero for missing keys, making frequency analysis much cleaner.
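A minimal sketch of frequency counting with collections.Counter, using an invented word list:

```python
from collections import Counter

words = ['api', 'python', 'api', 'data', 'python', 'api']
counts = Counter(words)

print(counts['api'])           # 3
print(counts['missing'])       # 0, and no KeyError is raised
print(counts.most_common(2))   # [('api', 3), ('python', 2)]
```

Looking up a missing key returns 0 without inserting it, so 'missing' in counts is still False afterwards.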
A set is an unordered collection of unique, hashable objects. Internally it is a hash table, giving O(1) average-case lookup — far faster than scanning a list for large collections. Duplicates are silently dropped on creation.
tags = {'python', 'data', 'python', 'api'} # {'python', 'data', 'api'}
# Membership
print('python' in tags) # True — O(1)
# Add / remove
tags.add('ml')
tags.discard('api') # no error if missing (unlike .remove())
# Set operations
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(a | b) # union {1, 2, 3, 4, 5, 6}
print(a & b) # intersection {3, 4}
print(a - b) # difference {1, 2}
print(a ^ b) # symmetric diff {1, 2, 5, 6}
The practical win is deduplication. Converting a list to a set and back is a fast, idiomatic way to remove duplicates when order does not matter: unique = list(set(my_list)). For order-preserving deduplication use list(dict.fromkeys(my_list)) (dicts maintain insertion order since 3.7).
frozenset is the immutable variant — hashable and usable as a dictionary key or element of another set.
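The sketch below shows both uses under made-up data: a frozenset as a dictionary key (here, a hypothetical fare table where direction does not matter) and as an element of another set.

```python
# Hypothetical fare table: the key is an unordered pair of cities
fares = {
    frozenset({'Paris', 'Rome'}): 120,
    frozenset({'Rome', 'Oslo'}): 210,
}
print(fares[frozenset({'Rome', 'Paris'})])   # 120: element order is irrelevant

groups = {frozenset({1, 2}), frozenset({2, 1}), frozenset({3})}
print(len(groups))                           # 2: equal frozensets collapse

plain_set_allowed = True
try:
    {{1, 2}}                                 # a plain set is unhashable
except TypeError:
    plain_set_allowed = False
print(plain_set_allowed)                     # False
```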
Functions are defined with the def keyword, a name, parentheses for parameters, a colon, and an indented body. They are first-class objects — you can assign them to variables, pass them as arguments, and return them from other functions.
def greet(name, greeting='Hello'):
    """Return a personalised greeting string."""
    return f'{greeting}, {name}!'
print(greet('Alice')) # Hello, Alice!
print(greet('Bob', 'Hi')) # Hi, Bob!
print(greet(greeting='Hey', name='Carol')) # Hey, Carol! (keyword args)
The string literal immediately after the def line is the docstring, accessible via help() or function.__doc__. Always write docstrings for anything that will be reused.
Parameter types to know for interviews:
- Positional: matched left to right.
- Default: greeting='Hello'. Parameters with defaults must come after those without.
- *args: captures any number of extra positional arguments as a tuple.
- **kwargs: captures any number of keyword arguments as a dictionary.
- Keyword-only: parameters after a bare * (or after *args) must be passed by name.
def summary(*args, separator=', '):
    return separator.join(str(a) for a in args)
print(summary(1, 2, 3)) # 1, 2, 3
print(summary(1, 2, 3, separator='-')) # 1-2-3
A function without an explicit return statement returns None. Returning multiple values looks like separate values but Python actually returns a tuple: return x, y is the same as return (x, y).
Scope determines where in the code a variable name is visible and accessible. Python resolves names using the LEGB rule, checking four scopes in order: Local → Enclosing → Global → Built-in.
x = 'global'
def outer():
    x = 'enclosing'
    def inner():
        x = 'local'
        print(x) # local
    inner()
    print(x) # enclosing
outer()
print(x) # global
L – Local: names assigned inside the current function.
E – Enclosing: names in any enclosing (outer) function's scope — relevant for nested functions and closures.
G – Global: names assigned at the module's top level.
B – Built-in: names built into Python itself — len, print, range, etc.
To assign to a global variable from inside a function, declare it with global name. To assign to an enclosing-scope variable, use nonlocal name (Python 3+). Without these declarations, Python creates a new local variable instead of modifying the outer one, which is a very common source of bugs in interviews.
count = 0
def increment():
    global count
    count += 1
increment()
print(count) # 1
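The enclosing-scope counterpart of the global example can be sketched with a closure. The names make_counter and increment are illustrative; without the nonlocal declaration, count += 1 would raise UnboundLocalError because count would be treated as a new local.

```python
def make_counter():
    count = 0                 # lives in the enclosing scope

    def increment():
        nonlocal count        # rebind the enclosing variable, not a new local
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())   # 1
print(counter())   # 2

other = make_counter()
print(other())     # 1: each closure keeps its own count
```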
Python uses a try/except block to catch and handle exceptions rather than crashing the program. Code that might raise an error goes inside try; the handler goes inside except.
try:
    result = int(input('Enter a number: '))
    print(100 / result)
except ValueError:
    print('Not a valid integer.')
except ZeroDivisionError:
    print('Cannot divide by zero.')
except Exception as e:
    print(f'Unexpected error: {e}')
else:
    print('Calculation succeeded.') # runs only if no exception
finally:
    print('Always runs, good for cleanup.') # always runs
Key rules: always catch the most specific exception first, broad ones last. Catching bare Exception is acceptable as a last resort, but catching BaseException is usually wrong because it swallows KeyboardInterrupt and SystemExit.
The else clause runs only when no exception was raised — a clean place to put code that should only execute on success. The finally clause always runs regardless of whether an exception occurred, making it the right place for cleanup (closing a file, releasing a lock).
You can raise your own exceptions with raise ValueError('message') and create custom exception classes by subclassing Exception. Re-raise a caught exception inside an except block with bare raise to preserve the original traceback.
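A minimal sketch of a custom exception and a bare re-raise follows. InsufficientFundsError, withdraw, and guarded_withdraw are hypothetical names chosen for illustration.

```python
class InsufficientFundsError(Exception):
    """Hypothetical domain error: a withdrawal exceeds the balance."""

def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFundsError(f'need {amount}, have {balance}')
    return balance - amount

def guarded_withdraw(balance, amount, log):
    try:
        return withdraw(balance, amount)
    except InsufficientFundsError:
        log.append('withdrawal rejected')
        raise                # bare raise keeps the original traceback

log = []
try:
    guarded_withdraw(100, 250, log)
except InsufficientFundsError as exc:
    message = str(exc)

print(message)   # need 250, have 100
print(log)       # ['withdrawal rejected']
```

The intermediate handler records the failure and then lets the same exception continue upward, so the caller still sees where it was originally raised.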
Python has three main approaches to string formatting, each with different trade-offs.
name, score = 'Alice', 95.5
# 1. % formatting (old-style, C printf-inspired)
print('Name: %s, Score: %.1f' % (name, score))
# 2. str.format() (Python 2.6+ / 3)
print('Name: {}, Score: {:.1f}'.format(name, score))
print('Name: {n}, Score: {s:.1f}'.format(n=name, s=score)) # named
# 3. f-strings (Python 3.6+ — preferred)
print(f'Name: {name}, Score: {score:.1f}')
print(f'Score rounded: {round(score)}') # expressions inside {}
F-strings are the modern standard and are recommended for all new code. They are faster than str.format(), more readable, and evaluate expressions inline. The colon inside the braces introduces format specifiers: {value:.2f} formats a float to two decimal places; {value:>10} right-aligns in a 10-character field; {value:,} adds thousands separators.
Python 3.12 extended f-strings to allow reusing the same quote character inside braces, removing a previous restriction. For very long template strings (email bodies, SQL queries) that are composed at runtime, str.format_map() or string.Template from the string module may be cleaner than a giant f-string.
List comprehension is a concise, readable way to build a new list by describing what each element should be, rather than imperatively appending in a loop. It usually runs faster than an equivalent for loop with append because CPython compiles the comprehension to a dedicated LIST_APPEND bytecode instead of looking up and calling list.append on every iteration.
# Regular loop approach
squares = []
for x in range(1, 6):
    squares.append(x ** 2)
# Equivalent list comprehension
squares = [x ** 2 for x in range(1, 6)] # [1, 4, 9, 16, 25]
# With a filter condition
even_squares = [x ** 2 for x in range(1, 11) if x % 2 == 0]
# [4, 16, 36, 64, 100]
# Nested comprehension (matrix flattening)
matrix = [[1, 2], [3, 4], [5, 6]]
flat = [num for row in matrix for num in row]
# [1, 2, 3, 4, 5, 6]
The pattern is always [expression for variable in iterable if condition]. The if clause is optional. For multiple nested loops, earlier for clauses are the outer loops, in the same order as you would write them imperatively.
When you only need to iterate over the results once and do not need an actual list, use a generator expression ((x**2 for x in range(n))) to avoid building the full list in memory. For dictionaries use dict comprehension {k: v for ...}, for sets use set comprehension {expr for ...}.
Avoid cramming complex logic into a comprehension — if you need more than one condition or a nested if/else, a regular loop is often cleaner and easier to debug.
*args and **kwargs are conventions (the names are arbitrary; the stars are what matter) for writing functions that accept a variable number of arguments.
def log(level, *messages, separator='|', **meta):
    joined = separator.join(messages)
    extra = ', '.join(f'{k}={v}' for k, v in meta.items())
    print(f'[{level}] {joined} ({extra})')
log('INFO', 'Server started', 'Listening on port 8080',
separator=' — ', host='localhost', port=8080)
# [INFO] Server started — Listening on port 8080 (host=localhost, port=8080)
*args collects any extra positional arguments beyond the explicitly named ones into a tuple. **kwargs collects any extra keyword arguments into a dict. Both are optional: you can use either, both, or neither.
The same syntax works on the call side to unpack sequences and mappings:
def add(a, b, c):
    return a + b + c
nums = [1, 2, 3]
config = {'a': 10, 'b': 20, 'c': 30}
print(add(*nums)) # 6 — unpacks the list
print(add(**config)) # 60: unpacks the dict as keyword args
A common use case is writing wrapper or decorator functions that forward all arguments to an inner function without knowing what those arguments are. The canonical pattern is def wrapper(*args, **kwargs): return original(*args, **kwargs).
A lambda is an anonymous, single-expression function defined inline. The syntax is lambda parameters: expression. It returns the value of the expression automatically — no return keyword needed. It can have any number of parameters, including defaults and *args.
double = lambda x: x * 2
print(double(5)) # 10
# Most common use: as a sort key
people = [('Alice', 30), ('Bob', 25), ('Carol', 35)]
people.sort(key=lambda p: p[1]) # sort by age
print(people) # [('Bob', 25), ('Alice', 30), ('Carol', 35)]
# With filter and map
evens = list(filter(lambda x: x % 2 == 0, range(10)))
doubled = list(map(lambda x: x * 2, [1, 2, 3]))
When is a lambda appropriate? Use it for short, throwaway callables passed to sorted(), filter(), map(), or event handlers: situations where naming the function would make the code wordier without adding clarity. The PEP 8 style guide explicitly discourages assigning a lambda to a variable (like double = lambda x: x*2) because a def statement is clearer and gives the function a proper name visible in tracebacks.
Lambdas are limited to a single expression — no statements, no multi-line logic, no assignments. If the logic is even slightly complex, write a named function with def.
One of Python's most notorious gotchas: default argument values are evaluated once at function definition time, not each time the function is called. If that default is a mutable object like a list or dict, every call that uses the default shares the same object, producing surprising accumulated state.
# BUG: the list is created once and shared across calls
def add_item(item, collection=[]):
    collection.append(item)
    return collection
print(add_item('a')) # ['a'] (looks fine)
print(add_item('b')) # ['a', 'b'] (surprise: shared default)
print(add_item('c')) # ['a', 'b', 'c']
The standard fix is to use None as the default and create a fresh object inside the function body:
def add_item(item, collection=None):
    if collection is None:
        collection = [] # fresh list on every call
    collection.append(item)
    return collection
print(add_item('a')) # ['a']
print(add_item('b')) # ['b'] (independent)
This issue only affects mutable objects (lists, dicts, sets, custom objects). Immutable defaults like integers, strings, and tuples are safe because they cannot be modified in place. You can inspect the current value of a function's defaults at runtime via function.__defaults__, which makes the problem visible: you will see the accumulated list growing there.
Dictionary comprehension is the clean, Pythonic way to build or transform a dict in one expression. The syntax mirrors list comprehension: {key_expr: value_expr for variable in iterable if condition}. It is commonly used when processing API response payloads, config maps, or any key-value data that needs normalisation or filtering.
# Invert a dictionary (swap keys and values)
codes = {'USD': 1, 'EUR': 2, 'GBP': 3}
inv = {v: k for k, v in codes.items()}
# {1: 'USD', 2: 'EUR', 3: 'GBP'}
# Filter an API response payload — keep only active users
users = {
    'alice': {'active': True, 'role': 'admin'},
    'bob': {'active': False, 'role': 'user'},
    'carol': {'active': True, 'role': 'user'},
}
active_users = {name: data for name, data in users.items()
                if data['active']}
# {'alice': {...}, 'carol': {...}}
# Normalise keys from camelCase API payload to snake_case
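import re
payload = {'firstName': 'Alice', 'lastName': 'Smith', 'userId': 42}
# Assumed pattern (one common choice): insert '_' before each interior capital, then lowercase
def to_snake(s):
    return re.sub(r'(?<!^)(?=[A-Z])', '_', s).lower()
normalised = {to_snake(k): v for k, v in payload.items()}
# {'first_name': 'Alice', 'last_name': 'Smith', 'user_id': 42}
Nesting a comprehension inside another is possible but quickly becomes hard to read. If the transformation logic exceeds one or two conditions, break it into a helper function and call it from the comprehension. Dict comprehension also pairs naturally with zip() when you have two parallel sequences of keys and values: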
headers = ['name', 'age', 'city']
values = ['Alice', 30, 'NYC']
record = {k: v for k, v in zip(headers, values)}
# {'name': 'Alice', 'age': 30, 'city': 'NYC'}
Slicing extracts a sub-sequence from any sequence type using the notation sequence[start:stop:step]. All three parts are optional and default to the beginning, end, and a step of 1 respectively. Slicing always returns a new object of the same type — it does not modify the original.
data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
data[2:5] # [2, 3, 4] — stop is exclusive
data[:4] # [0, 1, 2, 3] — from beginning
data[6:] # [6, 7, 8, 9] — to end
data[::2] # [0, 2, 4, 6, 8] — every other element
data[::-1] # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] — reverse
data[-3:] # [7, 8, 9] — last three elements
data[1:8:3] # [1, 4, 7]: start=1, stop=8, step=3
Strings and tuples behave identically. text = 'Hello, World!'; text[7:] gives 'World!'. The [::-1] idiom is the classic one-liner to reverse a string in Python.
Slicing on lists returns a shallow copy. Assigning to a slice mutates the list in-place, which is a powerful but occasionally surprising feature:
data[2:5] = [20, 30] # replace three elements with two
print(data) # [0, 1, 20, 30, 5, 6, 7, 8, 9]: length changed
For reusable slices, slice(start, stop, step) creates a first-class slice object that can be stored and applied later: s = slice(1, 8, 2); data[s].
Unpacking assigns the individual elements of a sequence to multiple variables in a single statement. The left side must have the same number of names as the sequence has elements, or you get a ValueError.
# Basic unpacking
x, y, z = (10, 20, 30)
first, second = 'AB'
# Swap without a temp variable
a, b = 1, 2
a, b = b, a
print(a, b) # 2 1
# Unpacking a function return value
def min_max(nums):
    return min(nums), max(nums)
lo, hi = min_max([5, 2, 8, 1])
print(lo, hi) # 1 8
Extended unpacking (Python 3+) uses the starred expression *rest to collect everything that does not fit into the explicit names:
first, *middle, last = [1, 2, 3, 4, 5]
print(first) # 1
print(middle) # [2, 3, 4]
print(last) # 5
# Useful for parsing structured payloads
with open('data.csv') as f: # context manager closes the file afterwards
    header, *records = f.readlines()
# Discard parts you don't need with _
_, important, _ = ('ignore', 'keep this', 'ignore')
Unpacking works on any iterable: lists, tuples, strings, generators, files. You can also unpack in a for loop: for x, y in [(1,2),(3,4)]:. Nested unpacking (a, (b, c)) = (1, (2, 3)) works but hurts readability; prefer flatter structures.
These three keywords are usually discussed together, but only break and continue actually control loop flow; confusing them is a common source of bugs.
pass: Does absolutely nothing. It is a syntactic placeholder used wherever Python requires a statement but you have nothing to write yet — an empty function body, an empty class, a stub except block. It is not a loop-control statement; it just lets the loop body be syntactically valid.
def todo():
    pass # implement later
for i in range(5):
    pass # loop runs 5 times doing nothing
break: Immediately exits the innermost enclosing loop. Any else clause on the loop is skipped. Typical use: linear search where you want to stop as soon as you find the item.
for name in ['Alice', 'Bob', 'Carol']:
    if name == 'Bob':
        print('Found Bob')
        break
continue: Skips the rest of the current iteration and jumps immediately to the next one. The loop itself keeps running.
for i in range(10):
    if i % 2 == 0:
        continue # skip even numbers
    print(i) # prints 1 3 5 7 9
A common interview trick question: what does pass do in a loop that also prints something before the pass? The answer is: the print still executes, because pass does nothing at all; it only fills a position where a statement is required and has no effect on the rest of the block.
None is Python's null value — a singleton object of type NoneType. It represents the absence of a value: default function returns, uninitialised optional variables, missing dict values. There is exactly one None object in any Python process.
The key distinction for comparisons:
- == tests equality: do the two objects have the same value? It calls __eq__ and can be overridden.
- is tests identity: are the two names pointing to the exact same object in memory (same id())?
# Correct: test for None with 'is'
result = some_function() # placeholder for any call that may return None
if result is None:
    print('No result returned')
# Why not ==?
class Weird:
    def __eq__(self, other):
        return True # lies: claims equality with everything
w = Weird()
print(w == None) # True, because __eq__ lies
print(w is None) # False: an identity check cannot be faked
PEP 8 says explicitly: comparisons to singletons like None should always be done with is or is not, never the equality operators. For booleans, PEP 8 instead recommends plain truthiness (if flag:) rather than comparing against True or False at all. The is check is also marginally faster because it does not invoke any dunder method; it is a direct pointer comparison.
A related trap: CPython caches small integers (typically -5 to 256) and short strings, so a = 256; b = 256; a is b is True. But a = 1000; b = 1000; a is b may be False. Never rely on is for value comparisons — only use it for singletons like None.
A list of dictionaries is the most common Python pattern for representing structured data from APIs, CSV rows, database query results, and JSON payloads. Each dictionary is one record; the list is the collection.
employees = [
    {'name': 'Alice', 'dept': 'Engineering', 'salary': 95000},
    {'name': 'Bob', 'dept': 'Marketing', 'salary': 72000},
    {'name': 'Carol', 'dept': 'Engineering', 'salary': 105000},
    {'name': 'Dave', 'dept': 'Marketing', 'salary': 68000},
]
# Filter: Engineering employees
eng = [e for e in employees if e['dept'] == 'Engineering']
# Map: extract names only
names = [e['name'] for e in employees]
# Sort by salary descending
ranked = sorted(employees, key=lambda e: e['salary'], reverse=True)
# Group by department using defaultdict
from collections import defaultdict
by_dept = defaultdict(list)
for emp in employees:
    by_dept[emp['dept']].append(emp['name'])
# {'Engineering': ['Alice', 'Carol'], 'Marketing': ['Bob', 'Dave']}
# Average salary per department (filter on dept so duplicate names cannot skew the result)
dept_salary = {}
for dept in by_dept:
    salaries = [e['salary'] for e in employees if e['dept'] == dept]
    dept_salary[dept] = sum(salaries) / len(salaries)
# {'Engineering': 100000.0, 'Marketing': 70000.0}
When accessing nested values that may not exist, chain .get() calls or use a library like glom for deeply nested paths. Safe access pattern: record.get('address', {}).get('city', 'Unknown').
A generator is a function that uses the yield keyword to return values one at a time, pausing execution between yields and resuming from the same point when the next value is requested. It produces an iterator without building the entire result in memory.
# List builds everything in memory first
squares_list = [x**2 for x in range(1_000_000)] # ~8 MB
# Generator yields one value at a time — constant memory
def squares_gen(n):
    for x in range(n):
        yield x ** 2
gen = squares_gen(1_000_000)
print(next(gen)) # 0
print(next(gen)) # 1
print(next(gen)) # 4
# Or use a generator expression (same thing, less code)
gen2 = (x**2 for x in range(1_000_000))
Generators are lazy: they compute the next value only when asked. This makes them ideal for: large file processing (stream lines without loading the whole file), infinite sequences, data pipelines, and any situation where you do not need all results at once.
# Stream a huge log file without loading it into memory
def error_lines(filepath):
    with open(filepath) as f:
        for line in f:
            if 'ERROR' in line:
                yield line.strip()
for line in error_lines('/var/log/app.log'):
    print(line)
Once a generator is exhausted (further next() calls raise StopIteration) it cannot be reset; you must create a new generator object. This is the key difference from a list, which can be iterated multiple times.
A decorator is a function that takes another function as input, wraps it with extra behaviour, and returns the wrapped version. The @decorator syntax is shorthand for func = decorator(func). Decorators exploit the fact that Python functions are first-class objects.
import time
def timer(func):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs) # call the original
        end = time.perf_counter()
        print(f'{func.__name__} took {end-start:.4f}s')
        return result
    return wrapper
@timer
def compute(n):
    return sum(range(n))
compute(1_000_000)
# compute took 0.0312s (the exact timing varies by machine)
The problem with the naive version above is that wrapper.__name__ is 'wrapper', not 'compute', which confuses debuggers and documentation tools. Always apply functools.wraps(func) to the inner wrapper to preserve the original function's metadata:
from functools import wraps
def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        ...
    return wrapper
Decorators can be stacked: @dec1 over @dec2 applies dec2 first, then dec1. Common decorators from the builtins and standard library: @staticmethod, @classmethod, @property, @functools.lru_cache (memoisation). In frameworks, @app.route in Flask and @pytest.fixture are decorator-based APIs.
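For reference, a complete version of the wraps-based timer might look like the sketch below. Storing the duration on a last_elapsed attribute instead of printing it is an assumption made here so the result is easy to inspect; it is not part of the original example.

```python
import time
from functools import wraps

def timer(func):
    @wraps(func)                       # copies __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        # Illustrative choice: remember the duration rather than print it
        wrapper.last_elapsed = time.perf_counter() - start
        return result
    return wrapper

@timer
def compute(n):
    """Sum the integers below n."""
    return sum(range(n))

print(compute(1000))         # 499500
print(compute.__name__)      # compute
print(compute.__doc__)       # Sum the integers below n.
```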
A class is a blueprint for objects. Define it with the class keyword. The special method __init__ (the constructor) is called automatically when you create an instance and is where you set up the object's initial state by assigning to self.attribute.
class BankAccount:
    interest_rate = 0.03 # class attribute, shared by all instances
    def __init__(self, owner, balance=0):
        self.owner = owner # instance attributes
        self.balance = balance
    def deposit(self, amount):
        if amount <= 0:
            raise ValueError('Deposit amount must be positive')
        self.balance += amount
    def __repr__(self):
        return f'BankAccount({self.owner!r}, balance={self.balance})'
acc = BankAccount('Alice', 1000)
acc.deposit(500)
print(acc) # BankAccount('Alice', balance=1500)
print(acc.interest_rate) # 0.03
self is a reference to the current instance; it is not a keyword but the universal convention. Every instance method receives it as the first parameter. Class attributes are defined directly in the class body and shared across all instances; instance attributes are set with self.attr = value inside methods and belong to each object individually.
Important dunder methods to know: __str__ (readable string for end users, called by print), __repr__ (unambiguous string for developers, called in the REPL), __len__, __eq__, __lt__, and __enter__/__exit__ for context managers.
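Several of these dunder methods can be seen together in a small sketch. Playlist is a hypothetical class invented for this illustration; context-manager methods are left out to keep it short.

```python
class Playlist:
    """Hypothetical container used only to illustrate dunder methods."""

    def __init__(self, name, tracks):
        self.name = name
        self.tracks = list(tracks)

    def __repr__(self):                # unambiguous, for developers
        return f'Playlist({self.name!r}, {self.tracks!r})'

    def __str__(self):                 # readable, used by print()
        return f'{self.name} ({len(self)} tracks)'

    def __len__(self):
        return len(self.tracks)

    def __eq__(self, other):
        if not isinstance(other, Playlist):
            return NotImplemented
        return (self.name, self.tracks) == (other.name, other.tracks)

    def __lt__(self, other):           # lets sorted() order by length
        return len(self) < len(other)

a = Playlist('Focus', ['intro', 'loop'])
b = Playlist('Gym', ['warmup', 'sprint', 'cooldown'])

print(a)                                           # Focus (2 tracks)
print(repr(a))                                     # Playlist('Focus', ['intro', 'loop'])
print(len(b))                                      # 3
print(a == Playlist('Focus', ['intro', 'loop']))   # True
print(sorted([b, a])[0].name)                      # Focus
```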
Inheritance lets a child class reuse and extend behaviour from a parent class. Specify the parent in parentheses after the class name. The child gets all the parent's methods automatically and can override any of them.
class Animal:
    def __init__(self, name):
        self.name = name
    def speak(self):
        raise NotImplementedError
class Dog(Animal):
    def speak(self):
        return f'{self.name} says Woof!'
class Cat(Animal):
    def speak(self):
        return f'{self.name} says Meow!'
animals = [Dog('Rex'), Cat('Whiskers')]
for a in animals:
    print(a.speak())  # polymorphism: same call, different behaviour
super() calls the parent class's version of a method, essential when overriding __init__ to extend rather than replace the parent's initialisation:
class ServiceDog(Dog):
    def __init__(self, name, service_type):
        super().__init__(name)  # Dog defines no __init__, so this resolves to Animal.__init__
        self.service_type = service_type
Python supports multiple inheritance: class C(A, B):. The MRO (Method Resolution Order) determines which class's method is used when there is ambiguity. Python uses the C3 linearisation algorithm. Inspect it with ClassName.__mro__ or ClassName.mro(). The search goes left to right through the parent list, and C3 guarantees that every class appears before its own parents and that each class's declared base order is preserved, so a shared ancestor is visited only after all of its subclasses.
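The following sketch shows the MRO for a diamond-shaped hierarchy. The class names Base, Left, Right and Child are made up for the illustration.

```python
class Base:
    def who(self):
        return "Base"

class Left(Base):
    def who(self):
        return "Left -> " + super().who()

class Right(Base):
    def who(self):
        return "Right -> " + super().who()

class Child(Left, Right):
    def who(self):
        return "Child -> " + super().who()

print([cls.__name__ for cls in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']

print(Child().who())
# Child -> Left -> Right -> Base
```

Note that super() inside Left dispatches to Right, not Base: super() follows the MRO of the instance's class, which is why Base is reached only once.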
Python's built-in open() function returns a file object. Always use it as a context manager with with — this guarantees the file is closed (and the OS buffer flushed) even if an exception occurs, avoiding resource leaks.
# Writing a file
with open('notes.txt', 'w', encoding='utf-8') as f:
    f.write('Line one\n')
    f.writelines(['Line two\n', 'Line three\n'])
# Reading the whole file at once
with open('notes.txt', encoding='utf-8') as f:
    content = f.read()  # one big string
# Reading line by line (memory-efficient for large files)
with open('notes.txt', encoding='utf-8') as f:
    for line in f:  # file object is itself an iterator
        print(line.rstrip())  # strip trailing newline
# Reading all lines into a list
with open('notes.txt', encoding='utf-8') as f:
    lines = f.readlines()  # ['Line one\n', 'Line two\n', ...]
Mode strings: 'r' (read, default), 'w' (write, truncates), 'a' (append), 'x' (exclusive create, fails if exists), 'b' suffix for binary mode ('rb', 'wb'). Always specify encoding='utf-8' explicitly; relying on the platform default causes bugs on Windows, where the default is often cp1252.
For JSON specifically, import json and use json.load(f) / json.dump(data, f, indent=2) inside a with open() block. For CSV, the csv.DictReader and csv.DictWriter classes handle quoting and delimiter edge cases correctly.
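As a rough sketch of both formats, the example below writes and re-reads a small dataset in a temporary directory. The file names and sample rows are assumptions chosen for the illustration.

```python
import csv
import json
import os
import tempfile

rows = [
    {"name": "Alice", "city": "New York, NY"},  # embedded comma forces quoting
    {"name": "Bob", "city": "Paris"},
]

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "people.csv")
    json_path = os.path.join(tmp, "people.json")

    # newline='' lets the csv module manage line endings itself
    with open(csv_path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "city"])
        writer.writeheader()
        writer.writerows(rows)

    with open(csv_path, encoding="utf-8", newline="") as f:
        loaded_csv = list(csv.DictReader(f))

    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)

    with open(json_path, encoding="utf-8") as f:
        loaded_json = json.load(f)

print(loaded_csv[0]["city"])   # New York, NY
print(loaded_json == rows)     # True
```

csv.DictReader returns every value as a string, so numeric columns need an explicit int() or float() conversion after reading.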
A module is any .py file. Importing it executes the file (once per interpreter session; subsequent imports reuse the cached version from sys.modules) and makes its names available in the importing namespace.
# Importing the whole module — access via module.name
import math
print(math.sqrt(16)) # 4.0
# Importing specific names — available without prefix
from math import sqrt, pi
print(sqrt(25)) # 5.0
# Import with alias — avoid name clashes or shorten long names
import numpy as np
import pandas as pd
# Star import — pulls all public names (avoid in production code)
from math import *
When you import a module, Python first checks the sys.modules cache. On a miss it consults its finders in turn: built-in modules compiled into the interpreter, frozen modules, and then the directories listed in sys.path, which include the directory of the script being run, PYTHONPATH env var locations, and site-packages.
A package is a directory containing an __init__.py file (can be empty). Nested packages create a hierarchy: from mypackage.utils import helper. Python 3.3+ introduced namespace packages (no __init__.py needed), but regular packages with __init__.py are still the norm.
The if __name__ == '__main__': guard at the bottom of a module lets you write code that runs when the file is executed directly but not when imported as a module. It is the standard way to write both importable modules and runnable scripts in the same file.
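A minimal sketch of the guard, combined with a check that repeated imports reuse the cached module. The main function and its argv parameter are hypothetical names used for the illustration.

```python
import sys
import math

def main(argv=None):
    """Hypothetical entry point; returns an exit code."""
    args = sys.argv[1:] if argv is None else argv
    print(f"Running with {len(args)} argument(s)")
    return 0

# A second import is only a cache lookup: both names refer to one module object
import math as math_again
same_object = (math is math_again) and (sys.modules["math"] is math)
print(same_object)   # True

if __name__ == "__main__":
    # Executed for `python thisfile.py`, skipped for `import thisfile`
    main()
```

Keeping the real work inside main() means tests and other modules can import the file and call main() directly without triggering it on import.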
Python's built-in namespace contains roughly 70 functions. The ones that come up constantly in interview problems and real-world code are:
Sequence and iteration: len(), range(), enumerate(), zip(), sorted(), reversed(), min()/max() (accept a key= argument), sum(), map(), filter(), any(), all().
Type conversion: int(), float(), str(), bool(), list(), tuple(), set(), dict().
Object introspection: type(), isinstance(), issubclass(), dir(), vars(), hasattr(), getattr(), setattr().
nums = [3, 1, 4, 1, 5, 9, 2, 6]
print(sorted(nums)) # [1, 1, 2, 3, 4, 5, 6, 9]
print(sorted(nums, reverse=True)) # [9, 6, 5, 4, 3, 2, 1, 1]
words = ['banana', 'apple', 'cherry']
print(sorted(words, key=len)) # ['apple', 'banana', 'cherry']
print(any(x > 8 for x in nums)) # True (9 > 8)
print(all(x > 0 for x in nums)) # True (all positive)
# zip — pair two lists
keys = ['a', 'b', 'c']
values = [1, 2, 3]
print(dict(zip(keys, values)))  # {'a': 1, 'b': 2, 'c': 3}
isinstance(obj, (int, float)) is the right way to check types because it handles subclasses correctly, unlike type(obj) == int. For finding the max by a custom criterion: max(employees, key=lambda e: e['salary']).
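The difference between the two type checks, and the key= form of max(), can be sketched as follows. The Celsius class and the employees data are invented for the example.

```python
class Celsius(float):
    """Hypothetical float subclass used only for this illustration."""

reading = Celsius(21.5)
print(type(reading) == float)             # False: exact type comparison
print(isinstance(reading, (int, float)))  # True: subclasses are accepted
print(isinstance(True, int))              # True: bool is a subclass of int

employees = [
    {"name": "Ana", "salary": 5200},
    {"name": "Ben", "salary": 6100},
    {"name": "Cy", "salary": 4800},
]
top = max(employees, key=lambda e: e["salary"])
print(top["name"])   # Ben
```

The bool case is worth remembering: a function that accepts numbers via isinstance(x, int) will also accept True and False unless it excludes them explicitly.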
Every Python object has a boolean value. In a boolean context (an if condition, a while condition, or passed to bool()), Python calls the object's __bool__ method first. If that is not defined, it falls back to __len__ and returns False if __len__ returns 0. If neither is defined, the object is always truthy.
class Queue:
    def __init__(self):
        self._data = []
    def enqueue(self, item):
        self._data.append(item)
    def __len__(self):
        return len(self._data)
    def __bool__(self):
        return len(self._data) > 0  # explicit
q = Queue()
if not q:
    print('Queue is empty')  # printed: __bool__ returns False
q.enqueue('item')
if q:
    print('Queue has items')  # printed: __bool__ returns True
The built-in falsy values to memorise: None, False, 0, 0.0, 0j (complex zero), '' (empty string), b'' (empty bytes), [], (), {}, set(), and any object whose __bool__ returns False or whose __len__ returns 0.
Practical impact: you can write Pythonic guards like if items:, while queue:, and return value or default instead of verbose length checks. The short-circuit operators and and or return one of their operands, not necessarily a bool: 'alice' or 'default' returns 'alice'; '' or 'default' returns 'default'.
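A short sketch of how and and or return operands, including a common pitfall when 0 is a valid value. The function names are hypothetical.

```python
def display_name(name, default='anonymous'):
    # 'or' returns the first truthy operand, or the last operand if none is truthy
    return name or default

print(display_name('alice'))   # alice
print(display_name(''))        # anonymous
print([] and 'unreached')      # []   ('and' stops at the first falsy operand)
print(0 or None)               # None (last operand when nothing is truthy)

# Pitfall: 0 is falsy, so 'or' would discard a legitimate zero.
# An explicit None check keeps it.
def retry_count(configured=None):
    return 3 if configured is None else configured

print(retry_count(0))   # 0
print(0 or 3)           # 3
```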
This distinction matters whenever you have nested or mutable objects and want an independent copy.
An assignment (b = a) creates a second name for the same object — not a copy at all. Mutating b mutates a.
A shallow copy creates a new container object but does not copy the objects inside it — the inner elements are still shared. You get a new list/dict/etc., but any mutable nested objects are referenced, not cloned.
A deep copy recursively copies every object, including nested ones, so the result is completely independent.
import copy
original = [[1, 2], [3, 4]]
# Shallow copy — new outer list, same inner lists
shallow = original.copy() # or list(original) or original[:]
shallow[0].append(99) # mutates the shared inner list!
print(original) # [[1, 2, 99], [3, 4]] — original changed
# Deep copy — new outer AND inner lists
original2 = [[1, 2], [3, 4]]
deep = copy.deepcopy(original2)
deep[0].append(99)
print(original2)  # [[1, 2], [3, 4]]: original untouched
When to choose each:
- Shallow copy is sufficient when the container holds immutable values (ints, strings, tuples of immutables) or when you intentionally want the copy to share inner objects.
- Deep copy is needed when you want a fully independent snapshot — configuration trees, game states, undo stacks. It is slower and uses more memory.
For dicts, dict.copy() and {**original} are both shallow. The dict-unpacking form {**d, 'key': new_value} is commonly seen in interview code as a one-liner to create a modified copy of a dict without mutating the original.
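The sketch below illustrates that dict unpacking produces a shallow copy. The configuration keys are made-up sample data.

```python
import copy

base = {'debug': False, 'db': {'host': 'localhost', 'port': 5432}}

# New top-level dict with one key overridden; the original is not mutated
updated = {**base, 'debug': True}
print(base['debug'])                 # False
print(updated['debug'])              # True

# The nested dict is still shared between both
print(updated['db'] is base['db'])   # True

# deepcopy gives a fully independent nested structure
independent = copy.deepcopy(base)
independent['db']['port'] = 6543
print(base['db']['port'])            # 5432
```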
String manipulation is the backbone of text-based data processing. Python strings are immutable, so every method returns a new string.
raw = ' Hello, World! '
# Trimming whitespace
raw.strip() # 'Hello, World!' — both ends
raw.lstrip() # 'Hello, World! '
raw.rstrip() # ' Hello, World!'
# Case operations
'Python'.lower() # 'python'
'python'.upper() # 'PYTHON'
'hello world'.title() # 'Hello World'
# Splitting and joining
'a,b,c'.split(',') # ['a', 'b', 'c']
' a b c '.split() # ['a', 'b', 'c'] — splits on any whitespace
','.join(['a', 'b', 'c']) # 'a,b,c'
# Checking content
'hello123'.isalpha() # False (has digits)
'hello123'.isalnum() # True
' '.isspace() # True
'hello'.startswith('he') # True
'world'.endswith('ld') # True
# Replacing and finding
'banana'.replace('a', '@') # 'b@n@n@'
'hello world'.find('world') # 6 (-1 if not found)
'hello world'.count('l') # 3
For parsing structured text formats, regular expressions (import re) extend beyond what string methods can do. But for simple cleaning (stripping, case-folding, splitting on a fixed delimiter) the built-in methods are faster and more readable than regex. A common data-cleaning pipeline: value.strip().lower().replace('-', '_') in one chained call.
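A brief sketch of the chained cleaning pipeline, next to a case where a regular expression is the better tool. The normalise_header helper and the sample inputs are hypothetical.

```python
import re

def normalise_header(value):
    # Each method returns a new string, so the calls chain left to right
    return value.strip().lower().replace('-', '_')

headers = ['  First-Name ', 'LAST-NAME', ' e-mail']
print([normalise_header(h) for h in headers])
# ['first_name', 'last_name', 'e_mail']

# When the delimiter varies, str.split is not enough and re.split helps
print(re.split(r'[;,]\s*', 'a, b;c,  d'))
# ['a', 'b', 'c', 'd']
```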
Both functions are loop helpers that eliminate boilerplate index management and make the intent of the code clearer.
enumerate(iterable, start=0) yields (index, value) pairs. Instead of maintaining a counter variable, you unpack it directly in the loop header:
names = ['Alice', 'Bob', 'Carol']  # sample data
# Non-Pythonic: manual counter
i = 1
for name in names:
    print(i, name)
    i += 1
# Pythonic with enumerate
for i, name in enumerate(names, start=1):
    print(i, name)
zip(*iterables) pairs up elements from two or more iterables by position and stops at the shortest one. It is lazy: it returns a zip iterator, not a list.
keys = ['name', 'age', 'city']
values = ['Alice', 30, 'NYC']
for k, v in zip(keys, values):
    print(f'{k}: {v}')
# Build a dict from two parallel lists
record = dict(zip(keys, values))
# {'name': 'Alice', 'age': 30, 'city': 'NYC'}
# zip stops at the shortest; use itertools.zip_longest for full coverage
from itertools import zip_longest
for a, b in zip_longest([1, 2, 3], [10, 20], fillvalue=0):
    print(a, b)  # 1 10 / 2 20 / 3 0
Combining both: for i, (k, v) in enumerate(zip(keys, values)): gives you the index and the pair simultaneously. These two functions together eliminate the vast majority of situations where you would otherwise manage index variables manually, and they make code easier to read and harder to get wrong.
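The combined form, and the one-shot nature of the zip iterator, can be sketched like this, reusing the same sample lists:

```python
keys = ['name', 'age', 'city']
values = ['Alice', 30, 'NYC']

# The parentheses around (k, v) unpack the pair that zip produces
for i, (k, v) in enumerate(zip(keys, values), start=1):
    print(f'{i}. {k} = {v}')

numbered = [(i, k, v) for i, (k, v) in enumerate(zip(keys, values))]
print(numbered[0])   # (0, 'name', 'Alice')

# zip returns a one-shot iterator: once consumed, it is empty
pairs = zip(keys, values)
first_pass = list(pairs)
second_pass = list(pairs)
print(first_pass)    # [('name', 'Alice'), ('age', 30), ('city', 'NYC')]
print(second_pass)   # []
```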
Python exceptions form a class hierarchy rooted at BaseException. Most exceptions you deal with inherit from Exception, which itself inherits from BaseException. The hierarchy determines which except clauses match a raised exception — a handler for a parent class catches instances of all child classes.
# BaseException
# ├── SystemExit              # sys.exit()
# ├── KeyboardInterrupt       # Ctrl-C
# ├── GeneratorExit           # generator.close()
# └── Exception               # all regular exceptions
#     ├── ValueError
#     ├── TypeError
#     ├── AttributeError
#     ├── LookupError
#     │   ├── KeyError
#     │   └── IndexError
#     ├── RuntimeError
#     │   └── RecursionError
#     ├── OSError
#     │   ├── FileNotFoundError
#     │   └── PermissionError
#     └── ArithmeticError
#         └── ZeroDivisionError
Creating custom exceptions is simple: subclass Exception (or a more specific built-in) and optionally add an __init__ for structured error data:
class InsufficientFundsError(ValueError):
    def __init__(self, balance, amount):
        self.balance = balance
        self.amount = amount
        super().__init__(
            f'Cannot withdraw {amount}; balance is only {balance}')
def withdraw(account, amount):
    if amount > account.balance:
        raise InsufficientFundsError(account.balance, amount)
    account.balance -= amount
try:
    withdraw(acc, 9999)  # acc is the BankAccount created earlier
except InsufficientFundsError as e:
    print(e)  # Cannot withdraw 9999; balance is only 1500
    print(e.amount)  # 9999 (structured access)
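How the hierarchy decides which handler runs can be sketched as follows. The describe and raiser helpers are hypothetical names for this illustration.

```python
def describe(action):
    try:
        action()
    except FileNotFoundError:     # most specific handler first
        return 'missing file'
    except OSError:               # parent of FileNotFoundError and PermissionError
        return 'other OS error'
    except Exception as exc:      # catch-all for regular exceptions
        return f'unexpected {type(exc).__name__}'
    return 'ok'

def raiser(exc):
    def action():
        raise exc
    return action

print(describe(raiser(FileNotFoundError())))   # missing file
print(describe(raiser(PermissionError())))     # other OS error
print(describe(raiser(KeyError('k'))))         # unexpected KeyError

# KeyboardInterrupt sits outside Exception, so 'except Exception' never swallows Ctrl-C
print(issubclass(KeyboardInterrupt, Exception))   # False
```

Handlers are tried top to bottom, so a parent class listed before its child would make the child's handler unreachable.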
The walrus operator (:=), introduced in Python 3.8 (PEP 572), is the assignment expression operator. It assigns a value to a variable as part of a larger expression rather than as a standalone statement. The name comes from its resemblance to a walrus face with tusks.
# fetch_data, process, process_chunk, clean and records are placeholders
# Without walrus: assign on one line, test on the next
data = fetch_data()
if data:
    process(data)
# With walrus: assign and test in one expression
if data := fetch_data():
    process(data)
# Classic use: while loop reading chunks from a file
with open('large.bin', 'rb') as f:
    while chunk := f.read(8192):
        process_chunk(chunk)
# Filtering with a computed value, without calling the function twice
results = [cleaned for raw in records
           if (cleaned := clean(raw)) is not None]
The walrus operator is most valuable when you need to compute a value, test it, and use it, and calling the computation twice would be wasteful or have side effects. Common patterns: while loops reading from streams, filtering list comprehensions where the filter function is expensive, and reducing nested if-statements.
Avoid overusing it — plain assignment on a separate line is often more readable. The walrus is idiomatic in tight loops and comprehensions; in most other code the conventional two-step (assign then test) is clearer.
Python has two primary ways to sort: the list method list.sort() and the built-in function sorted(). Both use the Timsort algorithm (a hybrid of merge sort and insertion sort) with O(n log n) worst-case complexity, and both accept key= and reverse= arguments.
nums = [5, 2, 8, 1, 9]
# sort() — in-place, returns None, only on lists
nums.sort()
print(nums) # [1, 2, 5, 8, 9] — original modified
# sorted() — returns a new list, works on any iterable
original = (5, 2, 8, 1, 9) # tuple
result = sorted(original) # [1, 2, 5, 8, 9] — new list
print(original) # (5, 2, 8, 1, 9) — unchanged
# Sorting complex objects
products = [
    {'name': 'Widget', 'price': 9.99, 'stock': 100},
    {'name': 'Gadget', 'price': 4.99, 'stock': 250},
    {'name': 'Doohickey', 'price': 14.99, 'stock': 30},
]
by_price = sorted(products, key=lambda p: p['price'])
by_stock_desc = sorted(products, key=lambda p: p['stock'], reverse=True)
# Multi-key sort: first by stock descending, then by name ascending
multi = sorted(products, key=lambda p: (-p['stock'], p['name']))
Timsort is stable: equal elements preserve their original relative order. This property makes multi-pass sorting straightforward: sort by the secondary key first, then by the primary key.
The operator.itemgetter and operator.attrgetter functions from the operator module are faster alternatives to lambdas for simple key extraction, especially in tight loops on large datasets.
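A sketch of itemgetter and attrgetter as key functions, together with a two-pass sort that relies on stability. The product data is altered from the earlier sample so that two items tie on stock, and the Employee tuple is invented for the example.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

products = [
    {'name': 'Widget', 'price': 9.99, 'stock': 100},
    {'name': 'Gadget', 'price': 4.99, 'stock': 250},
    {'name': 'Doohickey', 'price': 14.99, 'stock': 100},
]

# itemgetter('price') behaves like lambda p: p['price']
cheapest_first = sorted(products, key=itemgetter('price'))
print([p['name'] for p in cheapest_first])
# ['Gadget', 'Widget', 'Doohickey']

# Two stable passes: secondary key (name) first, then primary key (stock, descending)
by_name = sorted(products, key=itemgetter('name'))
two_pass = sorted(by_name, key=itemgetter('stock'), reverse=True)
print([p['name'] for p in two_pass])
# ['Gadget', 'Doohickey', 'Widget']

# attrgetter does the same job for attributes
Employee = namedtuple('Employee', ['name', 'salary'])
staff = [Employee('Ana', 5200), Employee('Ben', 6100), Employee('Cy', 4800)]
print([e.name for e in sorted(staff, key=attrgetter('salary'))])
# ['Cy', 'Ana', 'Ben']
```

The two-pass result matches a single sort with key=lambda p: (-p['stock'], p['name']), but the two-pass form also works when a key cannot be negated, such as a string sorted in descending order.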
@dataclass (introduced in Python 3.7, PEP 557) is a class decorator that auto-generates boilerplate methods — __init__, __repr__, and __eq__ — from class-level field annotations. It removes the tedium of writing identical initialisation code for data-holding classes.
from dataclasses import dataclass, field
@dataclass
class Product:
    name: str
    price: float
    tags: list = field(default_factory=list)  # mutable default
    in_stock: bool = True
p = Product('Widget', 9.99, ['sale', 'new'])
print(p)  # Product(name='Widget', price=9.99, tags=['sale', 'new'], in_stock=True)
print(p == Product('Widget', 9.99, ['sale', 'new']))  # True: __eq__ generated
# Frozen (immutable) dataclass, usable as a dict key
@dataclass(frozen=True)
class Point:
    x: float
    y: float
pt = Point(1.0, 2.0)
print(hash(pt))  # hashable because frozen
Use field(default_factory=list) for mutable defaults, for the same reason you use None as a default in regular functions: a single list object would otherwise be shared by every instance. Dataclasses enforce this. Writing tags: list = [] raises ValueError when the class is defined, so field(default_factory=list) is the required way to give each instance its own list.
Dataclasses are the right choice for plain data containers: API response models, configuration objects, records. For complex logic with many methods, regular classes are cleaner. For fully immutable value objects, frozen=True is the quick path. For validation and serialisation, libraries like Pydantic build on the dataclass concept and add runtime type checking.
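Two of these behaviours can be sketched briefly: a frozen dataclass used as a dict key, and default_factory giving each instance its own list. The Cart class and the labels mapping are hypothetical.

```python
from dataclasses import dataclass, field, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

# Frozen instances are hashable, and equal field values hash alike
labels = {Point(0.0, 0.0): 'origin', Point(1.0, 2.0): 'target'}
print(labels[Point(1.0, 2.0)])   # target

try:
    Point(1.0, 2.0).x = 5.0
except FrozenInstanceError:
    print('frozen instances cannot be modified')

@dataclass
class Cart:
    items: list = field(default_factory=list)

a, b = Cart(), Cart()
a.items.append('apple')
print(b.items)   # [] (each instance received its own list)
```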
A context manager controls setup and teardown around a block of code via the with statement. The canonical example is file handling, but context managers are used for database transactions, locking, temporary directory creation, patching in tests, and any resource that needs guaranteed cleanup.
Python calls __enter__ when entering the with block and __exit__ when leaving it — even if an exception is raised. The value returned by __enter__ is bound to the as variable.
import time
class Timer:
    def __enter__(self):
        self._start = time.perf_counter()
        return self  # bound to 'as t'
    def __exit__(self, exc_type, exc_val, exc_tb):
        self.elapsed = time.perf_counter() - self._start
        print(f'Elapsed: {self.elapsed:.4f}s')
        return False  # False = do not suppress exceptions
with Timer() as t:
    result = sum(range(1_000_000))
print(t.elapsed)
The simpler way for most cases is contextlib.contextmanager, which turns a generator function into a context manager: everything before yield is setup, everything after is teardown.
from contextlib import contextmanager
@contextmanager
def managed_connection(dsn):
    conn = connect(dsn)
    try:
        yield conn  # the value of 'conn' in 'with ... as conn'
    finally:
        conn.close()  # runs even if an exception occurred
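Here is a runnable sketch of the same pattern. FakeConnection is a hypothetical stand-in for a real database driver, used so the cleanup behaviour can be observed directly.

```python
from contextlib import contextmanager

class FakeConnection:
    """Hypothetical stand-in for a real database connection."""
    def __init__(self, dsn):
        self.dsn = dsn
        self.closed = False

    def close(self):
        self.closed = True

@contextmanager
def managed_connection(dsn):
    conn = FakeConnection(dsn)   # setup: runs on entering the with block
    try:
        yield conn
    finally:
        conn.close()             # teardown: runs on normal exit and on error

with managed_connection('db://example') as conn:
    print(conn.closed)   # False (still open inside the block)
print(conn.closed)       # True

try:
    with managed_connection('db://example') as failing:
        raise RuntimeError('query failed')
except RuntimeError:
    pass
print(failing.closed)    # True (closed despite the exception)
```

Without the try/finally around yield, an exception raised inside the with block would skip the cleanup code.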
Type hints (PEP 484, Python 3.5+) let you annotate variables, function parameters, and return values with expected types. The interpreter stores them (for example in a function's __annotations__) but does not enforce them at runtime. Tools like mypy, pyright, and IDE analysers check them statically, catching type errors before code ever runs.
def calculate_discount(price: float, pct: float) -> float:
    """Return the discounted price."""
    return price * (1 - pct / 100)
# Variable annotations
name: str = 'Alice'
items: list[int] = []  # built-in generics need Python 3.9+
# Optional: value may be the type or None
from typing import Optional
def find_user(uid: int) -> Optional[dict]:
    ...  # returns dict or None
# Union type (Python 3.10+ shorthand: str | int)
from typing import Union
def parse(value: Union[str, int]) -> str:
    return str(value)
# Python 3.10+ shorthand
def parse310(value: str | int) -> str:
    return str(value)
# List, Dict, Tuple from typing (3.9+ can use built-ins directly)
from typing import List, Dict, Tuple
def process(records: List[Dict[str, int]]) -> Tuple[int, int]:
    ...
From Python 3.9, you can use built-in collection types directly in annotations: list[int], dict[str, float], tuple[int, str], with no import from typing needed. From 3.10, X | Y replaces Union[X, Y]. Running mypy --strict script.py turns on mypy's stricter optional checks, including reporting functions without annotations as errors.
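A small sketch showing that hints are stored but not enforced at runtime. It reuses the calculate_discount function from above.

```python
def calculate_discount(price: float, pct: float) -> float:
    """Return the discounted price."""
    return price * (1 - pct / 100)

# The hints are recorded on the function object
print(sorted(calculate_discount.__annotations__))
# ['pct', 'price', 'return']

# An int is accepted even though the hint says float: nothing is checked at call time
print(calculate_discount(200, 25))   # 150.0

# A str gets past the signature too; the failure comes from the arithmetic itself
try:
    calculate_discount('200', 25)
except TypeError as exc:
    print(f'TypeError: {exc}')
```

A static checker such as mypy would flag the second call before the program runs, which is the main practical benefit of annotating.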
There is no tuple comprehension syntax in Python — (x for x in range(5)) is a generator expression, not a tuple. To get a tuple from a comprehension-like construct, wrap a generator expression in tuple():
# Generator expression: lazy, single-pass, not a tuple
gen = (x**2 for x in range(5))
print(type(gen))  # <class 'generator'>
# Tuple from a generator expression
t = tuple(x**2 for x in range(5))
print(t)  # (0, 1, 4, 9, 16)
print(type(t))  # <class 'tuple'>
# List comprehension: eager, stored in memory
lst = [x**2 for x in range(5)]
# Generator expression used inline, with no intermediate list
total = sum(x**2 for x in range(1_000_000))  # memory-efficient
Generator expressions are the memory-friendly alternative to list comprehensions when you only need to iterate once or pass the result to a function that accepts an iterable (like sum(), max(), sorted(), list()). They are lazy: values are generated on demand.
When should you prefer a generator expression over a list comprehension?
- The sequence will be consumed once, not indexed or iterated multiple times.
- The sequence is large and you cannot afford to materialise it all in memory.
- You are passing it directly to an aggregation function (sum, any, all).
Nested generator expressions are possible but hard to read — more than one level of nesting is usually a sign to extract a helper function.
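The trade-offs in the list above can be seen in a short sketch. Exact byte sizes vary by Python version and platform, so only the comparison is shown.

```python
import sys

squares_list = [x * x for x in range(100_000)]
squares_gen = (x * x for x in range(100_000))

# The generator object stays small no matter how many values it will yield
print(sys.getsizeof(squares_gen) < sys.getsizeof(squares_list))   # True

first_total = sum(squares_gen)
print(first_total == sum(squares_list))   # True

# Single pass: the generator is now exhausted
second_total = sum(squares_gen)
print(second_total)   # 0

# The list can be iterated and indexed as often as needed
print(squares_list[3])   # 9
```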
A recursive function calls itself to break a problem into smaller sub-problems of the same kind. Every recursive function needs a base case that stops the recursion and a recursive case that moves toward the base case.
def factorial(n):
    if n <= 1:  # base case
        return 1
    return n * factorial(n - 1)  # recursive case
print(factorial(5))  # 120
# Fibonacci with memoisation to avoid exponential time
from functools import lru_cache
@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
print(fib(50))  # runs instantly with caching
Python's default recursion limit is 1000 frames (check with sys.getrecursionlimit(); change with sys.setrecursionlimit()). Raising the limit too far can crash the interpreter with a stack overflow, and the safe ceiling depends on the platform and Python version. Exceeding the limit raises RecursionError. Python does not perform tail-call optimisation: each call adds a frame to the call stack regardless.
For problems that are naturally iterative (Fibonacci, factorial), recursion adds overhead and a stack risk. For inherently recursive structures such as trees, nested directories, and JSON with arbitrary nesting, recursion is often the clearest approach. When a recursive solution would run into the stack limit, convert to an explicit stack using a list: push children, pop and process.
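The explicit-stack conversion can be sketched as below. The flatten function is a hypothetical example that handles arbitrarily nested lists without recursion.

```python
def flatten(nested):
    """Flatten arbitrarily nested lists, depth-first, using an explicit stack."""
    flat = []
    stack = [nested]
    while stack:
        item = stack.pop()
        if isinstance(item, list):
            # Push children reversed so the left-most child is popped first
            stack.extend(reversed(item))
        else:
            flat.append(item)
    return flat

print(flatten([1, [2, [3, 4]], 5]))   # [1, 2, 3, 4, 5]

# Nesting far deeper than the default recursion limit is handled without error
deep = 0
for _ in range(10_000):
    deep = [deep]
print(flatten(deep))   # [0]
```

The list used as a stack lives on the heap, so its depth is bounded by available memory and not by the interpreter's frame limit.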
Python's built-in json module converts between JSON strings/files and Python objects. The mapping is: JSON object ↔ Python dict, JSON array ↔ Python list, JSON string ↔ Python str, JSON number ↔ Python int/float, JSON true/false ↔ Python True/False, JSON null ↔ Python None.
import json
# --- Parsing (deserialising) ---
# From a JSON string (e.g., API response body)
json_str = '{"name": "Alice", "age": 30, "tags": ["admin", "user"]}'
data = json.loads(json_str) # loads = load string
print(data['name']) # Alice
print(data['tags'][0]) # admin
# From a file
with open('config.json', encoding='utf-8') as f:
    config = json.load(f)  # load (no s) = load file
# --- Building (serialising) ---
payload = {'user_id': 42, 'scores': [95, 87, 100], 'active': True}
# To a string
json_out = json.dumps(payload, indent=2)  # pretty-printed
print(json_out)
# To a file
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(payload, f, indent=2)
# Handling dates: not natively serialisable
from datetime import date
json.dumps({'date': date.today().isoformat()})  # convert to string first
Common pitfalls: JSON keys are always strings, but Python dicts can have non-string keys. Serialising a dict with int keys converts them to strings, which can break round-trip assumptions. Python's datetime, Decimal, and custom objects are not JSON-serialisable by default; you must either convert them before serialising or provide a custom default= encoder function to json.dumps().
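Both pitfalls can be sketched in a few lines. The encode_extra function is a hypothetical default= hook; how to represent Decimal (string or float) is a design choice, and a string is used here to avoid losing precision.

```python
import json
from datetime import date, datetime
from decimal import Decimal

def encode_extra(obj):
    """Called by json.dumps only for objects it cannot serialise itself."""
    if isinstance(obj, (date, datetime)):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)
    raise TypeError(f'{type(obj).__name__} is not JSON serialisable')

order = {'placed': date(2024, 1, 15), 'total': Decimal('19.99')}
encoded = json.dumps(order, default=encode_extra)
print(encoded)
# {"placed": "2024-01-15", "total": "19.99"}

# Integer keys come back as strings after a round trip
scores = {1: 'gold', 2: 'silver'}
round_tripped = json.loads(json.dumps(scores))
print(round_tripped)             # {'1': 'gold', '2': 'silver'}
print(round_tripped == scores)   # False
```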
Using print() for diagnostics is fine during quick development, but it has serious limitations in any real-world application: output goes to stdout unless you redirect every call, there is no severity level, you cannot turn it off without editing code, and there is no timestamp, file name, or line number.
Python's logging module solves all of these. It provides five severity levels in ascending order: DEBUG, INFO, WARNING, ERROR, CRITICAL. You set a threshold and only messages at or above that level are emitted.
import logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)-8s %(name)s: %(message)s',
    handlers=[
        logging.FileHandler('app.log'),
        logging.StreamHandler(),  # also print to console
    ],
)
logger = logging.getLogger(__name__)  # module-level logger
def process_order(order_id):
    logger.debug('Processing order %s', order_id)
    try:
        result = fulfil(order_id)
        logger.info('Order %s fulfilled', order_id)
        return result
    except TimeoutError:
        logger.error('Timeout processing order %s', order_id, exc_info=True)
        raise
Key advantages over print: severity levels let you turn debug output off in production by raising the log level to WARNING. Named loggers (logging.getLogger(__name__)) let library authors log without polluting application output, because consumers can configure whether to see library logs. exc_info=True automatically includes the traceback. Handlers route log records to files, external services, or email without touching the application logic.
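The level threshold can be observed with a short sketch that logs into an in-memory stream. The logger name demo.orders is an arbitrary choice for the example.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s %(name)s: %(message)s'))

logger = logging.getLogger('demo.orders')   # hypothetical logger name
logger.addHandler(handler)
logger.propagate = False    # keep this demo's records away from the root logger

logger.setLevel(logging.WARNING)
logger.debug('not emitted')
logger.info('not emitted either')
logger.warning('disk usage at %d%%', 91)

logger.setLevel(logging.DEBUG)   # lower the threshold without touching the call sites
logger.debug('now visible')

output = stream.getvalue()
print(output)
# WARNING demo.orders: disk usage at 91%
# DEBUG demo.orders: now visible
```

Passing arguments separately (as in 'disk usage at %d%%', 91) defers string formatting until a record is actually emitted, which keeps suppressed debug calls cheap.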
PEP 8 is Python's official style guide, written by Guido van Rossum, Barry Warsaw, and Nick Coghlan. It defines conventions for formatting Python code so that all Python code looks consistent and is easier to read and review.
The most frequently tested conventions:
Indentation: 4 spaces per level. PEP 8 prefers spaces; tabs are acceptable only to stay consistent with code that already uses them. Python 3 raises TabError when tabs and spaces are mixed inconsistently.
Line length: Maximum 79 characters for code, 72 for docstrings and comments. PEP 8 allows teams to agree on up to 99 characters for code, and projects that use the Black formatter typically follow its default of 88.
Naming conventions:
snake_case # variables, functions, module names
SCREAMING_SNAKE # module-level constants
PascalCase # class names
_single_leading # private by convention (not enforced)
__double_leading # name-mangled in classes (avoid unless needed)
__dunder__ # reserved for Python internals; don't invent new ones
Whitespace rules: one space around binary operators (x = y + z), no space before a colon in a slice (data[1:3]), two blank lines between top-level definitions, one blank line between methods inside a class.
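Imports: standard library first, then third-party, then local, with each group separated by a blank line. Absolute imports are preferred over relative ones. Import separate modules on separate lines (import os and import sys, not import os, sys); importing several names from one module on a single line is fine.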
Tools that enforce PEP 8 automatically: pycodestyle (checks), autopep8 (fixes), flake8 (checks + extra rules), black (opinionated auto-formatter). In interviews, knowing that you use a linter or formatter signals professional habits.
