Some Intermediate Python Knowledge

January 29, 2025

I realized that even after writing a lot of Python, there's a lot of intermediate Python knowledge that isn't solid in my head. So this page tries to summarize the more common Python quirks in one doc.

This doc will (hopefully) cover the following topics:

Language features:

  • Decorators
  • Various dunder methods
  • Generators
  • Context managers
  • Positional and keyword arguments
  • More advanced OOP

Other stuff:

  • Python data structure implementations
  • Garbage collection and memory management
  • Concurrency
  • Packaging ecosystem

Decorators

Here are some examples of decorators in use in Python:

# @staticmethod
# defines a method that does not take self or cls
class Math:
    @staticmethod
    def add(x, y):
        return x + y
Math.add(3, 5)

# @classmethod
# defines a method in a class that takes cls instead of self
class Book:
    count = 0  # class attribute, shared by all instances
    @classmethod
    def get_count(cls):
        return cls.count
Book.get_count()

# pytest example
import pytest

@pytest.mark.parametrize("a, b, expected", [(1, 2, 3), (3, 5, 8)])
def test_add(a, b, expected):
    assert a + b == expected

A decorator is essentially a function that wraps another function to modify its behavior. Here is how to create one:

def my_decorator(func):
    def wrapper(*args, **kwargs):
        # do things before the func call
        result = func(*args, **kwargs)
        # do things after
        return result
    return wrapper

@my_decorator
def func():
    ...

# note that @my_decorator is just syntactic sugar for func = my_decorator(func)
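One quirk of the pattern above is that the decorated function takes on the wrapper's name and docstring. Here is a minimal sketch of a complete decorator that uses functools.wraps to avoid that. The names log_calls and add are made up for illustration.

```python
import functools

def log_calls(func):
    """Hypothetical decorator: record the arguments of every call to func."""
    @functools.wraps(func)  # copy func's __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))  # runs before the wrapped call
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@log_calls
def add(x, y):
    """Add two numbers."""
    return x + y

result = add(3, 5)
print(result)        # 8
print(add.__name__)  # add (without functools.wraps this would be "wrapper")
print(add.calls)     # [((3, 5), {})]
```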

Various dunder methods

These are methods that Python invokes automatically during certain operations.

Object initialization and representation (very important):

  • __init__ : constructor, initialize a new object
  • __new__ : Controls object creation before __init__
  • __del__ : Finalizer, called when an object is about to be destroyed (e.g. when its reference count drops to zero). Note that del x only removes the name x; it does not call __del__ directly
  • __repr__ : Returns an unambiguous string representation of an object, aimed at developers (called by repr(obj))
  • __str__ : Returns a readable string representation (used by str(obj) and print; falls back to __repr__ if not defined)
  • __bytes__ : Converts object to bytes

Comparison and stuff:

  • __eq__, __ne__, __lt__, __le__, etc: overloading ==, !=, <, <=, etc
  • __bool__: truthiness of object
  • __hash__: computes hash of object
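To make the representation and comparison methods above concrete, here is a small sketch. The Version class is a made-up example, not something from a library. One quirk worth knowing: defining __eq__ without __hash__ makes instances unhashable, so the two usually go together.

```python
class Version:
    """Hypothetical example class showing a handful of dunder methods."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"

    def __str__(self):
        return f"v{self.major}.{self.minor}"

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented  # let Python try the other operand
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return (self.major, self.minor) < (other.major, other.minor)

    def __hash__(self):
        # must agree with __eq__: equal objects need equal hashes
        return hash((self.major, self.minor))

    def __bool__(self):
        return (self.major, self.minor) != (0, 0)

v = Version(1, 2)
print(repr(v))                            # Version(1, 2)
print(v)                                  # v1.2
print(v == Version(1, 2))                 # True
print(v < Version(1, 3))                  # True
print(len({Version(1, 2), Version(1, 2)}))  # 1
print(bool(Version(0, 0)))                # False
```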

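Callable and context managers (more on this later):

  • __call__ : makes an object callable, so obj(...) works
  • __enter__ : called on entering a with block; its return value is bound to the as variable
  • __exit__ : called on leaving the with block, even if an exception was raised; handles cleanup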
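A quick sketch of __call__ (context managers get their own section further down). The Multiplier class is hypothetical; the point is that an instance can carry state and still be used like a function.

```python
class Multiplier:
    """Hypothetical callable object that remembers its factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        # invoked when the instance is called like a function
        return x * self.factor

double = Multiplier(2)
print(double(21))        # 42
print(callable(double))  # True
```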

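Descriptors (I've never written one by hand, though property, classmethod and staticmethod are built on them): objects stored on a class that customize attribute access; descriptors that define __set__ or __delete__ take precedence over the instance dictionary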

  • __get__(self, instance, owner)
  • __set__
  • __delete__
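For reference, here is roughly what a hand-written descriptor looks like. This is a sketch with made-up names (Positive, Product). It also uses __set_name__, which is not in the list above but is the standard hook that tells the descriptor which attribute name it was assigned to.

```python
class Positive:
    """Hypothetical data descriptor that rejects non-positive values."""

    def __set_name__(self, owner, name):
        # called once when the owning class is created
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        instance.__dict__[self.name] = value

class Product:
    price = Positive()  # the descriptor lives on the class

    def __init__(self, price):
        self.price = price  # goes through Positive.__set__

p = Product(10)
print(p.price)  # 10
try:
    p.price = -1
except ValueError as err:
    print(err)  # price must be positive
```

Because Positive defines __set__, it wins over the instance dictionary on lookup, which is why storing the value under the same name in instance.__dict__ is safe here.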

Attribute access (also very rarely used):

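  • __getattr__ : called only when normal attribute lookup fails
  • __setattr__ : called on every attribute assignment
  • __delattr__ : called on del obj.attr
  • __dir__ : called by dir(obj)
  • __getattribute__ : called on every attribute access, before __getattr__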
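The one I'd expect to run into most is __getattr__, since it only fires as a fallback when normal lookup fails. A minimal sketch, with a hypothetical Config class:

```python
class Config:
    """Hypothetical wrapper that exposes dict keys as attributes."""

    def __init__(self, **values):
        self._values = values

    def __getattr__(self, name):
        # only reached when normal lookup (instance dict, class) fails;
        # going through self.__dict__ avoids recursing into __getattr__
        values = self.__dict__.get("_values", {})
        try:
            return values[name]
        except KeyError:
            raise AttributeError(name) from None

config = Config(debug=True, retries=3)
print(config.debug)                # True
print(config.retries)              # 3
print(hasattr(config, "missing"))  # False
```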

Generator Functions

Instead of computing and returning all of its values at once, a generator function uses yield to produce one value at a time. The function pauses at each yield and resumes when the next value is requested.
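A small sketch of what that looks like in practice. The function names countdown and fibonacci are just examples. Since values are produced lazily, a generator can even be infinite as long as the caller only takes what it needs.

```python
import itertools

def countdown(n):
    """Yield n, n-1, ..., 1, one value per request."""
    while n > 0:
        yield n  # pause here until the next value is requested
        n -= 1

gen = countdown(3)
first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # consumes whatever is left
print(first, rest)  # 3 [2, 1]

def fibonacci():
    """Infinite generator; nothing is computed until it is iterated."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

first_six = list(itertools.islice(fibonacci(), 6))
print(first_six)  # [0, 1, 1, 2, 3, 5]
```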

Context managers

A context manager is a construct in Python that lets you manage a resource by setting it up and then cleaning it up, using the with statement.

How to write one

class MyContextManager:
    def __enter__(self):
        return self # value returned is assigned to 'as' variable
    
    def __exit__(self, exc_type, exc_value, traceback):
        # clean up here; returning True would suppress any exception raised in the block
        return False

or more commonly

from contextlib import contextmanager

@contextmanager
def my_context():
    print("Entering context")
    try:
        yield "Hello"
    finally:
        print("Exiting context")

with my_context() as value:
    print("Inside context:", value)

Some examples

  • with open(file) as f
  • with lock (for threading)
  • with torch.no_grad():

Positional and keyword arguments

*args and **kwargs are used in Python function definitions to accept a flexible number of arguments.

  • *args collects any extra positional arguments into a tuple
  • **kwargs collects any extra keyword arguments into a dictionary

If you want to use both regular positional arguments and keyword arguments along with *args and **kwargs, you must follow this order:

def function(pos1, pos2, *args, kw1, kw2, **kwargs): pass
# kw1 and kw2 come after *args, so they can only be passed by keyword
# note that they don't have to be named args and kwargs, that's just a popular naming convention
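To see where each argument ends up, here is a sketch using the same parameter layout. The function describe and its default value are made up for illustration.

```python
def describe(pos1, pos2, *args, kw1, kw2="default", **kwargs):
    """Return a dict showing how the call's arguments were bound."""
    return {
        "pos": (pos1, pos2),
        "args": args,       # extra positional arguments, as a tuple
        "kw": (kw1, kw2),   # keyword-only because they follow *args
        "kwargs": kwargs,   # extra keyword arguments, as a dict
    }

result = describe(1, 2, 3, 4, kw1="a", extra=True)
print(result)
# {'pos': (1, 2), 'args': (3, 4), 'kw': ('a', 'default'), 'kwargs': {'extra': True}}

# The same * and ** syntax unpacks a sequence or dict at the call site.
numbers = (1, 2, 3)
options = {"kw1": "b", "kw2": "c"}
result2 = describe(*numbers, **options)
print(result2)
# {'pos': (1, 2), 'args': (3,), 'kw': ('b', 'c'), 'kwargs': {}}
```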

Python data structures, and their inner implementations

  • List: dynamic array of object references (append is amortized O(1))
  • Tuple: immutable array
  • Set: hash table, similar to dictionary but without values
  • Dict: hash table

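In CPython, hash tables use open addressing: if a slot is already taken, the lookup probes for another slot in the same table. (The other common approach is chaining, where each entry holds an array or linked list of colliding items.) The probe sequence uses perturbation, which mixes the higher bits of the hash into the choice of the next slot.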
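To get a feel for perturbation probing, here is a simplified sketch of the probe order, modelled on the recurrence described in the comments of CPython's dictobject.c. The function name and the max_probes cutoff are my own, and the sketch assumes a non-negative hash and a power-of-two table size; the real implementation has more going on.

```python
def probe_sequence(hash_value, table_size, max_probes=8):
    """Yield slot indices in the order a lookup would try them (simplified)."""
    mask = table_size - 1   # assumes table_size is a power of two
    perturb = hash_value    # assumes a non-negative hash
    i = hash_value & mask   # first slot: the low bits of the hash
    for _ in range(max_probes):
        yield i
        perturb >>= 5       # bring higher bits of the hash into play
        i = (i * 5 + perturb + 1) & mask

print(list(probe_sequence(0, 8)))
# [0, 1, 6, 7, 4, 5, 2, 3]
```

Once perturb has been shifted down to zero, the recurrence i = 5*i + 1 visits every slot of the table, so a free slot is always found eventually.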

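  • Str: immutable array of Unicode characters (when you concatenate, you create a new string)

  • Deque (collections.deque): doubly linked list of fixed-size blocks in CPython, so appends and pops at both ends are O(1)
  • Heap (heapq): binary min-heap stored in a regular list
  • Counter (collections.Counter): dict subclass for counting hashable items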
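A quick tour of the last three structures from the standard library, as a sketch. All the calls are from collections and heapq; the variable names are just for illustration.

```python
from collections import Counter, deque
import heapq

# deque: O(1) appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
left = queue.popleft()
print(left, list(queue))  # 0 [1, 2, 3, 4]

# heapq: a min-heap kept inside a plain list
heap = [5, 1, 4]
heapq.heapify(heap)
heapq.heappush(heap, 0)
smallest = heapq.heappop(heap)
print(smallest, heap[0])  # 0 1

# Counter: a dict that counts hashable items
counts = Counter("banana")
top = counts.most_common(1)
print(top)  # [('a', 3)]
```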

Less common keywords

  • yield: Used in generator functions to return values lazily.
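  • async / await: async def defines a coroutine function; await suspends the current coroutine until the awaited coroutine (or other awaitable) finishes
  • nonlocal: used inside nested functions to rebind a variable from an enclosing, non-global scope
  • global: declares that a name used inside a function refers to the module-level variable
  • del: unbinds a name or removes an item or attribute; the object itself is freed only once nothing references it (more on this later)
  • try/except/finally: the finally block always runs, whether or not an exception was raised
  • @: applies decorators (also the matrix multiplication operator)
  • exec: a built-in function rather than a keyword; executes a string of Python code
  • breakpoint: also a built-in function rather than a keyword (Python 3.7+); calling breakpoint() drops you into the debugger, woah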
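Two of these are easier to understand by running them. Below is a sketch of nonlocal (a closure-based counter) and async / await (two coroutines run with asyncio.gather). The function names are made up for illustration.

```python
import asyncio

def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return increment

counter = make_counter()
counter()
second = counter()
print(second)  # 2

async def fetch(value):
    await asyncio.sleep(0)  # yield control to the event loop
    return value * 2

async def main():
    # run both coroutines concurrently and collect results in order
    return await asyncio.gather(fetch(1), fetch(2))

doubled = asyncio.run(main())
print(doubled)  # [2, 4]
```

Without the nonlocal line, count += 1 would raise UnboundLocalError, because assignment makes count local to increment.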

(I'm not sure why mdx colors some of the keywords and not others)



(to be added topics)

Python packaging ecosystem

  • pip, setup.py, pyproject.toml, wheels

Garbage collection and memory management

Concurrency and parallelism

  • GIL

Advanced OOP

Registry design pattern