I realized that even after writing a lot of Python, there's a lot of intermediate Python knowledge that isn't concrete in my head. So this page tries to summarize the more common Python quirks in one doc.
This doc will (hopefully) cover the following topics:
Language features:
- Decorators
- Various dunder methods
- Generators
- Context managers
- Positional and keyword arguments
- More advanced OOP
Other stuff:
- Python data structure implementations
- Garbage collection and memory management
- Concurrency
- Packaging ecosystem
Decorators
Here are some examples of decorators in use in Python:
# @staticmethod
# defines a method that does not take self or cls
class Math:
    @staticmethod
    def add(x, y):
        return x + y

Math.add(3, 5)  # 8

# @classmethod
# defines a method in a class that takes cls instead of self
class Book:
    count = 0  # class attribute, shared by all instances

    @classmethod
    def get_count(cls):
        return cls.count

Book.get_count()  # 0

# pytest example
import pytest

@pytest.mark.parametrize("a, b, expected", [(1, 2, 3), (3, 5, 8)])
def test_add(a, b, expected):
    assert a + b == expected
A decorator is essentially a function that takes another function and returns a modified version of it. Here is how to create one:
def my_decorator(func):
    def wrapper(*args, **kwargs):
        # do things before the func call
        result = func(*args, **kwargs)
        # do things after
        return result
    return wrapper

@my_decorator
def func():
    ...

# note that @my_decorator is just syntactic sugar for func = my_decorator(func)
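To make the pattern concrete, here is a minimal sketch of a decorator that counts calls. The names count_calls and greet are made up for illustration. functools.wraps is a standard-library helper that copies the original function's name and docstring onto the wrapper, which is usually what you want.

```python
import functools


def count_calls(func):
    """Hypothetical decorator: track how many times func has been called."""

    @functools.wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls += 1  # work done before the wrapped call
        return func(*args, **kwargs)

    wrapper.calls = 0  # the counter lives on the wrapper function itself
    return wrapper


@count_calls
def greet(name):
    """Return a greeting."""
    return f"Hello, {name}!"


print(greet("Ada"))    # Hello, Ada!
print(greet("Bob"))    # Hello, Bob!
print(greet.calls)     # 2
print(greet.__name__)  # greet (it would be "wrapper" without functools.wraps)
```

Storing the counter as an attribute on the wrapper is just one option; a closure variable with nonlocal would work as well.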
Various dunder methods
These are methods that Python invokes automatically during certain operations.
Object initialization and representation (very important):
- __init__: initializer, sets up a newly created object
- __new__: controls object creation, runs before __init__
- __del__: finalizer, called when the object is about to be destroyed (for example, once its last reference goes away). The del keyword only removes a reference; it does not call __del__ directly
- __repr__: returns a string representation of an object, meant to be unambiguous; called with repr(obj)
- __str__: returns a readable string representation (used by default when printing)
- __bytes__: converts the object to bytes
Comparison and stuff:
- __eq__, __ne__, __lt__, __le__, etc: overloading ==, !=, <, <=, etc
- __bool__: truthiness of the object
- __hash__: computes the hash of the object
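As a rough illustration of the representation and comparison methods listed so far, here is a small sketch. The Version class is hypothetical and exists only to exercise the dunder methods.

```python
class Version:
    """Hypothetical example class used only to exercise the dunder methods."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __repr__(self):
        return f"Version({self.major}, {self.minor})"

    def __str__(self):
        return f"v{self.major}.{self.minor}"

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented  # let Python try the other operand
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return (self.major, self.minor) < (other.major, other.minor)

    def __hash__(self):
        # A class that defines __eq__ without __hash__ becomes unhashable.
        return hash((self.major, self.minor))

    def __bool__(self):
        return (self.major, self.minor) != (0, 0)


a, b = Version(1, 2), Version(1, 10)
print(repr(a))                  # Version(1, 2)
print(str(a))                   # v1.2
print(a == Version(1, 2))       # True
print(a < b)                    # True
print(len({a, Version(1, 2)}))  # 1, equal objects hash the same
print(bool(Version(0, 0)))      # False
```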
Callable and context managers (more on this later):
- __call__: makes an object callable
- __enter__: used in with obj: (context manager)
- __exit__: helps clean up in with statements
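Descriptors (I've never written one directly, although property, classmethod and staticmethod are built on them): objects stored on a class that customize attribute access. A descriptor that defines __set__ or __delete__ takes precedence over the instance dictionary when the attribute is accessed.
- __get__(self, instance, owner)
- __set__(self, instance, value)
- __delete__(self, instance)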
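For a feel of what a descriptor looks like in practice, here is a minimal sketch of a validating attribute. Positive and Product are hypothetical names; __set_name__ is a related hook (Python 3.6+) that tells the descriptor which attribute name it was assigned to.

```python
class Positive:
    """Hypothetical data descriptor that rejects non-positive values."""

    def __set_name__(self, owner, name):
        # Called once, when the owning class is created.
        self.storage_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:  # accessed on the class, not on an instance
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.storage_name, value)


class Product:
    price = Positive()

    def __init__(self, price):
        self.price = price  # goes through Positive.__set__


item = Product(10)
print(item.price)  # 10
try:
    item.price = -5
except ValueError as exc:
    print(exc)     # value must be positive
```

The actual value is stored on the instance under _price, so each Product keeps its own price while sharing one descriptor object.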
Attribute access (also very rarely used):
- __getattr__
- __setattr__
- __delattr__
- __dir__
- __getattribute__
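One distinction worth seeing in code: __getattribute__ runs on every attribute access, while __getattr__ is only a fallback that runs when normal lookup fails. A minimal sketch, using a hypothetical Config class:

```python
class Config:
    """Hypothetical settings object with a fallback for unknown attributes."""

    def __init__(self, **values):
        self.values = values

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self.values[name]
        except KeyError:
            raise AttributeError(name) from None


cfg = Config(debug=True)
print(cfg.values)               # {'debug': True}, found normally
print(cfg.debug)                # True, served by __getattr__
print(hasattr(cfg, "missing"))  # False, because AttributeError was raised
```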
Generator Functions
Instead of computing and returning all of a function's results at once, you can write a generator function that uses yield to produce one value at a time. Execution pauses at each yield and resumes when the next value is requested.
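Here is a small sketch of what that looks like. The countdown function is a made-up example; next() and list() are the usual ways to pull values out of a generator.

```python
def countdown(n):
    """Yield n, n-1, ..., 1 one value at a time."""
    while n > 0:
        yield n  # pause here until the next value is requested
        n -= 1


gen = countdown(3)
print(next(gen))  # 3
print(list(gen))  # [2, 1], the generator resumes where it left off

# Generator expressions are lazy in the same way.
squares = (x * x for x in range(5))
print(sum(squares))  # 30
```

Nothing inside countdown runs until the first value is requested, which is what makes generators useful for large or infinite sequences.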
Context managers
A context manager is a construct in Python that manages a resource by setting it up and then cleaning it up, using the with statement.
How to write one
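class MyContextManager:
    def __enter__(self):
        return self  # value returned is assigned to the 'as' variable

    def __exit__(self, exc_type, exc_value, traceback):
        # Clean up here. Returning True suppresses any exception raised
        # inside the with block; return False to let it propagate.
        return True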
or more commonly
from contextlib import contextmanager

@contextmanager
def my_context():
    print("Entering context")
    try:
        yield "Hello"
    finally:
        print("Exiting context")

with my_context() as value:
    print("Inside context:", value)
Some examples
- with open(file) as f:
- with lock: (for threading)
- with torch.no_grad():
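As one more worked case, here is a sketch of a class-based context manager that times a block. Timer is a hypothetical name. It also shows that __exit__ still runs when the block raises, and that returning False lets the exception propagate.

```python
import time


class Timer:
    """Hypothetical context manager that records elapsed time."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False  # do not suppress exceptions


with Timer() as t:
    total = sum(range(1000))

print(total)           # 499500
print(t.elapsed >= 0)  # True

try:
    with Timer() as t2:
        raise ValueError("boom")
except ValueError:
    # __exit__ ran before the exception reached this handler.
    print(hasattr(t2, "elapsed"))  # True
```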
Positional and keyword arguments
*args and **kwargs are used in Python function definitions to accept a flexible number of arguments.
- *args collects any number of extra positional arguments into a tuple
- **kwargs collects any number of extra keyword arguments into a dictionary
If you want to use regular positional and keyword arguments along with *args and **kwargs, you must follow this order:
def function(pos1, pos2, *args, kw1, kw2, **kwargs): pass
# note that parameters after *args (kw1 and kw2 here) can only be passed by keyword
# note that they don't have to be named args and kwargs, that's just a popular naming convention
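A short sketch of how the pieces land at call time. The describe function is hypothetical, and giving kw2 a default value is an addition for the example.

```python
def describe(pos1, pos2, *args, kw1, kw2="default", **kwargs):
    """Return every argument so we can see where each one ended up."""
    return {
        "pos1": pos1,
        "pos2": pos2,
        "args": args,      # tuple of extra positional arguments
        "kw1": kw1,        # keyword-only, required
        "kw2": kw2,        # keyword-only, optional
        "kwargs": kwargs,  # dict of extra keyword arguments
    }


result = describe(1, 2, 3, 4, kw1="a", extra=True)
print(result["args"])    # (3, 4)
print(result["kwargs"])  # {'extra': True}
print(result["kw2"])     # default

# The same symbols unpack a sequence or a dict at the call site.
numbers = (1, 2, 3)
options = {"kw1": "x", "flag": False}
unpacked = describe(*numbers, **options)
print(unpacked["args"])    # (3,)
print(unpacked["kwargs"])  # {'flag': False}
```

Calling describe(1, 2) without kw1 raises a TypeError, since keyword-only parameters without a default are required.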
Python data structures, and their inner implementations
- List: dynamic array of references to objects
- Tuple: immutable array
- Set: hash table, similar to a dictionary but without values
- Dict: hash table (preserves insertion order since Python 3.7)
Hash collisions are resolved with open addressing: if a slot is taken, the table probes for another slot, and CPython uses perturbation probing to choose the probe sequence. The other method is chaining, where each slot holds an array or linked list of entries.
- Str: immutable array of characters (when you concatenate, you create a new string)
- Deque: doubly linked list (in CPython, a linked list of fixed-size blocks)
- Heap: binary min-heap (the heapq module, stored in a plain list)
- Counter: dict subclass for counting
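A quick tour of a few of these through the standard library, as a sketch. It only shows behavior you can observe from Python, not the internal layout.

```python
import heapq
from collections import Counter, deque

# deque: cheap appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
print(queue.popleft(), queue.pop())  # 0 4

# heapq: a binary min-heap stored in a plain list
heap = [5, 1, 4]
heapq.heapify(heap)
heapq.heappush(heap, 2)
print(heapq.heappop(heap))  # 1, the smallest item comes out first

# Counter: a dict subclass that maps items to counts
counts = Counter("mississippi")
print(counts.most_common(2))  # [('i', 4), ('s', 4)]

# str: concatenation builds a new string, the old one is untouched
s = "abc"
t = s
s += "d"
print(t, s)  # abc abcd
```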
Less common keywords
- yield: used in generator functions to return values lazily
- async / await: async def defines a coroutine function; await pauses the current coroutine until the awaited call finishes
- nonlocal: used inside nested functions to modify a variable from an enclosing, non-global scope
- global: declares that a name inside a function refers to the module-level (global) variable
- del: removes a name, attribute, or item; the object itself is freed only once nothing references it (more on this later)
- try / except / finally: finally always runs, whether or not an exception was raised
- @: applies decorators (syntax rather than a keyword)
- exec: a built-in function rather than a keyword; executes a string of Python code
- breakpoint: also a built-in function (Python 3.7+); calling breakpoint() drops you into the debugger, woah
(I'm not sure why MDX colors some of the keywords and not others)
(to be added topics)
Python packaging ecosystem
- pip, setup.py, pyproject.toml, wheels
Garbage collection and memory management
Concurrency and parallelism
- GIL