Python Memory Management
All allocating functions belong to one of the three allocation domains:
Raw domain (
PYMEM_DOMAIN_RAW
): Intended for allocating memory for general-purpose memory buffers where the allocation must go to the system allocator or where the allocator can operate without the GIL. The memory is requested directly to the system.Object domain (
PYMEM_DOMAIN_OBJ
): Intended for allocating memory belonging to Python objects. The memory is taken from the Python private heap."Mem" domain (
PYMEM_DOMAIN_MEM
): Intended for allocating memory for Python buffers and general-purpose memory buffers where the allocation must be performed with the GIL held.
Allocation interface:
All allocation domains implement _Alloc(size_t size)
, _Calloc(size_t nelem, size_t elsize)
, _Realloc(void *ptr, size_t new_size)
and _Free(void *ptr)
.
Allocators used:
Release build
"pymalloc"
malloc
pymalloc
pymalloc
Debug build
"pymalloc_debug"
malloc
+ debug
pymalloc
+ debug
pymalloc
+ debug
Release build, without pymalloc
"malloc"
malloc
malloc
malloc
Debug build, without pymalloc
"malloc_debug"
malloc
+ debug
malloc
+ debug
malloc
+ debug
The CPython Memory Allocator (pymalloc
)
pymalloc
)The CPython allocator is similar to the system allocator but differs in the following ways:
It has a fixed size, handling requests up to 256 KiB on 32-bit platforms and 1MiB on 64-bit platforms. Requests exceeding 256KB are managed by the system allocator.
It utilizes the Global Interpreter Lock (GIL) for thread safety instead of the system's thread-safety mechanisms.
It allocates by
mmap
rather than heap allocation.
Arenas
CPython creates arenas of 256 KiB on 32-bit platforms and 1MiB on 64-bit platforms to align with the system page size. Arenas are allocated against the system heap with mmap()
on systems supporting anonymous memory mappings.
Fields and Purposes of `arenaobject`
address
uintptr_t
Memory address of the arena
pool_address
block *
Pointer to the next pool to be carved off for allocation
nfreepools
uint
The number of available pools in the arena (free pools plus never-allocated pools)
ntotalpools
uint
The total number of pools in the arena, whether or not available
freepools
pool_header *
Singly linked list of available pools
nextarena
arena_object *
Next arena
prevarena
arena_object *
Previous arena
If the arena is unallocated, the
nextarena
field is used to link all the unallocated arenas into a singly linked list, maintained by the global variableunused_arena_objects
.If the arena is allocated, both the
nextarena
andprevarena
fields are used to link the arena into a doubly linked list calledusable_arenas
. This list is maintained in increasing order ofnfreepools
values.
Pools
Pools are created for block sizes up to 512 bytes. For a given request, the size of the block issued is rounded up to bytes in 32-bit systems and bytes in 64-bit systems, where is the size class index.
When no available pools are available for the requested size class index in a arena, a new one is provisioned. Arenas have a high-water mark to index how many pools have been provisioned. The high-water mark sits at the last allocated pool.
Pools have three possible states: Full, Used and Empty.
Fields and Purposes of pool_header
ref
uint
Number of currently allocated blocks in this pool
freeblock
block *
Pointer to this pool’s free list head
nextpool
pool_header *
Pointer to the next pool of this size class
prevpool
pool_header *
Pointer to the previous pool of this size class
arenaindex
uint
Singly-linked list of available pools
szidx
uint
Size class index of this pool
nextoffset
uint
Number of bytes to unused block
maxnextoffset
uint
Maximum number that nextoffset can be until pool is full
Pools of the same szidx
are linked into a doubly linked list.
Pool tables: For an index of i
,usedpools[i + i]
points to the header of a list of all partially used pools that have the size index for that size class.
When a pool becomes full, it’s unlinked from its
usedpools[]
list.If a full pool has a block freed, then the pool back is put back to the front of
usedpools[]
.On transition to empty, a pool is unlinked from its
usedpools[]
list and linked to the front of its arena’s singly linkedfreepools
list.
Blocks
Within a pool, blocks of fixed size class can be allocated and freed.
Available blocks within a pool are listed in the singly linked list
freeblock
. When a block is freed, it’s inserted at the front of thefreeblock
list.When a pool is initialized, only the first two blocks are linked within the
freeblock
list.As long a pool is in the used state, there will be a block available for allocating.
Block Allocation API
When a block of memory is requested by a memory domain that uses pymalloc, pymalloc_alloc(void *ctx, size_t nbytes)
is called.
Using the Python Debug API
sys._debugmallocstats()
to get:
The number of blocks in use for each of the size class pools.
The number of arenas allocated and reclaimed along with the total number of blocks used.
The Object and PyMem Memory Allocation Domains
Beyond Python objects, the object allocator is also used for compiler, AST, parser, and evaluation loop.
An excellent example of the object memory allocator in use is the PyLongObject (int)
type constructor, PyLong_New()
:
When a new int is constructed, memory is allocated from the object allocator.
The size of the request is the size of the
PyLongObject
struct plus the amount of memory required to store the digits.
[!note]
The number 12378562834 in Python would be represented as the list of digits
[1,2,3,7,8,5,6,2,8,3,4]
, which is different from C.
The definition of PyLongObject
can be expanded as follows:
Therefore, the size of the memory requested includes the overhead of ob_base
and lv_tag
, as well as the actual content of the digits.
Last updated