Python 3 Source Code Analysis
This article is based on Python 3.5.2.
References: the book 《Python源码剖析》 (Python Source Code Analysis)
and the official Python website.
Overview of Python 3 Memory Management
Python provides garbage collection for the memory it manages. The relevant code lives mainly in Objects/obmalloc.c and Modules/gcmodule.c: obmalloc.c manages the blocks of memory Python requests at runtime, while gcmodule.c compensates for the circular-reference weakness of reference counting by adding mark-and-sweep clearing and generational collection. We begin with obmalloc.c to see how Python manages memory.
Python 3 Memory Management
In Python, memory management is organized into a hierarchy of layers, as described below:
Object-specific allocators
_____ ______ ______ ________
[ int ] [ dict ] [ list ] ... [ string ] Python core |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
_______________________________ | |
[ Python's object allocator ] | |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
______________________________________________________________ |
[ Python's raw memory allocator (PyMem_ API) ] |
+1 | <----- Python memory (under PyMem manager's control) ------> | |
__________________________________________________________________
[ Underlying general-purpose allocator (ex: C library malloc) ]
0 | <------ Virtual memory allocated for the python process -------> |
=========================================================================
_______________________________________________________________________
[ OS-specific Virtual Memory Manager (VMM) ]
-1 | <--- Kernel dynamic storage allocation & management (page-based) ---> |
__________________________________ __________________________________
[ ] [ ]
-2 | <-- Physical memory: ROM/RAM --> | | <-- Secondary storage (swap) --> |
This hierarchy shows the whole path: from physical memory to virtual memory, up through the C allocator interface, then Python's raw-memory API, and finally Python's own management of the memory it has obtained. The layer-1 functions are essentially the APIs prefixed with PyMem_:
PyAPI_FUNC(void *) PyMem_Malloc(size_t size);
PyAPI_FUNC(void *) PyMem_Calloc(size_t nelem, size_t elsize);
PyAPI_FUNC(void *) PyMem_Realloc(void *ptr, size_t new_size);
PyAPI_FUNC(void) PyMem_Free(void *ptr);
Take PyMem_Malloc as an example; it is actually defined as follows:
#define PYRAW_FUNCS _PyMem_RawMalloc, _PyMem_RawCalloc, _PyMem_RawRealloc, _PyMem_RawFree
...
#define PYMEM_FUNCS PYRAW_FUNCS
...
static PyMemAllocatorEx _PyMem = {
#ifdef PYMALLOC_DEBUG
&_PyMem_Debug.mem, PYDBG_FUNCS
#else
NULL, PYMEM_FUNCS
#endif
};
...
void *
PyMem_Malloc(size_t size)
{
/* see PyMem_RawMalloc() */
if (size > (size_t)PY_SSIZE_T_MAX)
return NULL;
return _PyMem.malloc(_PyMem.ctx, size);
}
where PyMemAllocatorEx is defined as:
typedef struct {
/* user context passed as the first argument to the 4 functions */
void *ctx;
/* allocate a memory block */
void* (*malloc) (void *ctx, size_t size);
/* allocate a memory block initialized by zeros */
void* (*calloc) (void *ctx, size_t nelem, size_t elsize);
/* allocate or resize a memory block */
void* (*realloc) (void *ctx, void *ptr, size_t new_size);
/* release a memory block */
void (*free) (void *ctx, void *ptr);
} PyMemAllocatorEx;
So _PyMem.malloc resolves to _PyMem_RawMalloc:
static void *
_PyMem_RawMalloc(void *ctx, size_t size)
{
/* PyMem_RawMalloc(0) means malloc(1). Some systems would return NULL
for malloc(0), which would be treated as an error. Some platforms would
return a pointer with no memory behind it, which would break pymalloc.
To solve these problems, allocate an extra byte. */
if (size == 0)
size = 1;
return malloc(size);
}
As this shows, the call bottoms out in the C library's malloc, so memory at this layer is ultimately obtained from the underlying allocator. We now move on to the management at layers 2 and above.
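The indirection through the allocator table can be sketched in Python. Here _py_mem stands in for the PyMemAllocatorEx struct of function pointers; the dict-of-callables representation is this sketch's assumption, not CPython's actual layout:

```python
PY_SSIZE_T_MAX = 2**63 - 1  # value on a typical 64-bit build

def _py_mem_raw_malloc(ctx, size):
    # PyMem_RawMalloc(0) behaves like malloc(1): never hand out a
    # zero-byte allocation, which some platforms would return NULL for
    if size == 0:
        size = 1
    return bytearray(size)  # stand-in for the C heap allocation

# PyMemAllocatorEx: a context pointer plus allocator function pointers
_py_mem = {"ctx": None, "malloc": _py_mem_raw_malloc}

def py_mem_malloc(size):
    # oversize requests are rejected before reaching the allocator
    if size > PY_SSIZE_T_MAX:
        return None
    return _py_mem["malloc"](_py_mem["ctx"], size)

print(len(py_mem_malloc(0)))  # a zero-size request still yields 1 byte
print(py_mem_malloc(2**70))   # the overflow guard returns None
```

Swapping the entries of the table (as PyMem_SetAllocator does in C) changes the behavior of every call site without touching them, which is the point of the indirection.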
Memory for Small Blocks
During execution, most of the memory Python requests is small, and it is released again soon after use. Serving these requests directly with malloc and free would mean a huge number of allocator calls, hurting performance. To improve performance, Python introduces a memory-pool mechanism for allocating and freeing small blocks. The structure has four levels, from bottom to top: block, pool, arena, and the memory pool itself. Block, pool, and arena correspond to concrete structures, while the memory pool is purely a management concept.
Block
A block is a chunk of memory of a fixed size. Sizes are multiples of the alignment, which must be a power of two; in practice blocks are 8-byte aligned:
/*
* Alignment of addresses returned to the user. 8-bytes alignment works
* on most current architectures (with 32-bit or 64-bit address busses).
* The alignment value is also used for grouping small requests in size
* classes spaced ALIGNMENT bytes apart.
*
* You shouldn't change this unless you know what you are doing.
*/
#define ALIGNMENT 8 /* must be 2^N */
#define ALIGNMENT_SHIFT 3
As the comment says, 8-byte alignment performs well on most mainstream architectures. Python also sets an upper bound on block size: requests larger than this threshold bypass Python's allocator and go straight to the underlying one, while smaller requests are served from blocks of the appropriate size class:
/*
* Max size threshold below which malloc requests are considered to be
* small enough in order to use preallocated memory pools. You can tune
* this value according to your application behaviour and memory needs.
*
* Note: a size threshold of 512 guarantees that newly created dictionaries
* will be allocated from preallocated memory pools on 64-bit.
*
* The following invariants must hold:
* 1) ALIGNMENT <= SMALL_REQUEST_THRESHOLD <= 512
* 2) SMALL_REQUEST_THRESHOLD is evenly divisible by ALIGNMENT
*
* Although not required, for better performance and space efficiency,
* it is recommended that SMALL_REQUEST_THRESHOLD is set to a power of 2.
*/
#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES (SMALL_REQUEST_THRESHOLD / ALIGNMENT)
The mapping from request size to allocated block size is specified as follows:
* For small requests we have the following table:
*
* Request in bytes Size of allocated block Size class idx
* ----------------------------------------------------------------
* 1-8 8 0
* 9-16 16 1
* 17-24 24 2
* 25-32 32 3
* 33-40 40 4
* 41-48 48 5
* 49-56 56 6
* 57-64 64 7
* 65-72 72 8
* ... ... ...
* 497-504 504 62
* 505-512 512 63
*
* 0, SMALL_REQUEST_THRESHOLD + 1 and up: routed to the underlying
* allocator.
*/
As the table shows, a request for 28 bytes is given a 32-byte block, carved from a pool with size class index 3. The conversion between sizes and indexes works as follows:
/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
...
size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
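The two conversion macros are easy to reproduce and check against the table; a minimal sketch:

```python
ALIGNMENT = 8
ALIGNMENT_SHIFT = 3

def size_class_index(nbytes):
    # size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT
    return (nbytes - 1) >> ALIGNMENT_SHIFT

def index2size(i):
    # #define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)
    return (i + 1) << ALIGNMENT_SHIFT

print(size_class_index(28))              # 28 bytes falls in size class 3
print(index2size(size_class_index(28)))  # which hands out 32-byte blocks
# matching the table: 1-8 -> 8, 9-16 -> 16, ..., 505-512 -> 512
print([index2size(size_class_index(n)) for n in (1, 8, 9, 512)])
```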
That covers the basic properties of blocks; next we look at pools.
Pool
A pool is a collection of blocks of a single size class. A pool is defined to be the size of one memory page, and since the page size on mainstream systems is 4KB, POOL_SIZE is 4KB as well:
#define SYSTEM_PAGE_SIZE (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK (SYSTEM_PAGE_SIZE - 1)
...
#define POOL_SIZE SYSTEM_PAGE_SIZE /* must be 2^N */
#define POOL_SIZE_MASK SYSTEM_PAGE_SIZE_MASK
The corresponding data structure is:
/* Pool for small blocks. */
struct pool_header {
union { block *_padding;
uint count; } ref; /* number of allocated blocks */
block *freeblock; /* pool's free list head */
struct pool_header *nextpool; /* next pool of this size class */
struct pool_header *prevpool; /* previous pool "" */
uint arenaindex; /* index into arenas of base adr */
uint szidx; /* block size class index */
uint nextoffset; /* bytes to virgin block */
uint maxnextoffset; /* largest valid nextoffset */
};
Note that the 4KB region begins with the pool_header itself, so the maximum usable space is somewhat less than 4KB. Also, a pool manages blocks of only one size class: a pool that manages 32-byte blocks cannot simultaneously manage 64-byte blocks.
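Because a pool_header sits at the front of every 4KB pool, the number of blocks a pool can hold is slightly below POOL_SIZE divided by the block size. A sketch of the arithmetic, assuming the 48-byte pool_header of a typical 64-bit build:

```python
POOL_SIZE = 4 * 1024  # one system page
ALIGNMENT = 8
# sizeof(struct pool_header) on a typical 64-bit build: 4 pointer-sized
# fields + 4 uints = 48 bytes (an assumption of this sketch),
# rounded up to ALIGNMENT as POOL_OVERHEAD does in C
POOL_OVERHEAD = (48 + ALIGNMENT - 1) & ~(ALIGNMENT - 1)

def blocks_per_pool(block_size):
    # usable bytes after the header, divided into fixed-size blocks
    return (POOL_SIZE - POOL_OVERHEAD) // block_size

print(blocks_per_pool(32))   # 126 blocks of 32 bytes, not a full 128
print(blocks_per_pool(512))  # only 7 blocks of the largest size class
```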
Case 1: consecutive allocations from a fresh pool
Suppose we allocate five 32-byte blocks in a row; the following code runs:
static void *
_PyObject_Alloc(int use_calloc, void *ctx, size_t nelem, size_t elsize)
{
...
/*
* Reached the end of the free list, try to extend it.
*/
if (pool->nextoffset <= pool->maxnextoffset) { // room left for another block
/* There is room for another block. */
pool->freeblock = (block*)pool +
pool->nextoffset; // freeblock -> next virgin block
pool->nextoffset += INDEX2SIZE(size); // advance past that block
*(block **)(pool->freeblock) = NULL; // terminate the free list
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes);
return (void *)bp;
}
...
}
Each allocation simply advances the free pointer to the next untouched address, block by block.
Case 2: freeing blocks in a pool, then allocating again
Suppose block2 and block4 are now freed; the following code path runs:
static void
_PyObject_Free(void *ctx, void *p)
{
...
pool = POOL_ADDR(p);
if (Py_ADDRESS_IN_RANGE(p, pool)) { // was p allocated by pymalloc?
/* We allocated this address. */
LOCK();
/* Link p to the start of the pool's freeblock list. Since
* the pool had at least the p block outstanding, the pool
* wasn't empty (so it's already in a usedpools[] list, or
* was full and is in no list -- it's not in the freeblocks
* list in any case).
*/
assert(pool->ref.count > 0); /* else it was empty */
*(block **)p = lastfree = pool->freeblock; // store old freeblock into p
pool->freeblock = (block *)p; // freeblock now points at the freed block
...
}
...
}
The current value of freeblock is stored into the freed block p, and p then becomes the new pool->freeblock, so the freed blocks form a singly linked list threaded through the pool itself.
After block2 and block4 have been freed this way, the next allocation executes the following:
static void *
_PyObject_Alloc(int use_calloc, void *ctx, size_t nelem, size_t elsize)
{
...
if (pool != pool->nextpool) {
/*
* There is a used pool for this size class.
* Pick up the head block of its free list.
*/
++pool->ref.count;
bp = pool->freeblock; // take the head of the free list
assert(bp != NULL);
if ((pool->freeblock = *(block **)bp) != NULL) { // non-NULL: advance freeblock and return bp
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes); // zero the block for calloc
return (void *)bp; // hand the freed block back out
}
...
}
In this case a previously freed block is returned directly. That completes the basic allocate/free cycle within a single pool; when a pool is used up, a new pool is initialized, which leads us to how pools are grouped into arenas.
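Cases 1 and 2 together implement a singly linked free list threaded through the freed blocks themselves: _PyObject_Free pushes a block onto the list, and _PyObject_Alloc pops from it before carving fresh space. A toy Python model of that discipline (slot indexes stand in for block addresses; this class is illustrative, not CPython code):

```python
class Pool:
    """Toy model of a pool's free list: a slot holds either user data
    or, once freed, the index of the next free block."""
    def __init__(self, nblocks):
        self.slots = [None] * nblocks
        self.freeblock = None    # head of the free list (an index)
        self.nextoffset = 0      # next never-used slot

    def alloc(self):
        if self.freeblock is not None:
            # pop the head of the free list (the *(block **)bp read)
            bp = self.freeblock
            self.freeblock = self.slots[bp]
            return bp
        # free list exhausted: extend it with a virgin block
        bp = self.nextoffset
        self.nextoffset += 1
        return bp

    def free(self, bp):
        # push bp onto the free list (the *(block **)p = freeblock write)
        self.slots[bp] = self.freeblock
        self.freeblock = bp

pool = Pool(8)
blocks = [pool.alloc() for _ in range(5)]  # five fresh blocks: 0..4
pool.free(1); pool.free(3)                 # free block2 and block4
print(pool.alloc())  # 3: most recently freed block is reused first
print(pool.alloc())  # 1: then the earlier freed one
print(pool.alloc())  # 5: free list empty again, carve a new block
```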
arena
An arena is a collection of pools; its size is defined as follows:
#define ARENA_SIZE (256 << 10) /* 256KB */
#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif
Clearly one arena can hold at most 64 pools of 4KB each. Here is the arena_object definition:
/* Record keeping for arenas. */
struct arena_object {
/* The address of the arena, as returned by malloc. Note that 0
* will never be returned by a successful malloc, and is used
* here to mark an arena_object that doesn't correspond to an
* allocated arena.
*/
uptr address; // base address of the arena, as returned by malloc
/* Pool-aligned pointer to the next pool to be carved off. */
block* pool_address;
/* The number of available pools in the arena: free pools + never-
* allocated pools.
*/
uint nfreepools; // free pools + never-allocated pools
/* The total number of pools in the arena, whether or not available. */
uint ntotalpools; // total number of pools in the arena
/* Singly-linked list of available pools. */
struct pool_header* freepools; // singly-linked list of available pools
/* Whenever this arena_object is not associated with an allocated
* arena, the nextarena member is used to link all unassociated
* arena_objects in the singly-linked `unused_arena_objects` list.
* The prevarena member is unused in this case.
*
* When this arena_object is associated with an allocated arena
* with at least one available pool, both members are used in the
* doubly-linked `usable_arenas` list, which is maintained in
* increasing order of `nfreepools` values.
*
* Else this arena_object is associated with an allocated arena
* all of whose pools are in use. `nextarena` and `prevarena`
* are both meaningless in this case.
*/
struct arena_object* nextarena; // next arena
struct arena_object* prevarena; // previous arena
};
At runtime, arenas are created through the new_arena function:
/* Array of objects used to track chunks of memory (arenas). */
static struct arena_object* arenas = NULL; // the vector tracking all arena_objects
/* Number of slots currently allocated in the `arenas` vector. */
static uint maxarenas = 0; // number of slots in the arenas vector
/* The head of the singly-linked, NULL-terminated list of available
* arena_objects.
*/
static struct arena_object* unused_arena_objects = NULL; // list of unused arena_objects
/* The head of the doubly-linked, NULL-terminated at each end, list of
* arena_objects associated with arenas that have pools available.
*/
static struct arena_object* usable_arenas = NULL; // list of arenas with free pools
/* How many arena_objects do we initially allocate?
* 16 = can allocate 16 arenas = 16 * ARENA_SIZE = 4MB before growing the
* `arenas` vector.
*/
#define INITIAL_ARENA_OBJECTS 16 // initial number of arena_object slots
/* Number of arenas allocated that haven't been free()'d. */
static size_t narenas_currently_allocated = 0; // arenas not yet freed
/* Total number of times malloc() called to allocate an arena. */
static size_t ntimes_arena_allocated = 0; // times malloc() was called for an arena
/* High water mark (max value ever seen) for narenas_currently_allocated. */
static size_t narenas_highwater = 0;
static Py_ssize_t _Py_AllocatedBlocks = 0;
Py_ssize_t
_Py_GetAllocatedBlocks(void)
{
return _Py_AllocatedBlocks;
}
/* Allocate a new arena. If we run out of memory, return NULL. Else
* allocate a new arena, and return the address of an arena_object
* describing the new arena. It's expected that the caller will set
* `usable_arenas` to the return value.
*/
static struct arena_object*
new_arena(void)
{
struct arena_object* arenaobj;
uint excess; /* number of bytes above pool alignment */
void *address;
#ifdef PYMALLOC_DEBUG
if (Py_GETENV("PYTHONMALLOCSTATS"))
_PyObject_DebugMallocStats(stderr);
#endif
if (unused_arena_objects == NULL) { // no unused arena_object available
uint i;
uint numarenas;
size_t nbytes;
/* Double the number of arena objects on each allocation.
* Note that it's possible for `numarenas` to overflow.
*/
numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS; // double the slot count (initially 16)
if (numarenas <= maxarenas)
return NULL; /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
if (numarenas > PY_SIZE_MAX / sizeof(*arenas))
return NULL; /* overflow */
#endif
nbytes = numarenas * sizeof(*arenas); // bytes needed for the enlarged vector
arenaobj = (struct arena_object *)PyMem_RawRealloc(arenas, nbytes); // grow the arenas vector
if (arenaobj == NULL)
return NULL;
arenas = arenaobj;
/* We might need to fix pointers that were copied. However,
* new_arena only gets called when all the pages in the
* previous arenas are full. Thus, there are *no* pointers
* into the old array. Thus, we don't have to worry about
* invalid pointers. Just to be sure, some asserts:
*/
assert(usable_arenas == NULL);
assert(unused_arena_objects == NULL);
/* Put the new arenas on the unused_arena_objects list. */
for (i = maxarenas; i < numarenas; ++i) {
arenas[i].address = 0; /* mark as unassociated */
arenas[i].nextarena = i < numarenas - 1 ?
&arenas[i+1] : NULL;
}
/* Update globals. */
unused_arena_objects = &arenas[maxarenas]; // head of the unused list is the first new slot
maxarenas = numarenas; // record the new slot count
}
/* Take the next available arena object off the head of the list. */
assert(unused_arena_objects != NULL);
arenaobj = unused_arena_objects; // take one arena_object off the unused list
unused_arena_objects = arenaobj->nextarena; // advance the unused list head
assert(arenaobj->address == 0);
address = _PyObject_Arena.alloc(_PyObject_Arena.ctx, ARENA_SIZE); // allocate the 256KB arena itself
if (address == NULL) {
/* The allocation failed: return NULL after putting the
* arenaobj back.
*/
arenaobj->nextarena = unused_arena_objects;
unused_arena_objects = arenaobj;
return NULL;
}
arenaobj->address = (uptr)address; // record the arena's base address
++narenas_currently_allocated;
++ntimes_arena_allocated;
if (narenas_currently_allocated > narenas_highwater)
narenas_highwater = narenas_currently_allocated;
arenaobj->freepools = NULL; // no freed pools yet
/* pool_address <- first pool-aligned address in the arena
nfreepools <- number of whole pools that fit after alignment */
arenaobj->pool_address = (block*)arenaobj->address; // first pool starts at the base address
arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE; // 64 pools available
assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
excess = (uint)(arenaobj->address & POOL_SIZE_MASK); // bytes above a pool boundary
if (excess != 0) {
--arenaobj->nfreepools;
arenaobj->pool_address += POOL_SIZE - excess;
}
arenaobj->ntotalpools = arenaobj->nfreepools; // record the total pool count
return arenaobj;
}
This function first checks whether any unused arena_objects remain; if so, one is taken and initialized directly. Otherwise it grows (doubles) the arenas vector, puts the new slots on the unused list, then allocates the 256KB of arena memory and performs the related initialization. The detailed flow is described by the inline comments.
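Two details of new_arena can be checked in isolation: the doubling growth of the arenas vector, and the loss of one pool when the malloc'ed arena base is not pool-aligned. A sketch (the sample addresses below are made up for illustration):

```python
INITIAL_ARENA_OBJECTS = 16
ARENA_SIZE = 256 << 10       # 256KB per arena
POOL_SIZE = 4 * 1024         # 4KB per pool
POOL_SIZE_MASK = POOL_SIZE - 1

def next_numarenas(maxarenas):
    # numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS
    return maxarenas << 1 if maxarenas else INITIAL_ARENA_OBJECTS

def arena_nfreepools(address):
    # pools usable once the base address is rounded up to a pool boundary
    nfreepools = ARENA_SIZE // POOL_SIZE
    excess = address & POOL_SIZE_MASK
    if excess != 0:
        nfreepools -= 1
    return nfreepools

growth, maxarenas = [], 0
for _ in range(4):
    maxarenas = next_numarenas(maxarenas)
    growth.append(maxarenas)
print(growth)                      # [16, 32, 64, 128]
print(arena_nfreepools(0x200000))  # pool-aligned base: all 64 pools usable
print(arena_nfreepools(0x200010))  # misaligned base sacrifices one pool: 63
```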
Memory Pool Management
Since blocks are always allocated from and returned to pools, pools are managed in three states:
used == partially used, neither empty nor full
At least one block in the pool is currently allocated, and at least one
block in the pool is not currently allocated (note this implies a pool
has room for at least two blocks).
This is a pool's initial state, as a pool is created only when malloc
needs space.
The pool holds blocks of a fixed size, and is in the circular list headed
at usedpools[i] (see above). It's linked to the other used pools of the
same size class via the pool_header's nextpool and prevpool members.
If all but one block is currently allocated, a malloc can cause a
transition to the full state. If all but one block is not currently
allocated, a free can cause a transition to the empty state.
full == all the pool's blocks are currently allocated
On transition to full, a pool is unlinked from its usedpools[] list.
It's not linked to from anything then anymore, and its nextpool and
prevpool members are meaningless until it transitions back to used.
A free of a block in a full pool puts the pool back in the used state.
Then it's linked in at the front of the appropriate usedpools[] list, so
that the next allocation for its size class will reuse the freed block.
empty == all the pool's blocks are currently available for allocation
On transition to empty, a pool is unlinked from its usedpools[] list,
and linked to the front of its arena_object's singly-linked freepools list,
via its nextpool member. The prevpool member has no meaning in this case.
Empty pools have no inherent size class: the next time a malloc finds
an empty list in usedpools[], it takes the first pool off of freepools.
If the size class needed happens to be the same as the size class the pool
last had, some pool initialization can be skipped.
used: at least one block in the pool is allocated and at least one is free. full: all of the pool's blocks are allocated. empty: none of the pool's blocks are allocated, and empty pools are chained into a list through freepools.
Internally, Python tracks the pools in the used state with the usedpools array: on an allocation request, usedpools is consulted to find a usable pool of the right size class, and a block is carved from it. usedpools is defined as:
#define PTA(x) ((poolp )((uchar *)&(usedpools[2*(x)]) - 2*sizeof(block *)))
#define PT(x) PTA(x), PTA(x)
static poolp usedpools[2 * ((NB_SMALL_SIZE_CLASSES + 7) / 8) * 8] = {
PT(0), PT(1), PT(2), PT(3), PT(4), PT(5), PT(6), PT(7)
#if NB_SMALL_SIZE_CLASSES > 8
, PT(8), PT(9), PT(10), PT(11), PT(12), PT(13), PT(14), PT(15)
#if NB_SMALL_SIZE_CLASSES > 16
, PT(16), PT(17), PT(18), PT(19), PT(20), PT(21), PT(22), PT(23)
#if NB_SMALL_SIZE_CLASSES > 24
, PT(24), PT(25), PT(26), PT(27), PT(28), PT(29), PT(30), PT(31)
#if NB_SMALL_SIZE_CLASSES > 32
, PT(32), PT(33), PT(34), PT(35), PT(36), PT(37), PT(38), PT(39)
#if NB_SMALL_SIZE_CLASSES > 40
, PT(40), PT(41), PT(42), PT(43), PT(44), PT(45), PT(46), PT(47)
#if NB_SMALL_SIZE_CLASSES > 48
, PT(48), PT(49), PT(50), PT(51), PT(52), PT(53), PT(54), PT(55)
#if NB_SMALL_SIZE_CLASSES > 56
, PT(56), PT(57), PT(58), PT(59), PT(60), PT(61), PT(62), PT(63)
#if NB_SMALL_SIZE_CLASSES > 64
#error "NB_SMALL_SIZE_CLASSES should be less than 64"
#endif /* NB_SMALL_SIZE_CLASSES > 64 */
#endif /* NB_SMALL_SIZE_CLASSES > 56 */
#endif /* NB_SMALL_SIZE_CLASSES > 48 */
#endif /* NB_SMALL_SIZE_CLASSES > 40 */
#endif /* NB_SMALL_SIZE_CLASSES > 32 */
#endif /* NB_SMALL_SIZE_CLASSES > 24 */
#endif /* NB_SMALL_SIZE_CLASSES > 16 */
#endif /* NB_SMALL_SIZE_CLASSES > 8 */
};
For a 28-byte request, Python computes size class index 3 and looks at element 3+3 = 6 of usedpools. The PT/PTA macros initialize usedpools[6] to its own address minus 2*sizeof(block *), i.e. the address of usedpools[4]. Given that poolp is defined as
typedef struct pool_header *poolp;
interpreting that shifted address as a pool_header places its nextpool field exactly on usedpools[6] itself, so initially usedpools[6], read as a pool, satisfies pool == pool->nextpool. Thus when an allocation reaches the following code,
if ((nbytes - 1) < SMALL_REQUEST_THRESHOLD) {
LOCK();
/*
* Most frequent paths first
*/
size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
pool = usedpools[size + size];
if (pool != pool->nextpool) { // is there a used pool for this size class?
...
}
if pool != pool->nextpool, a used pool of this size class exists; otherwise one must be initialized, which happens at init_pool:
init_pool:
/* Frontlink to used pools. */
next = usedpools[size + size]; /* == prev */ // the dummy header for this size class
pool->nextpool = next; // link the new pool into the circular list
pool->prevpool = next;
next->nextpool = pool; // dummy header now points at pool
next->prevpool = pool;
After this, when another request with the same size class index arrives, pool != pool->nextpool holds, signaling a usable pool. Why is such a convoluted usedpools array used to track the used pools?
It's unclear why the usedpools setup is so convoluted. It could be to
minimize the amount of cache required to hold this heavily-referenced table
(which only *needs* the two interpool pointer members of a pool_header). OTOH,
referencing code has to remember to "double the index" and doing so isn't
free, usedpools[0] isn't a strictly legal pointer, and we're crucially relying
on that C doesn't insert any padding anywhere in a pool_header at or before
the prevpool member.
In short, the convoluted layout exists for performance: it keeps this heavily referenced table small enough to stay cache-resident.
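The effect of the dummy-header trick can be modeled without the pointer arithmetic: each size class gets a header whose nextpool initially points back at itself, so "pool != pool->nextpool" means a used pool exists. In CPython the dummy headers overlap inside usedpools via PTA to save space; this sketch uses ordinary objects instead:

```python
class PoolHeader:
    def __init__(self):
        # a fresh header is its own neighbor: the list is empty
        self.nextpool = self
        self.prevpool = self

NB_SMALL_SIZE_CLASSES = 64
# one dummy header per size class; in C these overlap inside the
# usedpools array, which this model does not reproduce
usedpools = [PoolHeader() for _ in range(NB_SMALL_SIZE_CLASSES)]

def has_used_pool(szidx):
    head = usedpools[szidx]
    return head.nextpool is not head   # pool != pool->nextpool

def frontlink(szidx, pool):
    # init_pool: insert pool right after the dummy header
    head = usedpools[szidx]
    pool.nextpool = head.nextpool
    pool.prevpool = head
    head.nextpool.prevpool = pool
    head.nextpool = pool

szidx = (28 - 1) >> 3            # a 28-byte request -> size class 3
print(has_used_pool(szidx))      # False: the list starts out empty
frontlink(szidx, PoolHeader())
print(has_used_pool(szidx))      # True: a used pool is now linked in
```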
That roughly completes the allocation story; below is the full allocation routine:
static void *
_PyObject_Alloc(int use_calloc, void *ctx, size_t nelem, size_t elsize)
{
size_t nbytes;
block *bp;
poolp pool;
poolp next;
uint size;
_Py_AllocatedBlocks++;
assert(nelem <= PY_SSIZE_T_MAX / elsize);
nbytes = nelem * elsize;
#ifdef WITH_VALGRIND
if (UNLIKELY(running_on_valgrind == -1))
running_on_valgrind = RUNNING_ON_VALGRIND;
if (UNLIKELY(running_on_valgrind))
goto redirect;
#endif
if (nelem == 0 || elsize == 0)
goto redirect;
if ((nbytes - 1) < SMALL_REQUEST_THRESHOLD) {
LOCK();
/*
* Most frequent paths first
*/
size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
pool = usedpools[size + size];
if (pool != pool->nextpool) {
/*
* There is a used pool for this size class.
* Pick up the head block of its free list.
*/
++pool->ref.count;
bp = pool->freeblock; // take the head of the free list
assert(bp != NULL);
if ((pool->freeblock = *(block **)bp) != NULL) { // non-NULL: advance freeblock and return bp
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes); // zero the block for calloc
return (void *)bp; // hand the block back out
}
/*
* Reached the end of the free list, try to extend it.
*/
if (pool->nextoffset <= pool->maxnextoffset) { // room left for another block
/* There is room for another block. */
pool->freeblock = (block*)pool +
pool->nextoffset; // freeblock -> next virgin block
pool->nextoffset += INDEX2SIZE(size); // advance past that block
*(block **)(pool->freeblock) = NULL; // terminate the free list
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes);
return (void *)bp;
}
/* Pool is full, unlink from used pools. */
next = pool->nextpool; // pool is now full: unlink it from usedpools
pool = pool->prevpool;
next->prevpool = pool;
pool->nextpool = next;
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes);
return (void *)bp;
}
/* There isn't a pool of the right size class immediately
* available: use a free pool.
*/
if (usable_arenas == NULL) { // no arena has a free pool
/* No arena has a free pool: allocate a new arena. */
#ifdef WITH_MEMORY_LIMITS
if (narenas_currently_allocated >= MAX_ARENAS) {
UNLOCK();
goto redirect;
}
#endif
usable_arenas = new_arena(); // allocate a fresh arena
if (usable_arenas == NULL) {
UNLOCK();
goto redirect;
}
usable_arenas->nextarena =
usable_arenas->prevarena = NULL; // the new arena is the whole usable list
}
assert(usable_arenas->address != 0);
/* Try to get a cached free pool. */
pool = usable_arenas->freepools; // try a previously freed pool first
if (pool != NULL) {
/* Unlink from cached pools. */
usable_arenas->freepools = pool->nextpool; // unlink it from freepools
/* This arena already had the smallest nfreepools
* value, so decreasing nfreepools doesn't change
* that, and we don't need to rearrange the
* usable_arenas list. However, if the arena has
* become wholly allocated, we need to remove its
* arena_object from usable_arenas.
*/
--usable_arenas->nfreepools; // one fewer free pool in this arena
if (usable_arenas->nfreepools == 0) { // arena is now wholly allocated
/* Wholly allocated: remove. */
assert(usable_arenas->freepools == NULL);
assert(usable_arenas->nextarena == NULL ||
usable_arenas->nextarena->prevarena ==
usable_arenas);
usable_arenas = usable_arenas->nextarena; // drop it from usable_arenas
if (usable_arenas != NULL) {
usable_arenas->prevarena = NULL;
assert(usable_arenas->address != 0);
}
}
else {
/* nfreepools > 0: it must be that freepools
* isn't NULL, or that we haven't yet carved
* off all the arena's pools for the first
* time.
*/
assert(usable_arenas->freepools != NULL ||
usable_arenas->pool_address <=
(block*)usable_arenas->address +
ARENA_SIZE - POOL_SIZE); // another pool can still be carved off
}
init_pool:
/* Frontlink to used pools. */
next = usedpools[size + size]; /* == prev */ // the dummy header for this size class
pool->nextpool = next; // link the pool into the circular list
pool->prevpool = next;
next->nextpool = pool; // dummy header now points at pool
next->prevpool = pool;
pool->ref.count = 1;
if (pool->szidx == size) { // pool last held this very size class
/* Luckily, this pool last contained blocks
* of the same size class, so its header
* and free list are already initialized.
*/
bp = pool->freeblock; // its free list is still valid
assert(bp != NULL);
pool->freeblock = *(block **)bp; // advance to the next free block
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes);
return (void *)bp; // return the block
}
/*
* Initialize the pool header, set up the free list to
* contain just the second block, and return the first
* block.
*/ // (re)initialize the pool header
pool->szidx = size; // record the size class index
size = INDEX2SIZE(size); // convert the index to a byte size
bp = (block *)pool + POOL_OVERHEAD; // skip past the pool header
pool->nextoffset = POOL_OVERHEAD + (size << 1); // virgin space starts after two blocks
pool->maxnextoffset = POOL_SIZE - size; // last valid block offset
pool->freeblock = bp + size; // free list holds just the second block
*(block **)(pool->freeblock) = NULL; // terminate the free list
UNLOCK();
if (use_calloc)
memset(bp, 0, nbytes); // zero the first block for calloc
return (void *)bp; // return the first block
}
/* Carve off a new pool. */
assert(usable_arenas->nfreepools > 0);
assert(usable_arenas->freepools == NULL);
pool = (poolp)usable_arenas->pool_address; // carve a brand-new pool
assert((block*)pool <= (block*)usable_arenas->address +
ARENA_SIZE - POOL_SIZE);
pool->arenaindex = (uint)(usable_arenas - arenas); // remember which arena owns it
assert(&arenas[pool->arenaindex] == usable_arenas);
pool->szidx = DUMMY_SIZE_IDX;
usable_arenas->pool_address += POOL_SIZE; // advance past the 4KB just carved off
--usable_arenas->nfreepools; // one fewer free pool
if (usable_arenas->nfreepools == 0) {
assert(usable_arenas->nextarena == NULL ||
usable_arenas->nextarena->prevarena ==
usable_arenas);
/* Unlink the arena: it is completely allocated. */
usable_arenas = usable_arenas->nextarena;
if (usable_arenas != NULL) {
usable_arenas->prevarena = NULL;
assert(usable_arenas->address != 0);
}
}
goto init_pool;
}
/* The small block allocator ends here. */
redirect:
/* Redirect the original request to the underlying (libc) allocator.
* We jump here on bigger requests, on error in the code above (as a
* last chance to serve the request) or when the max memory limit
* has been reached.
*/
{
void *result; // requests over 512 bytes go straight to the raw allocator
if (use_calloc)
result = PyMem_RawCalloc(nelem, elsize);
else
result = PyMem_RawMalloc(nbytes);
if (!result)
_Py_AllocatedBlocks--;
return result;
}
}
The allocation path thus involves inspecting and managing the arenas: first check usedpools for a usable pool; if there is none, take a pool from an arena and link it into usedpools so that the next request of the same size class finds it directly. Requests larger than 512 bytes simply call the wrapped malloc and return.
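The top-level routing decision, pymalloc for small requests versus the raw allocator otherwise, reduces to the threshold test at the top of _PyObject_Alloc; a sketch:

```python
SMALL_REQUEST_THRESHOLD = 512
ALIGNMENT_SHIFT = 3

def route(nbytes):
    # C tests (nbytes - 1) < SMALL_REQUEST_THRESHOLD on unsigned ints,
    # so nbytes == 0 wraps around and is redirected; we test it explicitly
    if nbytes == 0 or nbytes > SMALL_REQUEST_THRESHOLD:
        return "raw allocator"
    return "pymalloc size class %d" % ((nbytes - 1) >> ALIGNMENT_SHIFT)

print(route(1))    # pymalloc size class 0
print(route(512))  # pymalloc size class 63, the largest small class
print(route(513))  # raw allocator
print(route(0))    # raw allocator (treated as malloc(1) downstream)
```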
Freeing a Block
The free routine, _PyObject_Free, is as follows:
ATTRIBUTE_NO_ADDRESS_SAFETY_ANALYSIS
static void
_PyObject_Free(void *ctx, void *p)
{
poolp pool;
block *lastfree;
poolp next, prev;
uint size;
#ifndef Py_USING_MEMORY_DEBUGGER
uint arenaindex_temp;
#endif
if (p == NULL) /* free(NULL) has no effect */
return;
_Py_AllocatedBlocks--;
#ifdef WITH_VALGRIND
if (UNLIKELY(running_on_valgrind > 0))
goto redirect;
#endif
pool = POOL_ADDR(p);
if (Py_ADDRESS_IN_RANGE(p, pool)) { // was p allocated by pymalloc?
/* We allocated this address. */
LOCK();
/* Link p to the start of the pool's freeblock list. Since
* the pool had at least the p block outstanding, the pool
* wasn't empty (so it's already in a usedpools[] list, or
* was full and is in no list -- it's not in the freeblocks
* list in any case).
*/
assert(pool->ref.count > 0); /* else it was empty */
*(block **)p = lastfree = pool->freeblock; // store old freeblock into p
pool->freeblock = (block *)p; // freeblock now points at the freed block
if (lastfree) { // non-NULL: the pool wasn't full
struct arena_object* ao;
uint nf; /* ao->nfreepools */
/* freeblock wasn't NULL, so the pool wasn't full,
* and the pool is in a usedpools[] list.
*/
if (--pool->ref.count != 0) { // still non-empty: leave it in usedpools
/* pool isn't empty: leave it in usedpools */
UNLOCK();
return;
}
/* Pool is now empty: unlink from usedpools, and
* link to the front of freepools. This ensures that
* previously freed pools will be allocated later
* (being not referenced, they are perhaps paged out).
*/
next = pool->nextpool; // pool is now empty: unlink it from usedpools
prev = pool->prevpool;
next->prevpool = prev;
prev->nextpool = next;
/* Link the pool to freepools. This is a singly-linked
* list, and pool->prevpool isn't used there.
*/
ao = &arenas[pool->arenaindex]; // the arena that owns this pool
pool->nextpool = ao->freepools; // push the pool onto the arena's freepools list
ao->freepools = pool;
nf = ++ao->nfreepools; // one more free pool in the arena
/* All the rest is arena management. We just freed
* a pool, and there are 4 cases for arena mgmt:
* 1. If all the pools are free, return the arena to
* the system free().
* 2. If this is the only free pool in the arena,
* add the arena back to the `usable_arenas` list.
* 3. If the "next" arena has a smaller count of free
* pools, we have to "slide this arena right" to
* restore that usable_arenas is sorted in order of
* nfreepools.
* 4. Else there's nothing more to do.
*/
if (nf == ao->ntotalpools) { // every pool in the arena is now free
/* Case 1. First unlink ao from usable_arenas.
*/
assert(ao->prevarena == NULL ||
ao->prevarena->address != 0);
assert(ao ->nextarena == NULL ||
ao->nextarena->address != 0);
/* Fix the pointer in the prevarena, or the
* usable_arenas pointer.
*/
if (ao->prevarena == NULL) { // ao was the head of usable_arenas
usable_arenas = ao->nextarena; // advance the head past ao
assert(usable_arenas == NULL ||
usable_arenas->address != 0);
}
else {
assert(ao->prevarena->nextarena == ao);
ao->prevarena->nextarena =
ao->nextarena; // unlink ao from its predecessor
}
/* Fix the pointer in the nextarena. */
if (ao->nextarena != NULL) { // fix the successor's back pointer
assert(ao->nextarena->prevarena == ao);
ao->nextarena->prevarena =
ao->prevarena;
}
/* Record that this arena_object slot is
* available to be reused.
*/
ao->nextarena = unused_arena_objects; // push ao onto the unused list
unused_arena_objects = ao;
/* Free the entire arena. */
_PyObject_Arena.free(_PyObject_Arena.ctx,
(void *)ao->address, ARENA_SIZE); // return the 256KB to the system
ao->address = 0; /* mark unassociated */
--narenas_currently_allocated;
UNLOCK();
return;
}
if (nf == 1) { // this is now the arena's only free pool
/* Case 2. Put ao at the head of
* usable_arenas. Note that because
* ao->nfreepools was 0 before, ao isn't
* currently on the usable_arenas list.
*/
ao->nextarena = usable_arenas; // insert ao at the head of usable_arenas
ao->prevarena = NULL;
if (usable_arenas)
usable_arenas->prevarena = ao;
usable_arenas = ao;
assert(usable_arenas->address != 0);
UNLOCK();
return;
}
/* If this arena is now out of order, we need to keep
* the list sorted. The list is kept sorted so that
* the "most full" arenas are used first, which allows
* the nearly empty arenas to be completely freed. In
* a few un-scientific tests, it seems like this
* approach allowed a lot more memory to be freed.
*/
if (ao->nextarena == NULL ||
nf <= ao->nextarena->nfreepools) {
/* Case 4. Nothing to do. */
UNLOCK();
return;
}
/* Case 3: We have to move the arena towards the end
* of the list, because it has more free pools than
* the arena to its right.
* First unlink ao from usable_arenas.
*/
if (ao->prevarena != NULL) {
/* ao isn't at the head of the list */
assert(ao->prevarena->nextarena == ao);
ao->prevarena->nextarena = ao->nextarena; // unlink ao, skipping over it
}
else {
/* ao is at the head of the list */
assert(usable_arenas == ao);
usable_arenas = ao->nextarena; // new head of usable_arenas
}
ao->nextarena->prevarena = ao->prevarena; // fix the successor's back pointer
/* Locate the new insertion point by iterating over
* the list, using our nextarena pointer.
*/
while (ao->nextarena != NULL &&
nf > ao->nextarena->nfreepools) {
ao->prevarena = ao->nextarena; // slide ao one position to the right
ao->nextarena = ao->nextarena->nextarena;
}
/* Insert ao at this point. */
assert(ao->nextarena == NULL ||
ao->prevarena == ao->nextarena->prevarena);
assert(ao->prevarena->nextarena == ao->nextarena);
ao->prevarena->nextarena = ao; // splice ao in at the new position
if (ao->nextarena != NULL)
ao->nextarena->prevarena = ao;
/* Verify that the swaps worked. */
assert(ao->nextarena == NULL ||
nf <= ao->nextarena->nfreepools);
assert(ao->prevarena == NULL ||
nf > ao->prevarena->nfreepools);
assert(ao->nextarena == NULL ||
ao->nextarena->prevarena == ao);
assert((usable_arenas == ao &&
ao->prevarena == NULL) ||
ao->prevarena->nextarena == ao);
UNLOCK();
return;
}
/* Pool was full, so doesn't currently live in any list:
* link it to the front of the appropriate usedpools[] list.
* This mimics LRU pool usage for new allocations and
* targets optimal filling when several pools contain
* blocks of the same size class.
*/
--pool->ref.count; // drop the freed block from the count
assert(pool->ref.count > 0); /* else the pool is empty */
size = pool->szidx; // the pool's size class
next = usedpools[size + size]; // dummy header for this size class
prev = next->prevpool;
/* insert pool before next: prev <-> pool <-> next */
pool->nextpool = next; // relink the pool at the head of usedpools
pool->prevpool = prev;
next->prevpool = pool;
prev->nextpool = pool;
UNLOCK();
return;
}
#ifdef WITH_VALGRIND
redirect:
#endif
/* We didn't allocate this address. */
PyMem_RawFree(p); // not ours: free via the raw allocator
}
In essence, this function returns a block to its pool. An address not managed by pymalloc (i.e. from a request over 512 bytes) is freed directly with the wrapped free. Otherwise the block goes back to its pool, and the handling depends on the pool's state transition. If the pool stays in the used state, Python simply pushes the block onto the pool's free list, adjusts the count of allocated blocks, and resets the freeblock pointer. If the pool was full, freeing a block transitions it back to used, and the pool is relinked at the head of the appropriate usedpools list. If the pool transitions from used to empty, it is linked into its arena's freepools list. That completes the pool-level handling.
As for the arena-level handling: when every pool in the arena becomes empty, the memory occupied by the whole pool collection is returned to the system; when the arena previously had no empty pool (and thus was absent from usable_arenas), it must be inserted back into the usable_arenas list; otherwise, with the arena now holding n empty pools, a new insertion point is found by walking usable_arenas and the arena is moved there. This keeps the list ordered so that the more empty pools an arena has, the less likely it is to be used, giving it a chance to become completely free.
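Case 3 above keeps usable_arenas sorted by nfreepools in increasing order, so the fullest arenas sit at the front and are reused first while nearly empty ones drift to the back. The reordering can be modeled over a plain list of nfreepools counts (a toy model of the list discipline, not the linked-list code):

```python
def reinsert_sorted(arenas, i):
    """arenas: nfreepools counts sorted ascending, except the arena at
    index i whose count just grew after a pool was freed. Slide it to
    the right until the list is sorted again (Case 3 of _PyObject_Free)."""
    ao = arenas.pop(i)
    j = i
    while j < len(arenas) and ao > arenas[j]:
        j += 1
    arenas.insert(j, ao)
    return arenas

# the arena at the head had 2 free pools; freeing a pool makes it 3,
# so it must slide right past the arena with only 2 free pools
print(reinsert_sorted([3, 2, 4], 0))  # [2, 3, 4]
```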
The basic mechanics are as described above; interested readers can consult the source for further details.
Summary
Relatively speaking, Python's memory management operates directly on pool objects, and drives arena state from pool state to keep everything under control. The part that is probably hardest to follow is the usedpools array; a reference post on it is attached below.
The usedpools array