2006-8-5 mm/swap.c
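The contents of this file are rather scattered, and most of it has already come up in the analysis of memory.c and filemap.c. So here is only a rough description, to show that we have been through this file as well.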
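Read the comments carefully and the meaning of freepages becomes obvious:
    freepages.min : when the number of free pages in the system (in the buddy allocator) falls below this value, only the kernel is entitled to use those pages.
    freepages.low : when the number of free pages falls below this value, the kernel starts swapping on a large scale.
    freepages.high : we try to keep the number of free pages from falling below this value. Once it does, the kernel immediately begins swapping out gradually, so that it 'never' has to swap on a large scale.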
/*
 * We identify three levels of free memory.  We never let free mem
 * fall below the freepages.min except for atomic allocations.  We
 * start background swapping if we fall below freepages.high free
 * pages, and we begin intensive swapping below freepages.low.
 *
 * Actual initialization is done in mm/page_alloc.c or
 * arch/sparc(64)/mm/init.c.
 */
freepages_t freepages = {
	0,	/* freepages.min */
		/* When the number of free pages in the system reaches this
		 * number, only the kernel can allocate more memory. */
	0,	/* freepages.low */
		/* If the number of free pages gets below this point, the
		 * kernel starts swapping aggressively. */
	0	/* freepages.high */
		/* The kernel tries to keep up to this amount of memory free;
		 * if memory comes below this point, the kernel gently starts
		 * swapping in the hopes that it never has to do real
		 * aggressive swapping. */
};
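To make the three levels concrete, here is a minimal sketch of how a free-page count could be classified against them. It is only an illustration of the comment above, not the kernel's code: `struct freepages_sketch`, `enum reclaim_level` and `classify_free` are hypothetical names, and the threshold values in the usage note are made up.

```c
/* Hypothetical stand-in for freepages_t; the three fields mirror the source. */
struct freepages_sketch {
	unsigned int min;
	unsigned int low;
	unsigned int high;
};

/* Hypothetical labels for the behaviour described in the comment. */
enum reclaim_level {
	RECLAIM_NONE,		/* nr_free >= high: nothing to do */
	RECLAIM_BACKGROUND,	/* low <= nr_free < high: gentle background swapping */
	RECLAIM_INTENSIVE,	/* min <= nr_free < low: intensive swapping */
	RECLAIM_ATOMIC_ONLY	/* nr_free < min: reserved for atomic allocations */
};

/* Map a free-page count onto the level it falls in. */
static enum reclaim_level classify_free(const struct freepages_sketch *fp,
					unsigned int nr_free)
{
	if (nr_free < fp->min)
		return RECLAIM_ATOMIC_ONLY;
	if (nr_free < fp->low)
		return RECLAIM_INTENSIVE;
	if (nr_free < fp->high)
		return RECLAIM_BACKGROUND;
	return RECLAIM_NONE;
}
```

With made-up thresholds min=128, low=256, high=384, a count of 300 free pages lands in RECLAIM_BACKGROUND and a count of 100 lands in RECLAIM_ATOMIC_ONLY.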
For the details of alloc page, see the analysis of filemap.c. Next come page_cluster and memory_pressure:
/* How many pages do we try to swap or page in/out together? */
int page_cluster;	/* each cluster holds 2^page_cluster pages */
/*
 * page_cluster:     read-ahead cluster used by filemap and swap-in
 * SWAPFILE_CLUSTER: cluster used by the swap device when allocating
 *                   swap entries (swap pages)
 */
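Since page_cluster is an exponent rather than a count, the arithmetic is worth spelling out. The sketch below shows the cluster size and the aligned start of the cluster that contains a given page offset. The helper names `cluster_pages` and `cluster_start` are hypothetical, and the shift-based alignment is an assumption about how a read-ahead window would typically be derived, not a quote from the kernel.

```c
/* Pages per read-ahead cluster: 2^page_cluster. */
static unsigned long cluster_pages(int page_cluster)
{
	return 1UL << page_cluster;
}

/* Offset of the first page in the aligned cluster containing `offset`. */
static unsigned long cluster_start(unsigned long offset, int page_cluster)
{
	return (offset >> page_cluster) << page_cluster;
}
```

For example, with page_cluster = 4 a cluster holds 16 pages, and page offset 37 belongs to the cluster that starts at offset 32.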
/*
 * This variable contains the amount of page steals the system
 * is doing, averaged over a minute. We use this to determine how
 * many inactive pages we should have.
 *
 * In reclaim_page and __alloc_pages: memory_pressure++
 * In __free_pages_ok: memory_pressure--
 * In recalculate_vm_stats the value is decayed (once a second)
 */
int memory_pressure;
See also the relevant analysis of filemap.c.
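The memory_pressure comment describes a counter that is bumped on page steals, decremented on frees, and decayed once a second. A minimal sketch of that bookkeeping follows. All names are hypothetical, and the decay shift of 6 is an assumption chosen so that the value roughly averages over a minute of one-second ticks; check the real constant in the kernel headers.

```c
/* Assumed decay shift: each tick removes 1/64 of the current value. */
#define PRESSURE_DECAY_SHIFT 6

static int memory_pressure_sketch;

/* Called where a page is stolen (reclaim_page / __alloc_pages in the source). */
static void note_page_steal(void)
{
	memory_pressure_sketch++;
}

/* Called where a page is freed (__free_pages_ok in the source). */
static void note_page_freed(void)
{
	memory_pressure_sketch--;
}

/* Called once a second; only positive values are decayed in this sketch. */
static void decay_pressure(void)
{
	if (memory_pressure_sketch > 0)
		memory_pressure_sketch -=
			memory_pressure_sketch >> PRESSURE_DECAY_SHIFT;
}
```

A steady stream of steals pushes the value up, while the per-second decay pulls it back toward zero once the stealing stops, which is what lets it act as a smoothed average.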
Then comes nr_async_pages:
/* We track the number of pages currently being asynchronously swapped
 * out, so that we don't try to swap TOO many pages out at once */
/*
 * rw_swap_page_base:   increments nr_async_pages
 * end_buffer_io_async: decrements it
 */
/* Asynchronous I/O state means that a read/write has been submitted
 * without waiting for the page to be unlocked. */
atomic_t nr_async_pages = ATOMIC_INIT(0);	/* total number of pages in asynchronous I/O state */
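The pattern here is a plain in-flight counter: increment when asynchronous I/O is submitted, decrement in the completion path. As a userspace sketch, C11 `<stdatomic.h>` stands in for the kernel's atomic_t below. The function names are hypothetical and only mirror the roles of rw_swap_page_base and end_buffer_io_async.

```c
#include <stdatomic.h>

/* Userspace stand-in for the kernel's atomic_t nr_async_pages. */
static atomic_int nr_async_pages_sketch = 0;

/* Submission path: one more page is in asynchronous I/O. */
static void submit_async_io(void)
{
	atomic_fetch_add(&nr_async_pages_sketch, 1);
}

/* Completion path: the I/O on one page has finished. */
static void complete_async_io(void)
{
	atomic_fetch_sub(&nr_async_pages_sketch, 1);
}

/* Snapshot of how many pages are currently in flight. */
static int async_pages_in_flight(void)
{
	return atomic_load(&nr_async_pages_sketch);
}
```

Because submission and completion can run in different contexts, the counter has to be atomic; a plain int could lose updates.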
buffer_mem and page_cache have been all but abandoned; the 2.6 implementation is much cleaner. They are not analyzed further.
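Within pager_daemon, only pager_daemon.swap_cluster still means anything; none of the other values are used anywhere.
pager_daemon.swap_cluster;	/* maximum number of clusters to read ahead */
The page_cluster analyzed above determines the number of pages per cluster (2^page_cluster), whereas swap_cluster here is the maximum number of clusters allowed to be swapped at one time.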
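Putting the two values together gives an upper bound in pages: swap_cluster clusters of 2^page_cluster pages each. The sketch below shows that arithmetic and how such a bound could be compared against the count of pages in asynchronous I/O. The helper names are hypothetical, and the comparison is an illustration of the idea, not a claim about the exact check the kernel performs.

```c
/* Upper bound in pages: swap_cluster clusters of 2^page_cluster pages. */
static unsigned long max_pages_in_flight(unsigned int swap_cluster,
					 int page_cluster)
{
	return (unsigned long)swap_cluster * (1UL << page_cluster);
}

/* Nonzero when more pages are in async I/O than the bound allows. */
static int should_throttle(unsigned long nr_async,
			   unsigned int swap_cluster, int page_cluster)
{
	return nr_async > max_pages_in_flight(swap_cluster, page_cluster);
}
```

With made-up values swap_cluster = 8 and page_cluster = 4, the bound is 128 pages, so a 129th page in flight would trip the throttle.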
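The remaining functions manipulate page->age and move pages around within the lru cache. The analysis of filemap.c explained how the lru cache is organized; please refer to it. Here I will just spell out the exact meaning of PG_referenced: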
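/*
 * -- For user pages, try_to_swap_out walks the process page
 *    directory and adjusts page->age. For pages used by the
 *    kernel, such as buffer and inode pages, try_to_swap_out has
 *    no way to decide whether the page should be aged. The
 *    solution is the PG_referenced bit: the kernel sets it when
 *    it accesses the page, which counts as a hit.
 *
 * The comment in mm.h reads:
 * For choosing which pages to swap out, inode pages carry a
 * PG_referenced bit, which is set any time the system accesses
 * that page through the (inode,offset) hash table.
 */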
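The interplay between the referenced bit and page->age can be sketched as follows: an access sets the bit, and a later scan tests and clears it, aging the page up on a hit and down otherwise. This is a simplified model with hypothetical names. The advance step of 3, the cap of 64, and halving on a miss are assumptions for illustration; consult the kernel's own constants and aging functions for the real policy.

```c
/* Assumed aging parameters, for illustration only. */
#define PAGE_AGE_ADV_SKETCH 3
#define PAGE_AGE_MAX_SKETCH 64

/* Hypothetical, stripped-down stand-in for struct page. */
struct page_sketch {
	unsigned int age;
	int referenced;		/* models the PG_referenced bit */
};

/* Kernel-side access to the page: record a hit. */
static void touch_page(struct page_sketch *p)
{
	p->referenced = 1;
}

/* One scan: test-and-clear the bit, then age the page up or down. */
static void scan_page(struct page_sketch *p)
{
	if (p->referenced) {
		p->referenced = 0;
		p->age += PAGE_AGE_ADV_SKETCH;
		if (p->age > PAGE_AGE_MAX_SKETCH)
			p->age = PAGE_AGE_MAX_SKETCH;
	} else {
		p->age /= 2;
	}
}
```

A page that keeps getting touched between scans climbs toward the cap, while an untouched page decays toward age 0 and becomes a candidate for eviction.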
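The logic of the remaining functions is straightforward and not hard to follow once you understand the lru, the page cache, and the swap cache. They are not listed here.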