1. Why memory configuration is needed

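Hadoop provides a number of configuration parameters that let admins and users tune memory flexibly. Some parameters have default values; others are cluster-specific and exist to support memory-intensive jobs.

When building a cluster, the admin can start by setting appropriate default values. The remaining parameters are determined by the cluster's hardware configuration (such as the total physical and virtual memory available to tasks, the number of slots configured on each slave, and the requirements of other processes running on the slave) and by the job type (such as memory-intensive jobs).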

2. Memory monitoring

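(1) Purpose of monitoring task memory

To prevent a MapReduce task from consuming memory beyond a limit, which would adversely affect other processes running on the same slave, including other tasks and daemons such as the DataNode or TaskTracker.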

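(2) Virtual memory and physical memory

Hadoop can monitor both virtual memory and physical memory usage on a node, and the two kinds of monitoring are independent of each other. In streaming applications, however, the programs load libraries to run their tasks, so the recorded virtual memory usage tends to be high. In such cases, monitoring physical memory gives a more accurate picture.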

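(3) Hadoop lets a job specify the maximum amount of memory its tasks are expected to use. Through resource aware scheduling and monitoring, Hadoop tries to ensure that only as many tasks run on a slave as can satisfy both of the following constraints: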

(a) an individual job's memory requirement  

(b) the total amount of memory available for all MapReduce tasks

(4) How the TaskTracker monitors tasks

(a) Periodic monitoring

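Step 1: Check that the cumulative virtual memory and physical memory used by a task and its child processes do not exceed the specified amounts. Virtual memory is checked first, then physical memory. If a limit is exceeded, the task and its child processes are killed and the task is marked as failed.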

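Step 2: Check the cumulative virtual memory and physical memory used by all running tasks on the slave and their child processes. If a limit is exceeded, enough tasks are killed to bring cumulative usage back under the limit. (If the virtual memory limit is exceeded, the tasks that have made the least progress are killed; if the physical memory limit is exceeded, the tasks using the most physical memory are killed.) Tasks killed in this step are marked as killed.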
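
The two steps above can be sketched roughly as follows. This is only an illustration of the described logic, not the TaskTracker's actual code: the `TaskUsage` record, the `monitor_once` function, and all field names are hypothetical, and all sizes are assumed to be in MB.

```python
from dataclasses import dataclass


@dataclass
class TaskUsage:
    """Hypothetical snapshot of one task plus its child processes (sizes in MB)."""
    task_id: str
    vmem_used: int
    pmem_used: int
    vmem_limit: int
    pmem_limit: int
    progress: float  # 0.0 to 1.0


def monitor_once(tasks, total_vmem_limit, total_pmem_limit):
    """Run one monitoring cycle and return (failed_ids, killed_ids)."""
    failed, killed = [], []

    # Step 1: per-task check. The `or` evaluates virtual memory first,
    # then physical memory. Offending tasks are marked failed.
    survivors = []
    for t in tasks:
        if t.vmem_used > t.vmem_limit or t.pmem_used > t.pmem_limit:
            failed.append(t.task_id)
        else:
            survivors.append(t)

    # Step 2a: cumulative virtual memory. Kill least-progress tasks first.
    survivors.sort(key=lambda t: t.progress)
    while survivors and sum(t.vmem_used for t in survivors) > total_vmem_limit:
        killed.append(survivors.pop(0).task_id)

    # Step 2b: cumulative physical memory. Kill the largest consumers first.
    survivors.sort(key=lambda t: t.pmem_used, reverse=True)
    while survivors and sum(t.pmem_used for t in survivors) > total_pmem_limit:
        killed.append(survivors.pop(0).task_id)

    return failed, killed
```

A real monitor would repeat this cycle at a configured interval and would read memory usage from the operating system; both are omitted here.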

(5) Resource aware scheduling

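Resource aware scheduling ensures that, before a task is scheduled on a slave, the slave is confirmed to be able to meet the task's memory requirement.

The Capacity Scheduler takes virtual memory requirements into account when scheduling jobs.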
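
As a rough illustration of the check described above, a scheduler could compare the task's request against what is still uncommitted on the slave. The function name `can_schedule` and its parameters are hypothetical; this is not how the Capacity Scheduler is actually implemented, only a minimal sketch of the idea.

```python
def can_schedule(task_vmem_mb, running_vmem_mb, total_vmem_mb):
    """Return True if the slave can still fit the task's virtual memory request.

    task_vmem_mb    -- memory requested by the task to be scheduled (MB)
    running_vmem_mb -- list of requests of tasks already running on the slave (MB)
    total_vmem_mb   -- total virtual memory available for MapReduce tasks (MB)
    """
    committed = sum(running_vmem_mb)
    return committed + task_vmem_mb <= total_vmem_mb


# A slave with 4096 MB for tasks, already running two 1024 MB tasks,
# can accept one more 2048 MB task but not a second one.
print(can_schedule(2048, [1024, 1024], 4096))
print(can_schedule(2048, [1024, 1024, 2048], 4096))
```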

(6) Configuring Memory Requirements For A Job: covered in the MapReduce tutorial.

(7) Cluster-level memory configuration

These options relate to the JobTracker and TaskTracker, and no job can modify them. In addition, the values are the same on every slave.


  • mapreduce.cluster.{map|reduce}memory.mb: These options define the default amount of virtual memory that should be allocated for MapReduce tasks running in the cluster. They typically match the default values set for the options mapreduce.{map|reduce}.memory.mb. They help in the calculation of the total amount of virtual memory available for MapReduce tasks on a slave, using the following equation:
    Total virtual memory for all MapReduce tasks = (mapreduce.cluster.mapmemory.mb * mapreduce.tasktracker.map.tasks.maximum) + (mapreduce.cluster.reducememory.mb * mapreduce.tasktracker.reduce.tasks.maximum)
    Typically, reduce tasks require more memory than map tasks. Hence a higher value is recommended for mapreduce.cluster.reducememory.mb. The value is specified in MB. To set a value of 2GB for reduce tasks, set mapreduce.cluster.reducememory.mb to 2048.

  • mapreduce.jobtracker.max{map|reduce}memory.mb: These options define the maximum amount of virtual memory that can be requested by jobs using the parameters mapreduce.{map|reduce}.memory.mb. The system will reject any job that is submitted requesting more memory than these limits. Typically, the values for these options should be set to satisfy the following constraint:
    mapreduce.jobtracker.maxmapmemory.mb = mapreduce.cluster.mapmemory.mb * mapreduce.tasktracker.map.tasks.maximum
    mapreduce.jobtracker.maxreducememory.mb = mapreduce.cluster.reducememory.mb * mapreduce.tasktracker.reduce.tasks.maximum

    The value is specified in MB. If mapreduce.cluster.reducememory.mb is set to 2GB and there are 2 reduce slots configured in the slaves, the value for mapreduce.jobtracker.maxreducememory.mb should be set to 4096.

  • mapreduce.tasktracker.reserved.physicalmemory.mb: This option defines the amount of physical memory that is marked for system and daemon processes. Using this, the amount of physical memory available for MapReduce tasks is calculated using the following equation:
    Total physical memory for all MapReduce tasks = Total physical memory available on the system - mapreduce.tasktracker.reserved.physicalmemory.mb
    The value is specified in MB. To set this value to 2GB, specify the value as 2048.

  • mapreduce.tasktracker.taskmemorymanager.monitoringinterval: This option defines the time the TaskTracker waits between two cycles of memory monitoring. The value is specified in milliseconds.
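
The formulas in the list above can be checked with a few lines of arithmetic. The sketch below only evaluates those equations on a made-up configuration; the function names and the sample values (4 map slots, 2 reduce slots, 16 GB of system memory) are assumptions chosen for illustration.

```python
def total_vmem_for_tasks(conf):
    """Total virtual memory (MB) for all MapReduce tasks on a slave."""
    return (
        conf["mapreduce.cluster.mapmemory.mb"]
        * conf["mapreduce.tasktracker.map.tasks.maximum"]
        + conf["mapreduce.cluster.reducememory.mb"]
        * conf["mapreduce.tasktracker.reduce.tasks.maximum"]
    )


def recommended_job_maxima(conf):
    """Values for mapreduce.jobtracker.max{map|reduce}memory.mb per the constraint."""
    max_map = (
        conf["mapreduce.cluster.mapmemory.mb"]
        * conf["mapreduce.tasktracker.map.tasks.maximum"]
    )
    max_reduce = (
        conf["mapreduce.cluster.reducememory.mb"]
        * conf["mapreduce.tasktracker.reduce.tasks.maximum"]
    )
    return max_map, max_reduce


def total_pmem_for_tasks(system_pmem_mb, conf):
    """Physical memory (MB) left for MapReduce tasks after the reserved amount."""
    return system_pmem_mb - conf["mapreduce.tasktracker.reserved.physicalmemory.mb"]


# Made-up sample values, all sizes in MB.
conf = {
    "mapreduce.cluster.mapmemory.mb": 1024,
    "mapreduce.cluster.reducememory.mb": 2048,
    "mapreduce.tasktracker.map.tasks.maximum": 4,
    "mapreduce.tasktracker.reduce.tasks.maximum": 2,
    "mapreduce.tasktracker.reserved.physicalmemory.mb": 2048,
}

print(total_vmem_for_tasks(conf))         # 1024*4 + 2048*2 = 8192
print(recommended_job_maxima(conf))       # (4096, 4096)
print(total_pmem_for_tasks(16384, conf))  # 16384 - 2048 = 14336
```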


Note: The virtual memory monitoring function is only enabled if the variables mapreduce.cluster.{map|reduce}memory.mb and mapreduce.jobtracker.max{map|reduce}memory.mb are set to values greater than zero. Likewise, the physical memory monitoring function is only enabled if the variable mapreduce.tasktracker.reserved.physicalmemory.mb is set to a value greater than zero.
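
The rule in this note can be expressed as two small predicates. The sketch reads the note as requiring all four virtual memory options to be greater than zero, and it treats an unset option as -1 (disabled); both points are assumptions of this sketch, and the function names are hypothetical.

```python
VMEM_KEYS = [
    "mapreduce.cluster.mapmemory.mb",
    "mapreduce.cluster.reducememory.mb",
    "mapreduce.jobtracker.maxmapmemory.mb",
    "mapreduce.jobtracker.maxreducememory.mb",
]
PMEM_KEY = "mapreduce.tasktracker.reserved.physicalmemory.mb"


def vmem_monitoring_enabled(conf):
    """True only if every virtual memory option is set to a value above zero."""
    # Assumption: a missing key counts as -1, i.e. not configured.
    return all(conf.get(key, -1) > 0 for key in VMEM_KEYS)


def pmem_monitoring_enabled(conf):
    """True only if the reserved physical memory option is above zero."""
    return conf.get(PMEM_KEY, -1) > 0


conf = {
    "mapreduce.cluster.mapmemory.mb": 1024,
    "mapreduce.cluster.reducememory.mb": 2048,
    "mapreduce.jobtracker.maxmapmemory.mb": 4096,
    "mapreduce.jobtracker.maxreducememory.mb": 4096,
}
print(vmem_monitoring_enabled(conf))  # all four options are positive
print(pmem_monitoring_enabled(conf))  # reserved physical memory is unset
```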

Reposted from http://blog.csdn.net/amaowolf/article/details/7188504