While working with Hadoop I frequently need to adjust its configuration, and all kinds of questions come up: why won't the namenode start after I changed a setting? How do I get Hadoop's JVM to load an extra jar? Where do I set the log directory? Each time, answering them means carefully reading through the startup scripts, which is slow and tedious, so I've written up this summary for future reference.


Cloudera's Hadoop startup scripts are unusually complex and fragmented, with shell scripts scattered across every corner of the system, which is rather exasperating. Below I use namenode startup as an example to explain how the startup scripts call each other and what each one does.


The entry point for starting Hadoop is /etc/init.d/hadoop-hdfs-namenode. Let's follow the namenode startup sequence through Hadoop's chain of script calls.

/etc/init.d/hadoop-hdfs-namenode:
#1. Source /etc/default/hadoop and /etc/default/hadoop-hdfs-namenode

#2. Run /usr/lib/hadoop/sbin/hadoop-daemon.sh to start the namenode

Cloudera starts the namenode as the hdfs user, and the default configuration directory is /etc/hadoop/conf:



start() {
  [ -x $EXEC_PATH ] || exit $ERROR_PROGRAM_NOT_INSTALLED
  [ -d $CONF_DIR ] || exit $ERROR_PROGRAM_NOT_CONFIGURED
  log_success_msg "Starting ${DESC}: "

  su -s /bin/bash $SVC_USER -c "$EXEC_PATH --config '$CONF_DIR' start $DAEMON_FLAGS"

  # Some processes are slow to start
  sleep $SLEEP_TIME
  checkstatusofproc
  RETVAL=$?

  [ $RETVAL -eq $RETVAL_SUCCESS ] && touch $LOCKFILE
  return $RETVAL
}
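For reference, this init script is normally driven through the service wrapper. A typical invocation (assuming the stock CDH packaging; run as root) looks like:

sudo service hadoop-hdfs-namenode start    # runs the start() function above
sudo service hadoop-hdfs-namenode status   # runs checkstatusofproc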

 

/etc/default/hadoop and /etc/default/hadoop-hdfs-namenode:
#1. Configure the log dir, pid dir, and service user

/usr/lib/hadoop/sbin/hadoop-daemon.sh:
#1. Source /usr/lib/hadoop/libexec/hadoop-config.sh



DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

#2. Source hadoop-env.sh



if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi


#3. Determine the log directory



# get log directory
if [ "$HADOOP_LOG_DIR" = "" ]; then
  export HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
fi


#4. Fill in the log file name, the log4j logger settings, and related parameters



export HADOOP_LOGFILE=hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.log
export HADOOP_ROOT_LOGGER=${HADOOP_ROOT_LOGGER:-"INFO,RFA"}
export HADOOP_SECURITY_LOGGER=${HADOOP_SECURITY_LOGGER:-"INFO,RFAS"}
export HDFS_AUDIT_LOGGER=${HDFS_AUDIT_LOGGER:-"INFO,NullAppender"}
log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out
pid=$HADOOP_PID_DIR/hadoop-$HADOOP_IDENT_STRING-$command.pid
HADOOP_STOP_TIMEOUT=${HADOOP_STOP_TIMEOUT:-5}
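To make these names concrete, here is a hypothetical expansion, assuming the CDH defaults HADOOP_LOG_DIR=/var/log/hadoop-hdfs and HADOOP_PID_DIR=/var/run/hadoop-hdfs, with HADOOP_IDENT_STRING=hdfs, command=namenode, and HOSTNAME=nn1:

HADOOP_LOGFILE=hadoop-hdfs-namenode-nn1.log              # picked up by log4j's RFA appender
log=/var/log/hadoop-hdfs/hadoop-hdfs-namenode-nn1.out    # captures the process's stdout/stderr
pid=/var/run/hadoop-hdfs/hadoop-hdfs-namenode.pid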

#5. Invoke /usr/lib/hadoop-hdfs/bin/hdfs


hadoop_rotate_log $log
echo starting $command, logging to $log
cd "$HADOOP_PREFIX"
case $command in
  namenode|secondarynamenode|datanode|journalnode|dfs|dfsadmin|fsck|balancer|zkfc)
    if [ -z "$HADOOP_HDFS_HOME" ]; then
      hdfsScript="$HADOOP_PREFIX"/bin/hdfs
    else
      hdfsScript="$HADOOP_HDFS_HOME"/bin/hdfs
    fi
    nohup nice -n $HADOOP_NICENESS $hdfsScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
  ;;
  (*)
    nohup nice -n $HADOOP_NICENESS $hadoopScript --config $HADOOP_CONF_DIR $command "$@" > "$log" 2>&1 < /dev/null &
  ;;
esac
echo $! > $pid
sleep 1; head "$log"
sleep 3;
if ! ps -p $! > /dev/null ; then
  exit 1
fi


As you can see, the namenode's stdout goes to $log, i.e. log=$HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-$command-$HOSTNAME.out.
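The hadoop_rotate_log call before the launch keeps a few generations of old .out files around. Its logic is roughly the following (a sketch of the function defined in hadoop-daemon.sh; it keeps 5 generations by default):

hadoop_rotate_log () {
  log=$1; num=5
  [ -n "$2" ] && num=$2
  if [ -f "$log" ]; then
    # shift .out.4 -> .out.5, ..., .out.1 -> .out.2, then .out -> .out.1
    while [ $num -gt 1 ]; do
      prev=`expr $num - 1`
      [ -f "$log.$prev" ] && mv "$log.$prev" "$log.$num"
      num=$prev
    done
    mv "$log" "$log.$num"
  fi
}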

/usr/lib/hadoop/libexec/hadoop-config.sh:
#1. Source /usr/lib/hadoop/libexec/hadoop-layout.sh
hadoop-layout.sh mainly describes the layout of Hadoop's lib directories; its main content is:



HADOOP_COMMON_DIR="./"
HADOOP_COMMON_LIB_JARS_DIR="lib"
HADOOP_COMMON_LIB_NATIVE_DIR="lib/native"
HDFS_DIR="./"
HDFS_LIB_JARS_DIR="lib"
YARN_DIR="./"
YARN_LIB_JARS_DIR="lib"
MAPRED_DIR="./"
MAPRED_LIB_JARS_DIR="lib"

HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-"/usr/lib/hadoop/libexec"}
HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop/conf"}
HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-"/usr/lib/hadoop"}
HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-"/usr/lib/hadoop-hdfs"}
HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-"/usr/lib/hadoop-0.20-mapreduce"}
YARN_HOME=${YARN_HOME:-"/usr/lib/hadoop-yarn"}


#2. Set the HDFS and YARN lib directories



HADOOP_COMMON_DIR=${HADOOP_COMMON_DIR:-"share/hadoop/common"}
HADOOP_COMMON_LIB_JARS_DIR=${HADOOP_COMMON_LIB_JARS_DIR:-"share/hadoop/common/lib"}
HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_COMMON_LIB_NATIVE_DIR:-"lib/native"}
HDFS_DIR=${HDFS_DIR:-"share/hadoop/hdfs"}
HDFS_LIB_JARS_DIR=${HDFS_LIB_JARS_DIR:-"share/hadoop/hdfs/lib"}
YARN_DIR=${YARN_DIR:-"share/hadoop/yarn"}
YARN_LIB_JARS_DIR=${YARN_LIB_JARS_DIR:-"share/hadoop/yarn/lib"}
MAPRED_DIR=${MAPRED_DIR:-"share/hadoop/mapreduce"}
MAPRED_LIB_JARS_DIR=${MAPRED_LIB_JARS_DIR:-"share/hadoop/mapreduce/lib"}

# the root of the Hadoop installation
# See HADOOP-6255 for directory structure layout
HADOOP_DEFAULT_PREFIX=$(cd -P -- "$common_bin"/.. && pwd -P)
HADOOP_PREFIX=${HADOOP_PREFIX:-$HADOOP_DEFAULT_PREFIX}
export HADOOP_PREFIX

#3. Handle the slaves file. Note, though, that CDH's Hadoop does not rely on the slaves file to start the cluster; you have to write your own cluster startup script (perhaps to nudge users toward Cloudera Manager...).

#4. Source the env file again



if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi

#5. Determine JAVA_HOME



# Attempt to set JAVA_HOME if it is not set
if [[ -z $JAVA_HOME ]]; then
  # On OSX use java_home (or /Library for older versions)
  if [ "Darwin" == "$(uname -s)" ]; then
    if [ -x /usr/libexec/java_home ]; then
      export JAVA_HOME=($(/usr/libexec/java_home))
    else
      export JAVA_HOME=(/Library/Java/Home)
    fi
  fi

  # Bail if we did not detect it
  if [[ -z $JAVA_HOME ]]; then
    echo "Error: JAVA_HOME is not set and could not be found." 1>&2
    exit 1
  fi
fi
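On Linux there is no auto-detection (the branch above only handles OSX), so if JAVA_HOME is unset the script simply bails out. In practice it has to be set explicitly, most conveniently in hadoop-env.sh. The path below is a placeholder:

# in /etc/hadoop/conf/hadoop-env.sh (placeholder path, adjust to your JDK):
export JAVA_HOME=/usr/java/jdk1.7.0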


#6. Set the heap size for the Java process. If HADOOP_HEAPSIZE is set in hadoop-env.sh, it overrides the default of 1000m.



# some Java parameters
JAVA_HEAP_MAX=-Xmx1000m

# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
  #echo "run with heapsize $HADOOP_HEAPSIZE"
  JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
  #echo $JAVA_HEAP_MAX
fi
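For example, putting the following in hadoop-env.sh makes every Hadoop daemon start with -Xmx2000m instead of the default (the value is illustrative):

# in /etc/hadoop/conf/hadoop-env.sh; the unit is megabytes:
export HADOOP_HEAPSIZE=2000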


#7. Build the classpath. It's a long stretch of code, but it boils down to concatenating:

HADOOP_CONF_DIR + HADOOP_CLASSPATH + HADOOP_COMMON_DIR + HADOOP_COMMON_LIB_JARS_DIR +
HADOOP_COMMON_LIB_NATIVE_DIR + HDFS_DIR + HDFS_LIB_JARS_DIR +
YARN_DIR + YARN_LIB_JARS_DIR + MAPRED_DIR + MAPRED_LIB_JARS_DIR

One thing worth noting: Hadoop thoughtfully provides the HADOOP_USER_CLASSPATH_FIRST property. If it is set, HADOOP_CLASSPATH (the user-defined classpath) is placed before Hadoop's own jars, for cases where users need their own jars loaded first.
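Schematically, the ordering works like this (a simplified sketch of the hadoop-config.sh logic, not the verbatim code):

CLASSPATH="${HADOOP_CONF_DIR}"
# user jars first, if requested
if [ -n "$HADOOP_USER_CLASSPATH_FIRST" ] && [ -n "$HADOOP_CLASSPATH" ]; then
  CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
fi
# ... Hadoop's own common/hdfs/yarn/mapred jars are appended here ...
# user jars last, the default
if [ -z "$HADOOP_USER_CLASSPATH_FIRST" ] && [ -n "$HADOOP_CLASSPATH" ]; then
  CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
fi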

#8. Build HADOOP_OPTS. Parameters like -Dhadoop.log.dir are referenced by the log4j configuration under conf.



HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.file=$HADOOP_LOGFILE"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.home.dir=$HADOOP_PREFIX"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
  HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_LIBRARY_PATH
fi
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.policy.file=$HADOOP_POLICYFILE"

# Disable ipv6 as it can cause issues
HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

/usr/lib/hadoop-hdfs/bin/hdfs:
#1. Source /usr/lib/hadoop/libexec/hdfs-config.sh, which doesn't seem to do much here

#2. Pick the Java main class based on the startup argument:


      1. if [ "$COMMAND" = "namenode" ] ; then  
      2.   CLASS='org.apache.hadoop.hdfs.server.namenode.NameNode'  
      3.   HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"  
      4.

#3. Launch the Java process
exec "$JAVA" -Dproc_$COMMAND $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"

To finish, here are a few small configuration examples.

1. How to set Hadoop's log directory:

From the startup scripts, the precedence of the configuration sources is hadoop-env.sh > hadoop-config.sh > /etc/default/hadoop, so to set the log directory we only need to add one line to hadoop-env.sh:

export HADOOP_LOG_DIR=xxxxx
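For instance (the path is a placeholder; the directory must exist and be writable by the hdfs user):

# in /etc/hadoop/conf/hadoop-env.sh:
export HADOOP_LOG_DIR=/data/hadoop/logs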

2. How to add your own jars so the namenode and datanode load them:

export HADOOP_CLASSPATH=xxxxx
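A concrete sketch, with a placeholder jar path:

# in /etc/hadoop/conf/hadoop-env.sh (my-extra.jar is hypothetical):
export HADOOP_CLASSPATH=/opt/lib/my-extra.jar:$HADOOP_CLASSPATH
# optionally, load user jars before Hadoop's own:
export HADOOP_USER_CLASSPATH_FIRST=true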

3. How to set the namenode's Java heap size independently.

Say you want 10G for the namenode and 1G for the datanode; this one is more interesting. If you set HADOOP_HEAPSIZE directly, it applies to the namenode and the datanode alike. Setting the heap through the namenode-specific opts instead has a small wrinkle (both -Xmx values end up on the command line, with the later one taking effect), but it basically works.
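A sketch of that approach in hadoop-env.sh (values illustrative):

# default heap for all daemons, in MB -- the datanode gets -Xmx1000m:
export HADOOP_HEAPSIZE=1000
# appended after -Xmx1000m on the namenode's command line; the later -Xmx wins:
export HADOOP_NAMENODE_OPTS="-Xmx10g $HADOOP_NAMENODE_OPTS"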


In short, Hadoop's startup scripts are numerous and fragmented, and since the HBase and Hive startup scripts follow a similar structure, adding or changing configuration can produce all sorts of puzzling problems. Hopefully this walkthrough helps you get a feel for them as you use Hadoop.