最近使用zookeeper发现, 首次启动zookeeper时, 都会遇到一个错误:
$ bin/zkServer.sh start
JMX enabled by default
Using config: /home/nauhcud/workspace/zookeeper/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... bin/zkServer.sh: 162: cannot create /tmp/zookeeper/zookeeper_server.pid: Directory nonexistent
FAILED TO WRITE PID
查看进程发现zookeeper进程存在, 并可以正常使用. dataDir "/tmp/zookeeper/" 也在.
kill掉zookeeper进程重新启动, 一切正常.
删掉dataDir, 问题依旧.
查看启动脚本, 发现start逻辑如下:
start)
...
nohup $JAVA "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
-cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null &
if [ $? -eq 0 ]
then
if /bin/echo -n $! > "$ZOOPIDFILE"
then
sleep 1
echo STARTED
else
echo FAILED TO WRITE PID
#这句说明写入Pid出现了问题
exit 1
fi
...
纵观整个脚本, dataDir没有出现, 因此, dataDir应该是zookeeper进程内部建立的, 并且有一定延迟, 因此将zookeeper进程id写入到dataDir下的pidfile时, dataDir还没有建立好, 因此就出现了上述情况.
解决办法很简单, 在写入pid之前先判断一下datadir是否存在, 让zookeeper有时间做完初始化, 然后再将pid写入即可.
start)
...
nohup $JAVA "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
-cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null &
zkpid=$!;
if [ $? -eq 0 ]
then
while [ ! -d `dirname $ZOOPIDFILE` ]
do
sleep 1;
done
if /bin/echo -n $zkpid > "$ZOOPIDFILE"
then
sleep 1
echo STARTED
else
echo FAILED TO WRITE PID
exit 1
fi
...