Preface

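If you are new to big data, read the previous article, 《前戏》, before this chapter; it will make the material easier to follow. This chapter explains how to configure HDFS.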

Editing core-site.xml

The file I edited is /home/kfk/hadoop-2.8.0/etc/hadoop/core-site.xml


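<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://bigdata-pro01.kfk.com:9050</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/kfk/hadoop-2.8.0/data/tmp</value>
    </property>
</configuration>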

fs.defaultFS

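This property sets the URI of the default file system. The hdfs:// scheme indicates the distributed file system (HDFS), bigdata-pro01.kfk.com is the hostname, and 9050 is the port. Examples commonly use port 9000, but 9000 conflicted with a port used by python on my machine, so I changed it to 9050. The only rule when choosing a port is that it must not conflict with another process.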
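To make the three parts of the value concrete, here is a minimal sketch that splits an fs.defaultFS URI into scheme, hostname, and port using Python's standard library. The helper name `parse_default_fs` is made up for illustration; it is not part of Hadoop.

```python
from urllib.parse import urlsplit


def parse_default_fs(uri):
    """Split an fs.defaultFS value into (scheme, hostname, port).

    port is None when the URI does not specify one.
    """
    parts = urlsplit(uri)
    return parts.scheme, parts.hostname, parts.port


# The value used in this article's core-site.xml
scheme, host, port = parse_default_fs("hdfs://bigdata-pro01.kfk.com:9050")
print(scheme, host, port)
```

A check like this can catch a typo in the URI before the NameNode is started.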

hadoop.tmp.dir

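This sets Hadoop's temporary directory. I recommend placing it under the kfk user's home directory. Unless configured otherwise, the NameNode and DataNode storage directories default to paths under this directory.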
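As a quick sanity check on the two properties above, the following sketch reads the name/value pairs out of a core-site.xml document. It assumes the standard `<configuration>/<property>/<name>/<value>` layout; the `read_properties` helper is hypothetical, and the XML is embedded as a string so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Same two properties as configured in this article
CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://bigdata-pro01.kfk.com:9050</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/kfk/hadoop-2.8.0/data/tmp</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return a dict mapping each property name to its value."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


props = read_properties(CORE_SITE)
for name, value in props.items():
    print(name, "=", value)
```

To check the real file, the text could be read from /home/kfk/hadoop-2.8.0/etc/hadoop/core-site.xml instead of the embedded string.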

Formatting

/home/kfk/hadoop-2.8.0/bin/hdfs namenode -format

Starting the NameNode

/home/kfk/hadoop-2.8.0/sbin/hadoop-daemon.sh start namenode

Starting the DataNode

/home/kfk/hadoop-2.8.0/sbin/hadoop-daemon.sh start datanode

Verifying with jps

jps is the JVM process status tool that ships with the JDK. If its output lists both DataNode and NameNode, the daemons started normally.
For example:
[kfk@bigdata-pro01 sbin]$ jps
30963 DataNode
30134 NameNode
19742 Jps
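The same check can be scripted. This is a minimal sketch that parses jps-style output (one "pid name" pair per line) and reports which expected daemons are missing. The `running_daemons` helper is hypothetical, and the sample text is the output shown above.

```python
def running_daemons(jps_output):
    """Return the set of process names found in jps output."""
    names = set()
    for line in jps_output.splitlines():
        fields = line.split()
        # A valid line looks like "<pid> <name>"
        if len(fields) >= 2 and fields[0].isdigit():
            names.add(fields[1])
    return names


SAMPLE = """30963 DataNode
30134 NameNode
19742 Jps
"""

missing = {"NameNode", "DataNode"} - running_daemons(SAMPLE)
print("missing daemons:", sorted(missing))
```

On a live machine, the sample string could be replaced with the output of `subprocess.run(["jps"], capture_output=True, text=True).stdout`.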

Accessing the NameNode in a browser

[Screenshot: NameNode web page]

Viewing the DataNode

[Screenshot: DataNode view]

Odd problems I ran into

NameNode failed to start

After investigating, I found that the configured port 9000 was already in use by a python process, which prevented the NameNode from starting.
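A conflict like this can be detected before starting the NameNode. The sketch below tries to bind a TCP socket to the port and treats a failure as "in use". The `port_is_free` helper is hypothetical, and it only probes the loopback address by default, so take it as a rough check and not a guarantee.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "Address already in use"
            return False
        return True


# The two ports discussed in this article; results depend on the machine
for candidate in (9000, 9050):
    print(candidate, "free" if port_is_free(candidate) else "in use")
```

If the port is reported as in use, either stop the process that holds it or pick another port in fs.defaultFS, as I did with 9050.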

DataNode failed to start

************************************************************/
2022-11-11 17:44:41,482 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
2022-11-11 17:44:42,001 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2022-11-11 17:44:42,073 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2022-11-11 17:44:42,073 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2022-11-11 17:44:42,078 INFO org.apache.hadoop.hdfs.server.datanode.BlockScanner: Initialized block scanner with targetBytesPerSec 1048576
2022-11-11 17:44:42,078 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Configured hostname is bigdata-pro01.kfk.com
2022-11-11 17:44:42,081 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting DataNode with maxLockedMemory = 0
2022-11-11 17:44:42,096 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened streaming server at /0.0.0.0:50010
2022-11-11 17:44:42,097 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 10485760 bytes/s
2022-11-11 17:44:42,098 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Number threads for balancing is 50
2022-11-11 17:44:42,193 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2022-11-11 17:44:42,198 INFO org.apache.hadoop.security.authentication.server.AuthenticationFilter: Unable to initialize FileSignerSecretProvider, falling back to use random secrets.
2022-11-11 17:44:42,202 INFO org.apache.hadoop.http.HttpRequestLog: Http request log for http.requests.datanode is not defined
2022-11-11 17:44:42,206 INFO org.apache.hadoop.http.HttpServer2: Added global filter ‘safety’ (class=org.apache.hadoop.http.HttpServer2StaticUserFilter) to context datanode
2022-11-11 17:44:42,207 INFO org.apache.hadoop.http.HttpServer2: Added filter static_user_filter (class=org.apache.hadoop.http.lib.StaticUserWebFilterStaticUserFilter) to context static
2022-11-11 17:44:42,455 INFO org.apache.hadoop.http.HttpServer2: HttpServer.start() threw a non Bind IOException
java.net.BindException: Port in use: localhost:0
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:995)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:932)
at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.<init>(DatanodeHttpServer.java:131)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:905)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1295)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:481)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2601)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2489)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2536)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2721)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2745)
Caused by: java.net.BindException: 无法指定被请求的地址
at java.base/sun.nio.ch.Net.bind0(Native Method)
at java.base/sun.nio.ch.Net.bind(Net.java:555)
at java.base/sun.nio.ch.ServerSocketChannelImpl.netBind(ServerSocketChannelImpl.java:337)
at java.base/sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:294)
at java.base/sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:89)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:990)
... 10 more
2022-11-11 17:44:42,465 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Shutdown complete.
2022-11-11 17:44:42,465 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.net.BindException: Port in use: localhost:0
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:995)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:932)
at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.<init>(DatanodeHttpServer.java:131)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:905)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1295)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:481)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2601)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2489)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2536)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2721)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2745)
Caused by: java.net.BindException: 无法指定被请求的地址
at java.base/sun.nio.ch.Net.bind0(Native Method)
at java.base/sun.nio.ch.Net.bind(Net.java:555)
at java.base/sun.nio.ch.ServerSocketChannelImpl.netBind(ServerSocketChannelImpl.java:337)
at java.base/sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:294)
at java.base/sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:89)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:990)
... 10 more
2022-11-11 17:44:42,469 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2022-11-11 17:44:42,472 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

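This problem troubled me for two days. I searched a lot online, and most posts said the cause was running the hdfs format more than once and that deleting the data directory would fix it, but none of that helped. I then read the DataNode log carefully, and one line caught my attention: "java.net.BindException: Port in use: localhost:0". I had earlier commented out the localhost entries in /etc/hosts, so I wondered whether that was preventing the DataNode from binding to localhost. The nested "Caused by" line, whose localized message means "Cannot assign requested address", is consistent with this: without an active localhost entry, the name did not resolve to an address the DataNode could bind to. I uncommented the localhost entries in /etc/hosts and restarted the DataNode, and it started without errors.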
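To spot this kind of mistake quickly, here is a small sketch that lists the hostnames defined on active (non-commented) lines of a hosts file and checks whether localhost is among them. The `active_host_names` helper is hypothetical, and the IP address 192.168.1.10 in the samples is made up for illustration; it does not come from my machine.

```python
def active_host_names(hosts_text):
    """Return hostnames defined on non-commented lines of a hosts file."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        names.update(fields[1:])  # fields[0] is the IP address
    return names


# Hypothetical file contents: the broken state and the repaired state
BROKEN = """#127.0.0.1 localhost
192.168.1.10 bigdata-pro01.kfk.com
"""
FIXED = """127.0.0.1 localhost
192.168.1.10 bigdata-pro01.kfk.com
"""

for label, text in (("broken", BROKEN), ("fixed", FIXED)):
    ok = "localhost" in active_host_names(text)
    print(label, "-> localhost defined:", ok)
```

To check the real file, pass in the text read from /etc/hosts instead of the sample strings.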