1. Check that the cluster daemons started
- 1. jps    # list the running Java processes on each node
2. Use the web UIs
hdfs: http://hadoop01:50070    # hadoop01 is the hostname mapped to the master (NameNode) node
yarn: http://hadoop02:8088     # hadoop02 is the hostname of the node configured as the YARN ResourceManager
3. Run a test job (for example, the sketch below)
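A minimal smoke test, assuming the bundled MapReduce examples jar under $HADOOP_HOME (the exact jar name depends on your Hadoop version, so adjust the path):
    [hadoop@hadoop01 ~]$ hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10    # submit a small Pi-estimation job to YARN
If HDFS and YARN are healthy, the job appears on the ResourceManager UI at http://hadoop02:8088 and prints an estimate of Pi.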
2. Start / stop the cluster
- 1. Start each daemon individually
hadoop-daemon.sh start(stop) namenode/datanode/secondarynamenode
yarn-daemon.sh start(stop) resourcemanager/nodemanager
2. Start the whole cluster (involves inter-node communication, so passwordless SSH login must be configured; see the example sequence at the end of this section)
start-dfs.sh (stop-dfs.sh)    # run on the master (NameNode) node
start-yarn.sh (stop-yarn.sh)  # must be run on the ResourceManager node (so here it is run on hadoop02)
3. Start everything at once
start-all.sh(stop-all.sh)
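For reference, a minimal start-and-verify sequence under the layout used in this note (NameNode on hadoop01, ResourceManager on hadoop02):
    [hadoop@hadoop01 ~]$ start-dfs.sh     # starts the NameNode, DataNodes and SecondaryNameNode over SSH
    [hadoop@hadoop02 ~]$ start-yarn.sh    # starts the ResourceManager locally and the NodeManagers over SSH
    [hadoop@hadoop01 ~]$ jps              # should list NameNode (plus DataNode if one is co-located here)
    [hadoop@hadoop02 ~]$ jps              # should list ResourceManager and NodeManager
Stop in the reverse order with stop-yarn.sh (on hadoop02) and then stop-dfs.sh (on hadoop01).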
3. Shell operations
- hadoop fs    # run a generic filesystem command
- hdfs dfs     # run a filesystem command; equivalent to hadoop fs
[hadoop@hadoop02 ~]$ hadoop
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
CLASSNAME run the class named CLASSNAME
or
where COMMAND is one of:
fs run a generic filesystem user client
version print the version
jar <jar> run a jar file
note: please use "yarn jar" to launch
YARN applications, not this command.
checknative [-a|-h] check native hadoop and compression libraries availability
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
classpath prints the class path needed to get the Hadoop jar and the required libraries
credential interact with credential providers
daemonlog get/set the log level for each daemon
trace view and modify Hadoop tracing settings
Most commands print help when invoked w/o parameters.
Note: when you are not sure what a command can do, just type its first word (e.g. hdfs) and press Enter, then follow the printed usage step by step. For example, the usage below lists dfs ("run a filesystem command on the file systems supported in Hadoop"); then run hdfs dfs to see the next level of options and what each one means.
[hadoop@hadoop02 ~]$ hdfs
Usage: hdfs [--config confdir] [--loglevel loglevel] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop.
classpath prints the classpath
namenode -format format the DFS filesystem
secondarynamenode run the DFS secondary namenode
namenode run the DFS namenode
journalnode run the DFS journalnode
zkfc run the ZK Failover Controller daemon
datanode run a DFS datanode
dfsadmin run a DFS admin client
haadmin run a DFS HA admin client
fsck run a DFS filesystem checking utility
balancer run a cluster balancing utility
jmxget get JMX exported values from NameNode or DataNode.
mover run a utility to move block replicas across
storage types
oiv apply the offline fsimage viewer to an fsimage
oiv_legacy apply the offline fsimage viewer to an legacy fsimage
oev apply the offline edits viewer to an edits file
fetchdt fetch a delegation token from the NameNode
getconf get config values from configuration
groups get the groups which users belong to
snapshotDiff diff two snapshots of a directory or diff the
current directory contents with a snapshot
lsSnapshottableDir list all snapshottable dirs owned by the current user
Use -help to see options
portmap run a portmap service
nfs3 run an NFS version 3 gateway
cacheadmin configure the HDFS cache
crypto configure HDFS encryption zones
storagepolicies list/get/set block storage policies
version print the version
Most commands print help when invoked w/o parameters.
[hadoop@hadoop02 ~]$ hdfs dfs
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
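The generic options above can be combined with any fs subcommand. Two illustrative examples (the paths are placeholders; hdfs://hadoop01:9000 is the cluster address used elsewhere in this note):
    hadoop fs -fs hdfs://hadoop01:9000 -ls /                 # point the client at an explicit NameNode
    hadoop fs -D dfs.replication=2 -put ./jdk.tar.gz /aaa/   # override a property for this single command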
Common commands:
-help    # print the usage manual for a command
    [hadoop@hadoop02 ~]$ hadoop
    [hadoop@hadoop02 ~]$ hadoop -help
    [hadoop@hadoop02 ~]$ hadoop fs -help
    [hadoop@hadoop02 ~]$ hadoop fs -help ls
hdfs dfsadmin -report                      # report the status of the whole cluster
hdfs dfs -setrep 2 /output/part-r-00000    # change the replication factor of a file to 2
hadoop fs -count /output/                  # count the directories, files and bytes under a given path
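To confirm that -setrep took effect, the -stat command from the usage listing above can print the current replication factor (the path is just the example reused from the previous entry):
    hdfs dfs -stat %r /output/part-r-00000    # prints the file's replication factor, e.g. 2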
hdfs getconf -namenodes                           # list the NameNode host(s)
hdfs getconf -confKey [name]                      # get the value of the configuration key [name], for example:
    hdfs getconf -confKey fs.defaultFS            # entry address of the HDFS cluster (the NameNode handles client requests and responses)
    hdfs getconf -confKey hadoop.tmp.dir          # storage directory for temporary files
    hdfs getconf -confKey dfs.replication         # replication factor
    hdfs getconf -confKey dfs.blocksize           # size of each block
    hdfs getconf -confKey dfs.heartbeat.interval  # DataNode heartbeat interval
    hdfs getconf -confKey heartbeat.recheck.interval
    hdfs getconf -confKey dfs.namenode.heartbeat.recheck-interval
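These keys are handy in scripts. For example, a small sketch (assuming a bash shell on a client node) that lists the HDFS root without hard-coding the NameNode address:
    hadoop fs -fs "$(hdfs getconf -confKey fs.defaultFS)" -ls /    # fs.defaultFS resolves to e.g. hdfs://hadoop01:9000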
hdfs dfs -ls /       or   hdfs dfs -ls hdfs://hadoop01:9000/       # list all files under the HDFS root directory
hdfs dfs -ls -R /    or   hdfs dfs -ls -R hdfs://hadoop01:9000/    # list all files recursively
-mkdir    hdfs dfs -mkdir -p /aa/bb/cc/dd    # create a directory on HDFS (-p allows creating multi-level/parent directories)
-cp hdfs dfs -cp /aaa/jdk.tar.gz /bbb/jdk.tar.gz.2
-mv hdfs dfs -mv /aaa/jdk.tar.gz /
-rm / -rmdir
    hdfs dfs -rm -r /aaa/bbb/       # delete a directory (or file) recursively
    hdfs dfs -rmdir /aaa/bbb/ccc    # remove an empty directory
-put and -copyFromLocal    # upload files to HDFS, e.g.:
    hdfs dfs -put ./jdk.tar.gz /aaa/
-get    hdfs dfs -get /aaa/jdk.tar.gz /home/hadoop/aaa/jdk.tar.gz    # download a file from HDFS to the local filesystem
-getmerge    # merge multiple files while downloading; e.g. the HDFS directory /aaa/ contains several files: log.1, log.2, log.3, ...
    hdfs dfs -getmerge /aaa/log.* ./log.sum
-appendToFile <localsrc> ... <dst>    # append a local file to the end of an existing HDFS file
    hdfs dfs -appendToFile ./hello.txt /hello.txt
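A quick way to try the append (hello.txt is just the local test file named above, and /hello.txt is assumed to already exist on HDFS):
    echo "hello hdfs" > hello.txt                    # create a small local file
    hdfs dfs -appendToFile ./hello.txt /hello.txt    # append it to the HDFS file
    hdfs dfs -cat /hello.txt                         # check the result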
-moveFromLocal / -moveToLocal
    hdfs dfs -moveFromLocal /home/hadoop/a.txt /aa/bb/cc/dd
    hdfs dfs -moveToLocal /aa/bb/cc/dd /home/hadoop/a.txt
-copyFromLocal / -copyToLocal
    hdfs dfs -copyFromLocal ./jdk.tar.gz /aaa/
    hdfs dfs -copyToLocal /aaa/jdk.tar.gz
-text    # print a file's contents in text form, similar to -cat and -tail
    hdfs dfs -text /output/part-r-00000
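Putting several of these together, a short end-to-end sketch (the directory /aa/bb/cc/dd and the file /home/hadoop/a.txt are just the example names reused from the entries above; a.txt.bak is an arbitrary local name):
    hdfs dfs -mkdir -p /aa/bb/cc/dd                  # create the target directory
    hdfs dfs -put /home/hadoop/a.txt /aa/bb/cc/dd    # upload a local file
    hdfs dfs -ls -R /aa                              # check that it arrived
    hdfs dfs -cat /aa/bb/cc/dd/a.txt                 # print its contents
    hdfs dfs -get /aa/bb/cc/dd/a.txt ./a.txt.bak     # download it back under a new local name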