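Every problem I ran into here can be looked up online.

Prerequisites

You have already created 3 virtual machines.
Each virtual machine has the JDK and Hadoop installed.

Startup

Power on all 3 virtual machines, then run the following on the master node:

start-dfs.sh
start-yarn.sh

The virtual machine prints: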

zhihaojiang@linux-24-10:~$ start-dfs.sh
Starting namenodes on [linux-24-10]
Starting datanodes
Starting secondary namenodes [linux-24-10]
2025-09-24 05:27:13,484 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

zhihaojiang@linux-24-10:~$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers

You can ignore the warning.

Then run jps.
The master node prints:

zhihaojiang@linux-24-10:~$ jps
8245 DataNode
8071 NameNode
8969 NodeManager
8633 ResourceManager
8445 SecondaryNameNode
9134 Jps

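What matters is that these five processes are present: DataNode, NameNode, NodeManager, ResourceManager and SecondaryNameNode.

A worker node prints: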

zhihaojiang@linux-24-10-node4:~$ jps
3675 NodeManager
3516 DataNode
3806 Jps

If all of these are present, your Hadoop cluster has basically started successfully.
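The jps check above can also be scripted. The following is only a minimal sketch in Python: the helper name missing_daemons and the two sets of expected daemons are my own illustrative choices (based on the jps listings in this post), not part of Hadoop.

```python
# Hypothetical helper, not part of Hadoop: compares `jps` output with the
# daemons this post expects on a master node and on a worker node.
MASTER_DAEMONS = {"NameNode", "DataNode", "SecondaryNameNode",
                  "ResourceManager", "NodeManager"}
WORKER_DAEMONS = {"DataNode", "NodeManager"}


def missing_daemons(jps_output, expected):
    """Return the expected daemon names that do not appear in the jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # A jps line looks like "<pid> <ClassName>"; skip anything else,
        # for example a pasted shell prompt.
        if len(parts) == 2 and parts[0].isdigit():
            running.add(parts[1])
    return set(expected) - running


master_output = """8245 DataNode
8071 NameNode
8969 NodeManager
8633 ResourceManager
8445 SecondaryNameNode
9134 Jps"""

print(missing_daemons(master_output, MASTER_DAEMONS))  # prints set() when nothing is missing
```

To use it on a real node, pass in the text captured from running jps there, with WORKER_DAEMONS for the worker nodes.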

Check that the deployment is healthy

On the master node, run:

hdfs dfsadmin -report

It prints:

zhihaojiang@linux-24-10:~$ hdfs dfsadmin -report
2025-09-24 05:31:46,974 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 31392067584 (29.24 GB)
Present Capacity: 7492927488 (6.98 GB)
DFS Remaining: 7492829184 (6.98 GB)
DFS Used: 98304 (96 KB)
DFS Used%: 0.00%
Replicated Blocks:
	Under replicated blocks: 0
	Blocks with corrupt replicas: 0
	Missing blocks: 0
	Missing blocks (with replication factor 1): 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0
Erasure Coded Block Groups: 
	Low redundancy block groups: 0
	Block groups with corrupt internal blocks: 0
	Missing block groups: 0
	Low redundancy blocks with highest priority to recover: 0
	Pending deletion blocks: 0

-------------------------------------------------
Live datanodes (3):

Name: 172.16.79.100:9866 (linux-24-10-node2)
Hostname: linux-24-10-node2
Decommission Status : Normal
Configured Capacity: 10464022528 (9.75 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 7416832000 (6.91 GB)
DFS Remaining: 2493509632 (2.32 GB)
DFS Used%: 0.00%
DFS Remaining%: 23.83%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Wed Sep 24 05:31:45 UTC 2025
Last Block Report: Wed Sep 24 05:27:09 UTC 2025
Num of Blocks: 0


Name: 172.16.79.120:9866 (linux-24-10-node4)
Hostname: linux-24-10-node4
Decommission Status : Normal
Configured Capacity: 10464022528 (9.75 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 7425851392 (6.92 GB)
DFS Remaining: 2484490240 (2.31 GB)
DFS Used%: 0.00%
DFS Remaining%: 23.74%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Wed Sep 24 05:31:45 UTC 2025
Last Block Report: Wed Sep 24 05:27:09 UTC 2025
Num of Blocks: 0


Name: 172.16.79.129:9866 (linux-24-10)
Hostname: linux-24-10
Decommission Status : Normal
Configured Capacity: 10464022528 (9.75 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 7395512320 (6.89 GB)
DFS Remaining: 2514829312 (2.34 GB)
DFS Used%: 0.00%
DFS Remaining%: 24.03%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 0
Last contact: Wed Sep 24 05:31:45 UTC 2025
Last Block Report: Wed Sep 24 05:27:09 UTC 2025
Num of Blocks: 0

The line

Live datanodes (3):

means that all three of your machines are alive.
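If you would rather not read the whole report by eye, the two interesting facts (the live count and the hostnames) can be pulled out with a couple of regular expressions. This is a rough sketch; the function names are hypothetical, and it only assumes the report format shown above.

```python
import re


def live_datanode_count(report):
    """Return N from the 'Live datanodes (N):' line, or None if that line is absent."""
    match = re.search(r"^Live datanodes \((\d+)\):", report, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def datanode_hostnames(report):
    """Return the values of all 'Hostname: ...' lines, in report order."""
    return re.findall(r"^Hostname: (\S+)", report, flags=re.MULTILINE)


# Shortened excerpt of the report shown in this post.
report_excerpt = """Live datanodes (3):

Name: 172.16.79.100:9866 (linux-24-10-node2)
Hostname: linux-24-10-node2
Decommission Status : Normal

Name: 172.16.79.120:9866 (linux-24-10-node4)
Hostname: linux-24-10-node4
Decommission Status : Normal

Name: 172.16.79.129:9866 (linux-24-10)
Hostname: linux-24-10
Decommission Status : Normal
"""

print(live_datanode_count(report_excerpt))
print(datanode_hostnames(report_excerpt))
```

On a real cluster the report text could come from running hdfs dfsadmin -report through subprocess.run with capture_output=True and text=True.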
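Alternatively, open the following in a browser:

<VM IP>:8088
<VM IP>:9870

You will see two web interfaces: the YARN ResourceManager UI on port 8088 and the HDFS NameNode UI on port 9870.

In the interface on port 9870, to the right of Overview, you will see your virtual machine's name followed by (active).
If it shows (✅active), the startup succeeded.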

Check that jobs run correctly

Let's create a file:

zhihaojiang@linux-24-10:~$ nano input.txt # vim works too; I just prefer nano

Write anything you like in the file, for example:

This is a dfs test.
Another line with dfs.
No match here.
dfs appears again.

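Then push it to Hadoop. First create the target directory on HDFS:

# Replace the path with your own
zhihaojiang@linux-24-10:~$ hdfs dfs -mkdir -p /user/zhihaojiang/grep/input

Explanation of the command above

hdfs dfs
Invokes Hadoop's HDFS file system client, which is used to operate on HDFS.

-mkdir
The HDFS command for creating a directory; it corresponds to mkdir on Linux.

-p
Creates any missing parent directories as well.

/user/zhihaojiang/grep/input
The full path of the directory you want to create on HDFS. Remember to replace it with your own path.

Run

# Replace the path with your own
hdfs dfs -put input.txt /user/zhihaojiang/grep/input/

# Replace the path with your own
hdfs dfs -ls /user/zhihaojiang/grep/input

Output: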

zhihaojiang@linux-24-10:~$ hdfs dfs -put input.txt /user/zhihaojiang/grep/input/
2025-09-24 05:56:03,010 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
zhihaojiang@linux-24-10:~$ hdfs dfs -ls /user/zhihaojiang/grep/input
2025-09-24 05:56:10,664 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   3 zhihaojiang supergroup         77 2025-09-24 05:56 /user/zhihaojiang/grep/input/input.txt

The line to look at is:

-rw-r--r--   3 zhihaojiang supergroup         77 2025-09-24 05:56 /user/zhihaojiang/grep/input/input.txt

It confirms that the file exists and that its replication factor is 3 (the second column).
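To make the meaning of each column explicit, here is a small sketch that splits one line of hdfs dfs -ls output into named fields. The function name and the dictionary keys are hypothetical; the column order is taken from the listing above.

```python
def parse_ls_line(line):
    """Split one `hdfs dfs -ls` entry into named fields (illustrative helper)."""
    # Columns: permissions, replication, owner, group, size, date, time, path.
    fields = line.split(None, 7)
    if len(fields) != 8:
        raise ValueError("not an ls entry: %r" % line)
    permissions, replication, owner, group, size, date, time, path = fields
    return {
        "permissions": permissions,
        # Directories show "-" instead of a replication factor.
        "replication": None if replication == "-" else int(replication),
        "owner": owner,
        "group": group,
        "size_bytes": int(size),
        "modified": date + " " + time,
        "path": path,
    }


entry = parse_ls_line(
    "-rw-r--r--   3 zhihaojiang supergroup         77 2025-09-24 05:56 "
    "/user/zhihaojiang/grep/input/input.txt"
)
print(entry["replication"], entry["size_bytes"], entry["path"])
```

Lines such as "Found 1 items" are not entries and raise ValueError, so filter them out before calling the helper.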

Next, let's test the official Grep example:

# Replace the paths with your own
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep \
  /user/zhihaojiang/grep/input \
  /user/zhihaojiang/grep/output \
  'dfs[a-z.]*'

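It prints a lot of output, but you only need to see:

2025-09-24 06:24:24,842 INFO mapreduce.Job: Job job_1758694801292_0002 completed successfully

You can also run:

zhihaojiang@linux-24-10:~/hadoop/etc/hadoop$ hdfs dfs -cat /user/zhihaojiang/grep/output/part-r-00000

If completed successfully appears, the job ran successfully. You can also refresh the page on port 8088 in the browser: if the job shows up there as in the yellow box of the original screenshot, it ran successfully.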
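To get a feel for what the job should find, you can apply the same regular expression to the sample file locally. This is only a local approximation in Python, not how the Hadoop example is implemented; the function name is hypothetical, and the sample text is the input.txt content from earlier in this post.

```python
import re
from collections import Counter

# Same pattern that is passed to the Grep example on the command line.
PATTERN = re.compile(r"dfs[a-z.]*")


def count_matches(text):
    """Count every match of PATTERN, line by line."""
    counts = Counter()
    for line in text.splitlines():
        counts.update(PATTERN.findall(line))
    return counts


sample = """This is a dfs test.
Another line with dfs.
No match here.
dfs appears again.
"""

# Most frequent match first.
for match, count in count_matches(sample).most_common():
    print(count, match)
# prints:
# 2 dfs
# 1 dfs.
```

Comparing these counts with the contents of part-r-00000 is a quick sanity check that the job processed the file you uploaded.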

However, running the official Grep example above can run into some problems.

Problem 1

For example, after running it you may see:

zhihaojiang@linux-24-10:~$ hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep \
  /user/zhihaojiang/grep/input \
  /user/zhihaojiang/grep/output \
  'dfs[a-z.]*'
2025-09-24 06:11:23,729 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2025-09-24 06:11:24,049 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /172.16.79.129:8032
2025-09-24 06:11:24,396 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/zhihaojiang/.staging/job_1758691653973_0001
2025-09-24 06:11:24,669 INFO input.FileInputFormat: Total input files to process : 1
2025-09-24 06:11:24,723 INFO mapreduce.JobSubmitter: number of splits:1
2025-09-24 06:11:24,829 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1758691653973_0001
2025-09-24 06:11:24,829 INFO mapreduce.JobSubmitter: Executing with tokens: []
2025-09-24 06:11:24,934 INFO conf.Configuration: resource-types.xml not found
2025-09-24 06:11:24,934 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2025-09-24 06:11:25,462 INFO impl.YarnClientImpl: Submitted application application_1758691653973_0001
2025-09-24 06:11:25,499 INFO mapreduce.Job: The url to track the job: http://linux-24-10:8088/proxy/application_1758691653973_0001/
2025-09-24 06:11:25,499 INFO mapreduce.Job: Running job: job_1758691653973_0001
2025-09-24 06:11:30,609 INFO mapreduce.Job: Job job_1758691653973_0001 running in uber mode : false
2025-09-24 06:11:30,612 INFO mapreduce.Job:  map 0% reduce 0%
2025-09-24 06:11:30,648 INFO mapreduce.Job: Job job_1758691653973_0001 failed with state FAILED due to: Application application_1758691653973_0001 failed 2 times due to AM Container for appattempt_1758691653973_0001_000002 exited with  exitCode: 1
Failing this attempt.Diagnostics: [2025-09-24 06:11:29.653]Exception from container-launch.
Container id: container_1758691653973_0001_02_000001
Exit code: 1

[2025-09-24 06:11:29.674]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your <HADOOP_HOME>/etc/hadoop/mapred-site.xml contains the below configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

[2025-09-24 06:11:29.675]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your <HADOOP_HOME>/etc/hadoop/mapred-site.xml contains the below configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: http://linux-24-10:8088/cluster/app/application_1758691653973_0001 Then click on links to logs of each attempt.
. Failing the application.
2025-09-24 06:11:30,675 INFO mapreduce.Job: Counters: 0
2025-09-24 06:11:30,707 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /172.16.79.129:8032
2025-09-24 06:11:30,731 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/zhihaojiang/.staging/job_1758691653973_0002
2025-09-24 06:11:30,836 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/zhihaojiang/.staging/job_1758691653973_0002
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://172.16.79.129:9000/user/zhihaojiang/grep-temp-212921694
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:340)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:279)
	at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:404)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1678)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1675)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1675)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1696)
	at org.apache.hadoop.examples.Grep.run(Grep.java:94)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
	at org.apache.hadoop.examples.Grep.main(Grep.java:103)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.util.RunJar.run(RunJar.java:328)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:241)
Caused by: java.io.IOException: Input path does not exist: hdfs://172.16.79.129:9000/user/zhihaojiang/grep-temp-212921694
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313)
	... 29 more

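This happens because the Hadoop MapReduce job cannot start its ApplicationMaster. The root cause is that the environment variable HADOOP_MAPRED_HOME is not configured correctly. The InvalidInputException at the end of the log appears to be a follow-on error: the example submits a second job that reads the temporary output (grep-temp-212921694) of the first job, which was never written.
The error message is:

Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Solution

cd ~/hadoop/etc/hadoop

Check whether mapred-site.xml already exists:

ls mapred-site.xml

Edit the file:

nano mapred-site.xml

Add the key configuration
Inside the <configuration> tag, add the following 3 properties: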

<configuration>
  <!-- other existing configuration... -->

  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/home/zhihaojiang/hadoop</value>
  </property>

  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/home/zhihaojiang/hadoop</value>
  </property>

  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/home/zhihaojiang/hadoop</value>
  </property>

</configuration>

Note that the path must be an absolute path.
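Before copying the file around, it can help to check it mechanically. The sketch below is an assumption-laden helper of my own (the name check_mapred_home is hypothetical): it parses the XML with Python's standard library and reports any of the three properties that is missing or whose HADOOP_MAPRED_HOME is not an absolute path.

```python
import xml.etree.ElementTree as ET

REQUIRED = (
    "yarn.app.mapreduce.am.env",
    "mapreduce.map.env",
    "mapreduce.reduce.env",
)


def check_mapred_home(xml_text):
    """Return a list of problems; an empty list means the three properties look fine."""
    root = ET.fromstring(xml_text)
    props = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    problems = []
    for name in REQUIRED:
        value = props.get(name)
        if value is None:
            problems.append(name + " is missing")
            continue
        # The value may hold several comma-separated KEY=VALUE pairs.
        env = dict(item.strip().split("=", 1)
                   for item in value.split(",") if "=" in item)
        home = env.get("HADOOP_MAPRED_HOME", "")
        if not home.startswith("/"):
            problems.append(name + ": HADOOP_MAPRED_HOME is not an absolute path")
    return problems


sample_xml = """<configuration>
  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/home/zhihaojiang/hadoop</value>
  </property>
  <property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/home/zhihaojiang/hadoop</value>
  </property>
  <property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/home/zhihaojiang/hadoop</value>
  </property>
</configuration>"""

print(check_mapred_home(sample_xml))  # prints [] when nothing is wrong
```

For a real file, read its contents first (for example with open(path).read()) and pass the text in. The check does not verify that the directory actually exists on each node.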

Sync the configuration to all nodes

# My worker nodes are named linux-24-10-node2 and linux-24-10-node4
scp ~/hadoop/etc/hadoop/mapred-site.xml linux-24-10-node2:~/hadoop/etc/hadoop/
scp ~/hadoop/etc/hadoop/mapred-site.xml linux-24-10-node4:~/hadoop/etc/hadoop/

Restart YARN

# Run on the master node
stop-yarn.sh
start-yarn.sh

Re-run the Grep example

# First delete the old output directory (if it exists)
hdfs dfs -rm -r /user/zhihaojiang/grep/output

# Re-run
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep \
  /user/zhihaojiang/grep/input \
  /user/zhihaojiang/grep/output \
  'dfs[a-z.]*'

Problem 2

zhihaojiang@linux-24-10:~/hadoop/etc/hadoop$ hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep \
  /user/zhihaojiang/grep/input \
  /user/zhihaojiang/grep/output \
  'dfs[a-z.]*'
2025-09-24 06:15:54,235 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2025-09-24 06:15:54,535 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /172.16.79.129:8032
2025-09-24 06:15:54,836 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/zhihaojiang/.staging/job_1758694526692_0001
2025-09-24 06:15:55,078 INFO input.FileInputFormat: Total input files to process : 1
2025-09-24 06:15:55,133 INFO mapreduce.JobSubmitter: number of splits:1
2025-09-24 06:15:55,215 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1758694526692_0001
2025-09-24 06:15:55,215 INFO mapreduce.JobSubmitter: Executing with tokens: []
2025-09-24 06:15:55,339 INFO conf.Configuration: resource-types.xml not found
2025-09-24 06:15:55,339 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2025-09-24 06:15:55,726 INFO impl.YarnClientImpl: Submitted application application_1758694526692_0001
2025-09-24 06:15:55,768 INFO mapreduce.Job: The url to track the job: http://linux-24-10:8088/proxy/application_1758694526692_0001/
2025-09-24 06:15:55,768 INFO mapreduce.Job: Running job: job_1758694526692_0001
2025-09-24 06:16:00,853 INFO mapreduce.Job: Job job_1758694526692_0001 running in uber mode : false
2025-09-24 06:16:00,857 INFO mapreduce.Job:  map 0% reduce 0%
2025-09-24 06:16:01,920 INFO mapreduce.Job: Task Id : attempt_1758694526692_0001_m_000000_0, Status : FAILED
Container launch failed for container_1758694526692_0001_01_000002 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:163)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

2025-09-24 06:16:03,974 INFO mapreduce.Job: Task Id : attempt_1758694526692_0001_m_000000_1, Status : FAILED
Container launch failed for container_1758694526692_0001_01_000003 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:163)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

2025-09-24 06:16:06,012 INFO mapreduce.Job: Task Id : attempt_1758694526692_0001_m_000000_2, Status : FAILED
Container launch failed for container_1758694526692_0001_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:163)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

2025-09-24 06:16:09,053 INFO mapreduce.Job:  map 100% reduce 100%
2025-09-24 06:16:10,090 INFO mapreduce.Job: Job job_1758694526692_0001 failed with state FAILED due to: Task failed task_1758694526692_0001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0

2025-09-24 06:16:10,179 INFO mapreduce.Job: Counters: 10
	Job Counters 
		Failed map tasks=4
		Killed reduce tasks=1
		Launched map tasks=4
		Other local map tasks=3
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=2
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=2
		Total vcore-milliseconds taken by all map tasks=2
		Total megabyte-milliseconds taken by all map tasks=2048
2025-09-24 06:16:10,203 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /172.16.79.129:8032
2025-09-24 06:16:10,251 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/zhihaojiang/.staging/job_1758694526692_0002
2025-09-24 06:16:10,287 INFO input.FileInputFormat: Total input files to process : 0
2025-09-24 06:16:10,723 INFO mapreduce.JobSubmitter: number of splits:0
2025-09-24 06:16:10,793 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1758694526692_0002
2025-09-24 06:16:10,793 INFO mapreduce.JobSubmitter: Executing with tokens: []
2025-09-24 06:16:11,021 INFO impl.YarnClientImpl: Submitted application application_1758694526692_0002
2025-09-24 06:16:11,029 INFO mapreduce.Job: The url to track the job: http://linux-24-10:8088/proxy/application_1758694526692_0002/
2025-09-24 06:16:11,029 INFO mapreduce.Job: Running job: job_1758694526692_0002
2025-09-24 06:16:19,174 INFO mapreduce.Job: Job job_1758694526692_0002 running in uber mode : false
2025-09-24 06:16:19,182 INFO mapreduce.Job:  map 0% reduce 0%
2025-09-24 06:16:22,264 INFO mapreduce.Job: Task Id : attempt_1758694526692_0002_r_000000_0, Status : FAILED
Container launch failed for container_1758694526692_0002_01_000002 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:163)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

2025-09-24 06:16:24,312 INFO mapreduce.Job: Task Id : attempt_1758694526692_0002_r_000000_1, Status : FAILED
Container launch failed for container_1758694526692_0002_01_000003 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:163)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

2025-09-24 06:16:27,344 INFO mapreduce.Job: Task Id : attempt_1758694526692_0002_r_000000_2, Status : FAILED
Container launch failed for container_1758694526692_0002_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateExceptionImpl(SerializedExceptionPBImpl.java:171)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.instantiateException(SerializedExceptionPBImpl.java:182)
	at org.apache.hadoop.yarn.api.records.impl.pb.SerializedExceptionPBImpl.deSerialize(SerializedExceptionPBImpl.java:106)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:163)
	at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:394)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

2025-09-24 06:16:31,375 INFO mapreduce.Job:  map 0% reduce 100%
2025-09-24 06:16:32,399 INFO mapreduce.Job: Job job_1758694526692_0002 failed with state FAILED due to: Task failed task_1758694526692_0002_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1 killedMaps:0 killedReduces: 0

2025-09-24 06:16:32,439 INFO mapreduce.Job: Counters: 7
	Job Counters 
		Failed reduce tasks=4
		Launched reduce tasks=4
		Total time spent by all maps in occupied slots (ms)=0
		Total time spent by all reduces in occupied slots (ms)=2
		Total time spent by all reduce tasks (ms)=2
		Total vcore-milliseconds taken by all reduce tasks=2
		Total megabyte-milliseconds taken by all reduce tasks=2048

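The error message is:

org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist

This is caused by YARN not being configured for the MapReduce shuffle service.
YARN needs to know how to provide the shuffle service for MapReduce tasks; the shuffle carries data from the map phase to the reduce phase.
The service is called mapreduce_shuffle, and it must be enabled explicitly in yarn-site.xml on every NodeManager node.
So we need to configure yarn-site.xml.

Edit yarn-site.xml on the master node

cd ~/hadoop/etc/hadoop
nano yarn-site.xml

Inside the <configuration> tag, add the following properties: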

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
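As a quick self-check of the edited file, the sketch below looks for the two properties shown above. It is a hypothetical helper written for this post, not a Hadoop tool; note that yarn.nodemanager.aux-services is treated as a comma-separated list, so other auxiliary services may sit next to mapreduce_shuffle.

```python
import xml.etree.ElementTree as ET

AUX_SERVICES = "yarn.nodemanager.aux-services"
SHUFFLE_CLASS_KEY = "yarn.nodemanager.aux-services.mapreduce_shuffle.class"
SHUFFLE_CLASS = "org.apache.hadoop.mapred.ShuffleHandler"


def shuffle_problems(xml_text):
    """Return a list of problems with the shuffle settings; empty means they match this post."""
    root = ET.fromstring(xml_text)
    props = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    problems = []
    services = [s.strip() for s in props.get(AUX_SERVICES, "").split(",") if s.strip()]
    if "mapreduce_shuffle" not in services:
        problems.append(AUX_SERVICES + " does not list mapreduce_shuffle")
    if props.get(SHUFFLE_CLASS_KEY) != SHUFFLE_CLASS:
        problems.append(SHUFFLE_CLASS_KEY + " is not set to " + SHUFFLE_CLASS)
    return problems


yarn_site = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>"""

print(shuffle_problems(yarn_site))  # prints [] when both properties are present
```

Running the same check against the copy on each worker node would catch a node where the file was never synced.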

Sync the configuration to all worker nodes

# Copy to the two worker nodes (use your actual hostnames or IPs)
scp yarn-site.xml linux-24-10-node2:~/hadoop/etc/hadoop/
scp yarn-site.xml linux-24-10-node4:~/hadoop/etc/hadoop/

# Or use IPs
# scp yarn-site.xml zhihaojiang@172.16.79.100:~/hadoop/etc/hadoop/
# scp yarn-site.xml zhihaojiang@172.16.79.120:~/hadoop/etc/hadoop/

Restart the YARN service

# Run on the master node
stop-yarn.sh
start-yarn.sh

Verify that the NodeManager loaded the shuffle service

# Look at the latest NodeManager log
ls -lt ~/hadoop/logs/ | grep nodemanager
cat ~/hadoop/logs/hadoop-zhihaojiang-nodemanager-*.log | grep -i shuffle

This prints a lot of output. You only need to find a line containing:

Registered auxiliary service mapreduce_shuffle, service class org.apache.hadoop.mapred.ShuffleHandler

which means the shuffle service was loaded.
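Since the grep output is long, the search for that one line can be reduced to a yes/no answer. A minimal sketch, assuming the log contains the marker text exactly as quoted above; both function names are hypothetical.

```python
# Marker text copied from the log line quoted in this post.
MARKER = "Registered auxiliary service mapreduce_shuffle"


def shuffle_registered(log_lines):
    """Return True if any log line reports that mapreduce_shuffle was registered."""
    return any(MARKER in line for line in log_lines)


def shuffle_registered_in_file(path):
    """Same check for a log file on disk; undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return shuffle_registered(handle)


example_lines = [
    "INFO some.other.Component: unrelated startup message",
    "Registered auxiliary service mapreduce_shuffle, "
    "service class org.apache.hadoop.mapred.ShuffleHandler",
]
print(shuffle_registered(example_lines))  # prints True
```

Older log files may still contain the marker from a previous start, so check the most recent NodeManager log on each node.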

Re-run the Grep example

# First delete the old output directory (if it exists)
hdfs dfs -rm -r /user/zhihaojiang/grep/output

# Re-run
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep \
  /user/zhihaojiang/grep/input \
  /user/zhihaojiang/grep/output \
  'dfs[a-z.]*'

智浩的Blog

Hadoop Deployment: Verifying That the Distributed Setup Works
