Metadata checkpointing in HDFS is done by the Secondary NameNode, which periodically merges the fsimage and the edits log files to keep the edits log size within a limit. For various reasons, checkpointing by the Secondary NameNode may fail. For example, the Secondary NameNode log may show errors like the following.

2017-08-06 10:54:14,488 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
  LV = -63 namespaceID = 1920275013 cTime = 0 ; clusterId = CID-f38880ba-3415-4277-8abf-b5c2848b7a63 ; blockpoolId = BP-578888813-10.6.1.2-1497278556180.
  Expecting respectively: -63; 263120692; 0; CID-d22222fd-e28a-4b2d-bd2a-f60e1f0ad1b1; BP-622207878-10.6.1.2-1497242227638.
  at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:134)
  at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:531)
  at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:395)
  at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
  at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
  at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
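
The exception above reports that the namespaceID, clusterId and blockpoolId known to the two sides no longer match. One way to confirm this is to compare the VERSION files kept in each storage directory. The helper below is a minimal sketch, assuming the usual `current/VERSION` file with plain key=value lines; the function name and the example path are illustrative, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Print the value of one key (e.g. namespaceID, clusterID, blockpoolID)
# from a Hadoop storage VERSION file, which holds plain key=value lines.
version_field() {
    local file="$1" key="$2"
    # Match only lines that start with "<key>=" and print what follows.
    sed -n "s/^${key}=//p" "$file"
}

# Example (hypothetical path; run on the Secondary NameNode host):
# version_field /home/hadoop/tmp/dfs/namesecondary/current/VERSION clusterID
```

Running the same call against the NameNode's storage directory and comparing the two values shows whether the IDs have diverged.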

This post shows how to force a metadata checkpoint in HDFS.

Step one: Save the latest HDFS metadata to the fsimage by the NameNode

On the NameNode, as the HDFS superuser (e.g. the user that runs the HDFS daemons), save the latest metadata to the fsimage by running the following commands:

$ hdfs dfsadmin -safemode enter
$ hdfs dfsadmin -safemode get # to confirm and ensure it is in safemode
$ hdfs dfsadmin -saveNamespace
$ hdfs dfsadmin -safemode leave
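
The four commands above can be wrapped so that the namespace is only saved once safe mode is confirmed. This is a rough sketch, not a hardened script: the function names are made up for illustration, and it assumes that `hdfs dfsadmin -safemode get` reports the state with the wording "Safe mode is ON" or "Safe mode is OFF".

```shell
#!/usr/bin/env bash
# Succeeds when the text on stdin reports safe mode as ON.
# Assumes the "Safe mode is ON" / "Safe mode is OFF" wording
# printed by `hdfs dfsadmin -safemode get`.
safemode_is_on() {
    grep -q 'Safe mode is ON'
}

# Enter safe mode, save the namespace, and leave safe mode again.
# If safe mode cannot be confirmed, nothing is saved. If saving fails,
# the NameNode is deliberately left in safe mode for inspection.
save_namespace() {
    hdfs dfsadmin -safemode enter || return 1
    if ! hdfs dfsadmin -safemode get | safemode_is_on; then
        echo "NameNode is not in safe mode; not saving namespace" >&2
        return 1
    fi
    hdfs dfsadmin -saveNamespace || return 1
    hdfs dfsadmin -safemode leave
}
```

Calling `save_namespace` as the HDFS superuser on the NameNode runs the whole sequence.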

Step two: Clean the Secondary NameNode's old data directory

On the Secondary NameNode, as the HDFS superuser, stop the Secondary NameNode service:

$ hadoop-daemon.sh stop secondarynamenode

Use jps to make sure the secondarynamenode process is indeed stopped.
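
The jps check can also be scripted. The snippet below is a small sketch that relies only on jps printing one "<pid> <main class>" line per Java process; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Succeeds when a jps listing on stdin still contains a
# SecondaryNameNode JVM (matched as a whole word).
secondary_is_running() {
    grep -qw 'SecondaryNameNode'
}

# Usage on the Secondary NameNode host:
# if jps | secondary_is_running; then
#     echo "SecondaryNameNode is still running" >&2
# fi
```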

Find out the value of dfs.namenode.checkpoint.dir for the Secondary NameNode:

$ hdfs getconf -confKey dfs.namenode.checkpoint.dir

An example output is

file:///home/hadoop/tmp/dfs/namesecondary

Then move (rename) the directory that dfs.namenode.checkpoint.dir points to, so that the Secondary NameNode rebuilds it from scratch. For the above example, the command is

$ mv /home/hadoop/tmp/dfs/namesecondary /home/hadoop/tmp/dfs/namesecondary.old
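
The lookup and the move can be combined. The sketch below strips the file:// scheme from the configured value and moves the directory aside under a timestamped name. It assumes the property holds a single directory (it may also be a comma-separated list, which this sketch does not handle), and both the function names and the ".old.<timestamp>" suffix are illustrative choices.

```shell
#!/usr/bin/env bash
# Turn a file:// URI as printed by `hdfs getconf` into a local path.
# A value without the file:// scheme is returned unchanged.
uri_to_path() {
    printf '%s\n' "${1#file://}"
}

# Move a checkpoint directory aside under a timestamped name so that
# an earlier backup is never overwritten.
move_checkpoint_dir() {
    local dir
    dir=$(uri_to_path "$1")
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    mv "$dir" "$dir.old.$(date +%Y%m%d%H%M%S)"
}

# Usage on the Secondary NameNode host, after the service is stopped:
# move_checkpoint_dir "$(hdfs getconf -confKey dfs.namenode.checkpoint.dir)"
```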

Step three: Force an HDFS metadata checkpoint by the Secondary NameNode

Run the following command on the Secondary NameNode:

$ hdfs secondarynamenode -checkpoint force

Then start the Secondary NameNode again:

$ hadoop-daemon.sh start secondarynamenode

The Secondary NameNode should now be running and checkpointing normally again.
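
To check that the checkpoint directory was rebuilt, one option is to look for a fresh fsimage file in it. The helper below is a sketch that assumes the Hadoop 2.x layout, with `fsimage_<txid>` files and matching `.md5` checksum files under `current/`; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Print the most recently modified fsimage file in a checkpoint
# directory, or fail if there is none. The fsimage_*.md5 checksum
# files are skipped.
latest_fsimage() {
    local newest
    newest=$(ls -t "$1"/current/fsimage_* 2>/dev/null | grep -v '\.md5$' | head -n 1)
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Usage with the example directory from step two:
# latest_fsimage /home/hadoop/tmp/dfs/namesecondary
```

A file with a modification time after the forced checkpoint indicates that the merge succeeded.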
