hadoop hadoop2 distcp

Hadoop copy from cluster to cluster fails due to "Mismatch in length of source"


I want to copy data from one cluster to another. I use this command:

hadoop distcp hdfs://SOURCE-NAMENODE:9000/dir/ hdfs://DESTINATION-NAMENODE:9000/

And I get this message:

18/04/11 12:05:37 INFO mapred.CopyMapper: Copying hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 to hdfs://DESTINATION-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108
18/04/11 12:05:37 INFO mapred.RetriableFileCopyCommand: Creating temp file: hdfs://DESTINATION-NAMENODE:9000/.distcp.tmp.attempt_local2084770019_0001_m_000000_0
18/04/11 12:05:38 ERROR util.RetriableCommand: Failure in Retriable command: Copying hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 to hdfs://DESTINATION-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108
java.io.IOException: Mismatch in length of source:hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 and target:hdfs://DESTINATION-NAMENODE:9000/.distcp.tmp.attempt_local2084770019_0001_m_000000_0
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareFileLengths(RetriableFileCopyCommand.java:193)...

On the destination I see only the directories created, and none of the files.

Any ideas?


Solution

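  • That's probably because you are copying a file that is still being written to. The failing path sits under a WALs directory, which looks like a write-ahead log that is still open. For a file that is still open, the length reported for the source can differ from the number of bytes actually copied, and that is what the length check in the stack trace rejects.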
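
One way to check this before running the copy is to ask HDFS which files under the source directory are currently open for write. The sketch below is only a rough helper, not part of DistCp: the function name open_for_write is made up for illustration, and it assumes that `hdfs fsck <path> -files -openforwrite` prints one line per file starting with the path and containing the marker OPENFORWRITE for open files. It also assumes paths contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `hdfs fsck ... -files -openforwrite` output on stdin
# and prints the path (first field) of every line marked OPENFORWRITE.
# Assumption: fsck lines start with the file path and paths contain no spaces.
open_for_write() {
  awk '/OPENFORWRITE/ { print $1 }'
}
```

On the source cluster it could be used as `hdfs fsck /dir -files -openforwrite | open_for_write`, where /dir stands for the directory being copied. If it prints anything, those files are likely candidates for the length mismatch, and the copy is safer once the writer has closed them or stopped.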