HDFS-16286. Add a debug tool to verify the correctness of erasure coding on file #3593
Conversation
💔 -1 overall
This message was automatically generated.
  outputs[i].clear();
  outputs[i].limit(buffers[0].limit());
}
this.decoder.decode(inputs, erasedIndices, outputs);
I think you could make this slightly simpler by just using the encoder, rather than decoder. Simply take the data buffers and encode them to generate the parity. Then you don't need to form the missing indexes array.
@sodonnel Thanks for your review.
Yes, the encoder is better than the decoder here; fixed.
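A minimal sketch of the encoder-based check being suggested (buffer names and allocation details are illustrative, not the committed code):

  // Re-encode the data buffers to regenerate the parity, then compare it with
  // the parity actually read from the DataNodes; no erased-index array needed.
  ByteBuffer[] generatedParity = new ByteBuffer[numParityUnits];
  for (int i = 0; i < numParityUnits; i++) {
    generatedParity[i] = ByteBuffer.allocate(cellSize);
  }
  encoder.encode(dataBuffers, generatedParity);
  for (int i = 0; i < numParityUnits; i++) {
    generatedParity[i].rewind();
    parityRead[i].rewind();
    if (!generatedParity[i].equals(parityRead[i])) {
      // Parity mismatch: report this block group as corrupt.
    }
  }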
  blockReaders[i] = blockReader;
}
assert checksum != null;
int bytesPerChecksum = checksum.getBytesPerChecksum();
Why do we need to adjust the read size based on the checksum size? I think the checksums are validated automatically by the underlying block reader if checksum is enabled, so we should not need to worry about that here.
I think the adjustment based on the checksum size is just for performance, not correctness. The minimum read unit on the DataNode is the checksum chunk size, so keeping client reads aligned to it avoids wasted IO.
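Roughly, the alignment amounts to something like the following (variable names are illustrative, not the committed code):

  // Round the read length down to a whole number of checksum chunks so the
  // DataNode never has to read and verify a partial chunk on our behalf.
  int bytesPerChecksum = checksum.getBytesPerChecksum();  // e.g. 512 bytes
  int readLen = Math.min(remainingInCell, bufferCapacity);
  if (readLen > bytesPerChecksum) {
    readLen -= readLen % bytesPerChecksum;                // align down
  }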
sodonnel left a comment
This is a great addition to the EC tools in HDFS. I just have a couple of comments inline.
Also, do you think you could add a test or two for this new class? It would help catch issues if someone makes changes later.
@sodonnel Thanks for your review.
💔 -1 overall
This message was automatically generated.
DFSTestUtil.createFile(fs, new Path(ecDir, "foo_5m"), 5 * m, repl, seed);
assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_5m"})
    .contains("All EC block group status: OK"));
Could you add one more test case for a file that has multiple block groups, so we test the command looping over more than 1 block? You are using EC 3-2, so write a file that is 6MB, with a 1MB block size. That should create 2 block groups, with a length of 3MB each. Each block would then have a single 1MB EC chunk in it.
In DFSTestUtil there is a method to pass the blocksize already, so the test would be almost the same as the ones above:
public static void createFile(FileSystem fs, Path fileName, int bufferLen,
long fileLen, long blockSize, short replFactor, long seed)
Thanks, that's good advice; updated.
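For reference, a sketch of such a multi-block-group test (it assumes the same m, repl, seed, and runCmd helpers as the existing test above, so treat it as illustrative rather than the committed code):

  // EC 3-2 with a 1 MB block size and a 6 MB file should give two block
  // groups of 3 MB each, i.e. one 1 MB EC chunk per internal block.
  final long blockSize = 1024 * 1024;
  DFSTestUtil.createFile(fs, new Path(ecDir, "foo_6m"), 1024, 6 * m,
      blockSize, repl, seed);
  assertTrue(runCmd(new String[]{"verifyEC", "-file", "/ec/foo_6m"})
      .contains("All EC block group status: OK"));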
Thanks for the update @cndaimin - There is just one style issue detected and I have one suggestion about adding another test case inside your existing test. Aside from that, I think this change looks good.
@sodonnel Thanks for your review.
🎊 +1 overall
This message was automatically generated.
sodonnel left a comment
+1 LGTM. I will commit this shortly.
@cndaimin I was about to commit this, and I remembered we should update the documentation to include this command. The documentation is in a markdown file in the repo and gets published with the release. Would you be able to add a section for this new command under the Debug_Commands section, please?
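For illustration, an entry under the debug commands section might look roughly like this (a sketch based on the command exercised in the tests, not necessarily the committed documentation text):

  ### verifyEC
  Usage: hdfs debug verifyEC -file <file>
  Verify the correctness of erasure coding on an erasure coded file.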
@sodonnel Thanks, the documentation file is updated.
Thanks, looks good. I will commit when the CI checks come back.
💔 -1 overall
This message was automatically generated.
… erasure coding on file (apache#3593)
(cherry picked from commit a21895a)
(cherry picked from commit 2844b98)
Conflicts:
    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DebugAdmin.java
Change-Id: If29f2fce5ac3c03b94a6065086a6c700aaf88c8a
…ctness of erasure coding on file (apache#3593)
(cherry picked from commit a21895a)
This patch is tested with the hdfs debug tool.
Check a good file:
Check a bad file:
Help message:
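For reference, the invocation exercised by the tests above looks along these lines, with "All EC block group status: OK" reported for a healthy file:

  hdfs debug verifyEC -file /ec/foo_5m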