roconnor-blockstream
Collaborator

Includes an executable program codex32 whose command correct implements the codex32 error correction algorithm.

@roconnor-blockstream roconnor-blockstream marked this pull request as draft April 4, 2025 18:14
@roconnor-blockstream
Collaborator Author

I've added a bit more to the help documentation.

Contributor

@BenWestgate BenWestgate left a comment

I reviewed the tests, main, codex32.hs and error.hs.

I'll try to use this to improve my brute force insert and delete correction tool this year.

I built it, but I wasn't able to pass erasures with -e 5 on the command line. I'm probably getting the syntax wrong.

A better syntax, imho, and the one I used in my brute force tool, is to assume that any '?' in the codex32 string passed in is an erasure. It seems clunky to specify erasure locations individually.

| len < 48 = failWith $ metaLength ++ " too short."
| 127 < len = failWith $ metaLength ++ " too long."
| specDataLength spec < 6 + payloadLength = failWith $ metaLength ++ " too long for " ++ show (length residue) ++ " character " ++ metaResidue ++ "."
| 15 == length residue && len < 99 = failWith $ metaLength ++ " too short for " ++ show (length residue) ++ " character " ++ metaResidue ++ "."
Contributor

Don't you need the equivalent check, 13 == length residue && len > 74 (I think)?
There's a no man's land of lengths between the two checksum types where the data isn't valid. (A good reason to change BIP93 to support fewer lengths, perhaps the BIP39 ones and 512.)
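
The gap between the two checksum types can be made explicit with a small classifier. This is only a sketch: `ChecksumKind` and `classifyLength` are hypothetical names, not part of the PR. The 48, 99 and 127 bounds are taken from the guards quoted above. The upper bound of 96 for the 13-character checksum is an assumption based on one reading of BIP93 (it is not the 74 floated above), so check it against the spec before relying on it.

```haskell
-- Hypothetical helper, not the PR's code: which checksum a codex32 string
-- of a given total length (including the "ms1" prefix) would carry.
data ChecksumKind = Short13 | Long15
  deriving (Eq, Show)

classifyLength :: Int -> Either String ChecksumKind
classifyLength len
  | len < 48   = Left "too short"
  | len <= 96  = Right Short13  -- ASSUMPTION: 96 is the longest short-checksum string
  | len < 99   = Left "falls between the short and long checksum ranges"
  | len <= 127 = Right Long15
  | otherwise  = Left "too long"
```

Under these assumptions exactly two total lengths, 97 and 98, land in the no man's land.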

Collaborator Author

With fresh eyes, I've rearchitected how the advanced command line arguments are transformed into the specification data needed for error correction.

bitsize = bytesize * 8
failWith str = Opt.handleParseResult . Opt.Failure $ Opt.parserFailure codex32Prefs codex32Options (Opt.ErrorMsg str) [Opt.Context "correct" codex32CorrectParser]
result = errorCorrections (optSpec options) erasureIxs residue
format Nothing = putStrLn "Too many errors. Unable to correct." >> Sys.exitFailure
Contributor

This is a bit pessimistic. If erasure locations were provided, the user guessing likely values for some of them could put a correction within reach. "Too many errors and erasures." would remind the user to try specifying fewer erasures if possible.

Collaborator Author

I still need to address this.

Contributor

@BenWestgate BenWestgate Oct 18, 2025

Pessimistic because our wallets.md tells developers to automatically replace invalid characters with '?' in many cases: non-numeric threshold, non-bech32 character, non-"s" index if k=0, repeated share indices (after user affirms they aren't repeating an already entered share).

Any of these silently placed '?' could probably be filled by more carefully reading or typing or cross referencing another share. Additionally, if the identifier was the default bip32 master fingerprint, those erasures are easily filled out of band.

I suggest:

"Too many errors and erasures. Replace some erasures with your best guesses and try again."

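One way the suggested wording could be wired in is to pick the message based on whether any erasure locations were supplied. This is a minimal sketch, not the PR's code: `failureMessage` is a hypothetical name, and `erasureIxs` is borrowed from the snippet above on the assumption that it is a list of erasure positions.

```haskell
-- Hypothetical helper: choose the failure text depending on whether the
-- user supplied any erasure locations.
failureMessage :: [Int] -> String
failureMessage erasureIxs
  | null erasureIxs = "Too many errors. Unable to correct."
  | otherwise =
      "Too many errors and erasures. "
        ++ "Replace some erasures with your best guesses and try again."
```

With no erasures the original message is kept, so users who never typed a '?' are not told to replace something they did not enter.
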
@roconnor-blockstream
Collaborator Author

roconnor-blockstream commented Oct 17, 2025

I built it, but I wasn't able to pass erasures with -e 5 on the command line. I'm probably getting the syntax wrong.

Usage: codex32 correct (CODEX32_STRING | --len LENGTH [-e ERASURE_LOCATION] RESIDUE)

You have to pass the --len argument before -e arguments. e.g. --len 48 for 128 bit secrets.

Maybe I should add a choice between specifying the string length or the number of bits in the secret.

Edit: Most definitely the help text needs to illustrate the two different modes of operation and note that you can use ? to mark erasures in the simple mode.
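
To illustrate how the two modes relate, the '?' markers of the simple mode can be turned into a list of erasure locations like the ones passed with -e. This is a sketch under assumptions: `erasureIndices` is a hypothetical name, and counting positions from 0 at the left of the whole string is a guess. The tool's real indexing convention may differ.

```haskell
-- Hypothetical helper: positions of '?' in a codex32 string.
-- ASSUMPTION: positions are 0-based, counted from the left of the full string.
erasureIndices :: String -> [Int]
erasureIndices s = [i | (i, c) <- zip [0 ..] s, c == '?']
```

For example, `erasureIndices "ms1?abc?"` gives `[3,7]` under this convention.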

@roconnor-blockstream roconnor-blockstream marked this pull request as draft October 17, 2025 22:00
@roconnor-blockstream
Collaborator Author

I converted it to draft because I need to squash my changes, and I want to fix up a few more things. However it is still reviewable.

@roconnor-blockstream
Collaborator Author

roconnor-blockstream commented Oct 17, 2025

A better syntax, imho, and the one I used in my brute force tool, is to assume that any '?' in the codex32 string passed in is an erasure. It seems clunky to specify erasure locations individually.

In the simple error correction mode, you just use '?' for erasures, as you imagine.

The advanced mode is for when you are doing paper computing, you got a checksum error, and you don't want to tell your computer what your share is. In the advanced mode you only tell the computer the incorrect 13 (or 15!) character residue that you computed, then you tell the computer where you think erasures might be, and the computer blindly tells you how to correct your string without ever needing access to your share data.

The advanced mode is a way of doing error correction computations while minimizing the information input into a computer. Under the assumption that all errors are equally likely, the computer learns no information about your secret share. However, that assumption probably isn't true, and in practice tiny fractions of a bit of information might in theory be inferred by a malicious computer about the characters at the locations where errors were found.

@BenWestgate
Contributor

I built it, but I wasn't able to pass erasures with -e 5 on the command line. I'm probably getting the syntax wrong.

Usage: codex32 correct (CODEX32_STRING | --len LENGTH [-e ERASURE_LOCATION] RESIDUE)

You have to pass the --len argument before -e arguments. e.g. --len 48 for 128 bit secrets.

Maybe I should add a choice between specifying the string length or the number of bits in the secret.

Edit: Most definitely the help text needs to illustrate the two different modes of operation and note that you can use ? to mark erasures in the simple mode.

In wallets.md we wrote:

ECWs MAY assume the correct length is the closest of 48 or 74.

I've since updated the guidance to say that generating lengths other than 128, 256 or 512 is "NOT RECOMMENDED", and that supporting their import is also "NOT RECOMMENDED". We can then strengthen this to: ECWs SHOULD assume the correct length is the closest of 48, 74 or (whatever length 512 is).

ECWs MAY support weird lengths like 17- or 31-byte seeds and SHOULD import them if the checksum passes, but MAY error correct them first as if they were 16- or 32-byte seeds. Essentially, there is no error correction at all for the weird lengths: assume they're mistranscriptions of 16- and 32-byte seeds respectively, and only test the original length (and others nearby) if there are no valid corrections within edit distance limits for the 16- and 32-byte candidates.

This was originally in regards to insert/delete correction, but it seems it applies just as well to erasure and error correction in your implementation. Fixing a delete is just a matter of trying 48 erasure positions.
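
The "closest length" rule and the delete-as-erasure idea can be sketched together. Nothing here is from the PR or from wallets.md: `closestLength` and `deletionCandidates` are hypothetical names, and the actual error correction step that would consume the candidates is left out.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical helper: pick the target length nearest to the observed one.
-- The target list must be non-empty.
closestLength :: [Int] -> Int -> Int
closestLength targets len = minimumBy (comparing (\t -> abs (t - len))) targets

-- Hypothetical helper: every way to re-insert one missing character as an
-- erasure marker. A string of n characters yields n + 1 candidates.
deletionCandidates :: String -> [String]
deletionCandidates s = [take i s ++ "?" ++ drop i s | i <- [0 .. length s]]
```

A 47-character input is closest to 48, and `deletionCandidates` then produces the 48 erasure positions mentioned above, each of which could be handed to the simple mode.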

@BenWestgate
Contributor

BenWestgate commented Oct 18, 2025

You have to pass the --len argument before -e arguments. e.g. --len 48 for 128 bit secrets.

Maybe I should add a choice between specifying the string length or the number of bits in the secret.

Eliminate the string length parameter entirely.

It is specialist knowledge. I barely remember the length of 32- or 64-byte codex32 strings off the top of my head, and I wrote a codex32 PyPI package and, years ago, a codex32 error correcting wallet. We shouldn't expect users to know. wallets.md says "error correcting wallets MAY assume the correct length is the closest of..." So do this; otherwise let them specify a seed length in bytes, where only whole bytes are valid.

There should be no string length parameter, as there are two invalid lengths in the middle. Prevent user mistakes. If they have to count the characters, they definitely don't know their seed byte length, so we should assume it for them.
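
If the tool took a seed length in bytes, the string length could be derived rather than asked for. The sketch below is not the PR's code: `stringLengthForSeedBytes` is a hypothetical name, and it assumes the BIP93 layout of a 3-character "ms1" prefix, a 6-character header, a payload of ceil(8n/5) characters, a 13-character checksum for payloads up to 74 characters and a 15-character checksum otherwise, with seeds limited to 16 through 64 bytes.

```haskell
-- Hypothetical helper: total codex32 string length for a seed of n bytes.
-- ASSUMPTIONS: "ms1" prefix (3), header (6), payload = ceil(8n / 5),
-- checksum of 13 characters if payload <= 74, else 15; n within 16..64.
stringLengthForSeedBytes :: Int -> Maybe Int
stringLengthForSeedBytes n
  | n < 16 || n > 64 = Nothing
  | otherwise = Just (3 + 6 + payloadChars + checksumChars)
  where
    payloadChars = (8 * n + 4) `div` 5 -- ceiling of 8n / 5
    checksumChars = if payloadChars <= 74 then 13 else 15
```

Because only whole bytes are accepted, the two invalid string lengths in the middle can never be produced by this mapping.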

@BenWestgate
Contributor

BenWestgate commented Oct 18, 2025

The advanced mode is a way of doing error correction computations while minimizing the information input into a computer.

I will probably need to use it this way in my wallet, because it runs on a restricted platform, Tails OS, where users can't by default install the packages needed to build Haskell code. I would ship a signed binary and then not hand it secret data during error correction.

the computer blindly tells you how to correct your string without ever needing access to your share data.

It's also possible to one-time-pad encrypt the string with a random valid string and pass that to simple mode. I'm not sure which is easier by hand, assuming the encryption string is generated on a different offline PC. Do you know? That may affect whether "advanced mode" is useful or not.
