mmuetzel
Member

There can be legit reasons why the sha256 checksum of a version can be empty. E.g., that version could track a "rolling release" of a package.

In most cases, an empty sha256 checksum is probably an oversight that should be fixed. So, emit a warning if it is empty, but continue testing the package.
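
The behavior described above could be sketched roughly as follows. This is a minimal illustration, not the actual CI code: the helper name `classify_checksum` and its return values are hypothetical, and the real script reports errors through its own functions.

```octave
1;  # Mark this file as a script so the function below can be defined inline.

## Hypothetical helper: compare the checksum declared in the index with the
## checksum computed from the downloaded package file.
function status = classify_checksum (declared, actual)
  if (isempty (declared))
    ## Proposed behavior: warn, but let the remaining tests run.
    warning ("sha256 checksum is empty; skipping the checksum comparison");
    status = "skipped";
  elseif (! strcmp (declared, actual))
    ## A declared but wrong checksum is still treated as a failure.
    status = "mismatch";
  else
    status = "ok";
  endif
endfunction

disp (classify_checksum ("", "abc123"));
```

Only the empty case is relaxed in this sketch; a checksum that is set but wrong is still reported as a mismatch.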

@pr0m1th3as
Member

pr0m1th3as commented Aug 29, 2025

I suggest we check the package id, as in:

if (strcmp ("dev", pkg_version.id) && isempty (pkg_version.sha256))
  ## Do nothing, or emit a message for verbosity, such as:
  disp ("sha256 checksum is ignored for dev versions");
elseif (! strcmp (sha256_sum, pkg_version.sha256))
  step_error (sprintf (["Package checksum error:\n", ...
      "\n\tFile: %s", ...
      "\n\tExpected: '%s'", ...
      "\n\tBut got:  '%s'\n"], pkg_file, pkg_version.sha256, sha256_sum));
  exit (1);  # Return test failed.
else
  disp (["sha256 checksum ok: '", sha256_sum, "'"]);
endif

This way, empty sha256 strings will only be allowed for dev versions.

@mmuetzel
Member Author

Hmm...

Should "rolling releases" be limited to the name "dev"?
We haven't had that case yet. But maybe there could be packages that like to put out "rolling releases" for their stable branch as some kind of pre-release for testing (while their development branch is already moving further on). Those could potentially have any name.
That is just one example that I could think of when put on the spot. There might be more cases where different "rolling releases" might be used...

@pr0m1th3as
Member

pr0m1th3as commented Aug 29, 2025

In @siko1056 's example for a yaml file, only the dev tag is mentioned. So, I would expect that maintainers who want a rolling release would stick to the given example and use the dev tag. Restricting development versions to a single tag makes things simpler and easier for future maintenance.

Furthermore, leaving it unrestricted makes it possible to publish properly versioned packages without a sha256 checksum and get away with just a warning. This opens dark doors from a security perspective. At least, when restricting this to dev, users are notified that this is a development version and they can be cautious about it.

@mmuetzel
Member Author

mmuetzel commented Aug 29, 2025

Could you please elaborate on which "dark doors" would open?
Afaict, the SHA256 checksum is only displayed on the package index for information purposes.
The CI tries to make sure that that information is correct (if it is shown)*. It is up to a user to check whether the tarball they downloaded actually matches that checksum. I'm not sure how many users are actually doing that manual check. (I'd guess not many.)

* This PR would add that condition to the CI run.

@pr0m1th3as
Member

Your patch allows me to upload statistics version 1.7.6 without any sha256 checksum. If I leave it empty, the CI will just emit a warning and I can go ahead and merge. The user does not know that. The user simply types pkg install -forge statistics and gets the latest version, which will display as 1.7.6.

It is exactly because most users don't check the checksum that they will see a proper version number and assume everything is ok. Whereas, if restricted to dev, the package will have to install in Octave with a dev id.

@mmuetzel
Member Author

True.
But displaying the hash on the package index page doesn't actually change anything for users that install the package with pkg install -forge.

Even if a sha256 checksum is displayed on the package index, a malicious player can still exchange the tarball after the CI ran here. All users that install the package, e.g., with pkg install -forge statistics would install with that changed tarball without noticing that anything changed.
Only users that manually download the tarball, then compare its SHA256 checksum with the one that is displayed on the website and then install from the downloaded tarball with pkg install statistics-1.7.5.tar.gz might notice that the tarball was changed after the original upload.
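
The manual check described above could look roughly like this in Octave. It is a sketch under assumptions: the helper name `tarball_matches` is hypothetical, and the expected checksum has to be copied by hand from the package index. The demo hashes a small temporary file instead of a real tarball so that it runs on its own.

```octave
1;  # Mark this file as a script so the function below can be defined inline.

## Hypothetical helper: true if the SHA256 hash of the file matches the
## expected hex string (compared case-insensitively).
function ok = tarball_matches (tarball, expected_sha256)
  actual = hash ("sha256", fileread (tarball));
  ok = strcmpi (actual, strtrim (expected_sha256));
endfunction

## Self-contained demo: a file containing "abc" has a well-known SHA256 hash.
tmp = tempname ();
fid = fopen (tmp, "w");
fputs (fid, "abc");
fclose (fid);
expected = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
disp (tarball_matches (tmp, expected));
unlink (tmp);
```

For a real package, the downloaded tarball (for example statistics-1.7.5.tar.gz) and the checksum shown on the index would take the place of the temporary file and the "abc" hash, and pkg install statistics-1.7.5.tar.gz would only be run if the helper returns true.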

I'd assume that most users don't do these manual steps. Instead, I guess that most users install with pkg install -forge statistics. If they do that, the SHA256 checksum that is displayed on the website is not taken into account anywhere.

If you prefer to not provide that SHA256 checksum with the initial upload of the package, you can do that without affecting the vast majority of users in any way. Only users that prefer to download the tarball manually wouldn't know with which checksum they should compare. For anyone else, nothing changes.

So, I don't see where "dark doors" open if the CI ignores the checksum if it is empty in the index.

That step in the CI rules still makes sense to make sure that the sha256 checksum is correct if it is set in the .yaml file. Otherwise, that could confuse users that actually download the tarball manually and check its SHA256 hash manually.

@pr0m1th3as
Member

It's not about displaying the sha256 checksum on the package index. Ordinary package releases must be checked by the CI, and their declared sha256 checksums must match the actual ones. This is not possible for development branches or for cases like continuous releases. My take is that such releases should all be marked as dev in the package description, because this is what gets displayed when a user types pkg list in Octave.

If a maintainer is allowed to provide a proper release name without a checksum, then this can lead to trouble, even if the maintainer does not intend it. It only takes one bad PR to go unnoticed and be merged into main before it can be installed on a user's system. The checksum is there to prevent third parties from compromising the maintainers' work. So, Octave packages without a proper release cycle should not be allowed to be named arbitrarily. This is why the dev option has been given in @siko1056 's example since the very beginning of Octave Packages on GitHub. A package versioned as dev is easier to spot and keep an eye on, because it means that you are installing whatever is in the current sources.

A released package with the checksum approved by the CI is much harder to tamper with, unless the maintainer is the bad actor. But guarding against that is not the purpose of the sha256 checksum.

@mmuetzel
Member Author

mmuetzel commented Aug 29, 2025

We are going in circles.
Given that the checksum is checked automatically only when the CI runs in PRs here, and is not checked at all when packages are installed, the checksums add very little security to the deployment chain.
Also, a correct checksum doesn't say anything about potential (accidental) packaging errors or (potentially harmful) bugs in the package. A user installing a package implicitly needs to trust whoever published the package. We decided at some point in the past that we can't make any guarantees when we accept entries in the packaging index.

Imho, we shouldn't impose arbitrary limits on how maintainers would like to present their packages here. We might want to encourage adding the checksum. But I don't see a strong reason why we need to enforce that. Putting a connection between requiring a checksum and a specific name label seems completely arbitrary to me.

But since I see that there are other opinions, I won't merge this PR and will wait for the input from others.

@siko1056
Member

siko1056 commented Sep 1, 2025

To some extent I agree with @pr0m1th3as .

The checksums are intended to ensure the integrity of the package installation: What I install today and find working should be the same as what I installed 5 years ago, and what I will install 10 years from now. Changes must be tracked as checksum changes in version control.

Due to the lack of signing, trusted keys, etc., the checksums do not extend to cryptographic signatures that would prevent "evil players" entirely.

In my opinion, a released package should be bundled once (or in a deterministic, repeatable way, like on GitHub) and deserves some sort of integrity checksum. This makes it possible to safely copy the archive to other hosting services in private intranets or to machines that do not have internet access, mostly to identify broken packages easily.

Now comes the question that is neither wrong nor right in my opinion, but depends more on what users and package maintainers ask for: Should every maintainer be forced to give a release a checksum and call it "dev" otherwise?

In my opinion yes. Rolling releases can be named after the git branch (main, dev, ...) and, as @pr0m1th3as said, a user will know that she installs something "not released" or "not repeatable" in 2 weeks or 2 years. In my experience, rolling releases are barely useful for simple users who just want to follow a blog post, where versions should be pinned for reproducibility. Pro-users should have a way to install other package tags or versions with proper understanding.

Therefore, only the topmost release ID is checked, so that old releases and dev versions are not touched.

But again this is more opinion than a final truth.

I am still unsure how to handle the case if a maintainer wants to just release a single dev release 🤷

Historically, the checksum check was a warning. It was made an error (3c572c9) after a request by a package maintainer who had slipped past that warning, and here I would like to avoid going in circles 😅

@mmuetzel
Member Author

mmuetzel commented Sep 4, 2025

I currently have little time to work on anything Octave related. When I opened this PR, I was thinking this was a small change that could be dealt with quickly. I didn't anticipate that this could be controversial.

Feel free to close or amend this PR if you prefer. It would probably take a couple of weeks before I can come back to looking into this.

@siko1056 siko1056 added the enhancement and wontfix labels Oct 14, 2025
@siko1056 siko1056 closed this Oct 14, 2025