Skip to content

Conversation

alhendrickson
Copy link
Collaborator

@alhendrickson alhendrickson commented Jun 26, 2025

MedCAT 2 Test build improved from 20 minutes to 10 minutes.

Using default caching described in https://docs.astral.sh/uv/guides/integration/github/#caching.

Added mandatory uv.lock file, it's good practice anyway. One todo is add a local dev guide, unless I'm missing it.

@alhendrickson alhendrickson marked this pull request as ready for review June 26, 2025 13:41
@mart-r
Copy link
Collaborator

mart-r commented Jun 26, 2025

Just as a note, all but the 3.9 workflows already ran in around 12 minutes. e.g:
https://github.com/CogStack/cogstack-nlp/actions/runs/15902962856/job/44850224953

But still, getting 3.9 down to the same level is definitely a win!

@alhendrickson alhendrickson merged commit 8aa249b into main Jun 26, 2025
11 checks passed
@alhendrickson alhendrickson deleted the build/add-caching branch June 26, 2025 16:30
@mart-r
Copy link
Collaborator

mart-r commented Jun 26, 2025

What purpose does committing the uv.lock serve for us?
We define our compatibility constraints in pyproject.toml. Why do we need to duplicate (some of) that information in uv.lock?

I understand if we'd be building application, it's great to pin everything to specific versions for consitency. But we're working on a library. So we don't want to pin down our dependencies - we want to give the user the flexibility of using whatever they already have within their environment as long as it fits within the constraints we've defined in pyproject.toml.

I'm not saying we can't have uv.lock in the project. I'm just not sure what the value is. And there might be value that I'm not seeing at this point in time.

With that said, if we do opt to keep it, I think we need some kind of automation to make sure it's up to date. For instance, we we need to change dependencies or there's a new python release we want to include or ignore.

@alhendrickson
Copy link
Collaborator Author

As a high level assumption - when users install your library (eg pip install medcat), it doesnt use the uv.lock file, just what you have in pyproject. If that assumption is wrong then it completely throws the rest of what I'm about to say!

The value is exactly as you say in pinning it for consistency. The build 100% matches what you have locally, and it's always exactly the same. If you mean for users to checkout the medcat project for local dev, then I think the lock is helpful anyway so there's no way people get different results to what you get.

My experience here is npm in javascript with package-lock.json, pretty sure it exactly the same. NPM does it automatically by default every time you npm install something, so it's super standard, I'd say basically any web page these days will have a package-lock.json in their repo.

To keep consistent, I imagine UV has something to also make it default to updating it, worst case we add a precommit or something to run uv lock

@alhendrickson
Copy link
Collaborator Author

alhendrickson commented Jun 26, 2025

Oh extra - to explain why I put it in, the caching didnt work without it, it errors out saying it needs the lock file. I didn't plan to add it, though I'm pretty sure it's good practice anyway

@mart-r
Copy link
Collaborator

mart-r commented Jun 26, 2025

I guess my point was that I wasn't quite sure what the target of it was. I.e - like you said - it doesn't get propagated to the end user in any way (and we wouldn't want it to).

And the other concern was the fear of it becoming another thing to maintain. Though that's certainly a solveable problem, either through a pre-commit hook or a workflow, or perhaps even something else.
For some reason I'd lean towards having a workflow that checks the uv.lock to make sure it doesn't need upgrading and it would then fall for the dev to implement the change (though this would likely be more or less automated anyway).

With that said, I'm not sure the lock file is necessarily too useful at dev time since (as far as I know) it's not portable on different python versions or OSs or CPU architecture, or with and without GPU (cuda) stuff. Though definitely useful for debugging complex issues to ensure compatible environments.

So it sounds like we'll keep it for:

  • Caching
  • Consistency in terms of CI

And I think we should plan a for a way to ensure it's up to date in one way or the other.

EDIT:
I think the pre-commit hook isn't really doable since we can't enforce everyone doing a PR to have uv installed. And the same goes to having this updated automatically when adding a dependency. As much as you can do it with uv (in which case both pyproject.toml and uv.lock get automatically updated), you can also do it without - and I don't think we necessarily want to limit that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants