Releases: CogStack/cogstack-nlp
medcat/v2.1.0
Minor release.
Highlighted features:
- Offline loading of BERT based MetaCATs (#67, #85)
- Allow loading models with config dicts (applied before pipe init) again (#53)
- Simplified access and imports (#112, #119), e.g.:
  - `CAT.pipe` instead of `CAT._pipeline`
  - `from medcat.stats import get_stats` instead of `from medcat.stats.stats import get_stats`
- Improved supervised training flexibility by logging issues (#121)
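For code that has to run against both v2.0.x and v2.1.0, the simplified access paths can be wrapped in small fallbacks. This is a minimal sketch, not part of MedCAT: the helper names `resolve_get_stats` and `get_pipe` are hypothetical, and it assumes `CAT.pipe` is read like an attribute, as the notes above suggest.

```python
import importlib


def resolve_get_stats():
    """Return `get_stats` from the first import path that works, else None."""
    # Both module paths are taken from the release notes; the shorter one
    # is the simplified import introduced in v2.1.0.
    for module_name in ("medcat.stats", "medcat.stats.stats"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # MedCAT not installed, or this path is unavailable
        func = getattr(module, "get_stats", None)
        if func is not None:
            return func
    return None


def get_pipe(cat):
    """Prefer the public `pipe` attribute, falling back to `_pipeline`."""
    # Assumption: `pipe` is attribute-like (e.g. a property) on the CAT object.
    pipe = getattr(cat, "pipe", None)
    return pipe if pipe is not None else getattr(cat, "_pipeline", None)
```

With v2.1.0 installed, the first path in each helper should succeed, so the fallbacks only matter on older versions.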
Important bug fixes
- Stats edge cases for 0 precision and recall (#109)
- Fix issue with model pack removal upon save in some situations (#115)
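To illustrate why 0 precision and 0 recall is an edge case: F1 is `2 * P * R / (P + R)`, which divides by zero when both are 0. The sketch below guards each division. It is not MedCAT's implementation; the function name is hypothetical, and returning `0.0` for undefined values is an assumption, since the notes do not say how #109 resolves them.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from raw counts without dividing by zero."""
    # Precision is undefined when nothing was predicted (tp + fp == 0).
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall is undefined when there are no gold annotations (tp + fn == 0).
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    # F1 is undefined when precision and recall are both 0.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1
```

For example, `precision_recall_f1(0, 3, 2)` yields zeros for all three values instead of raising `ZeroDivisionError`.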
What's Changed
- Add permissions for MedCAT release by @mart-r in #42
- Remove install bundles before pushing to PyPI by @mart-r in #44
- Allow patch releases for pre-releases by @mart-r in #43
- Update to valid classifiers by @mart-r in #45
- CU-8699twteb: Update docs links to point to up to date page by @mart-r in #48
- Fix license in pyproject.toml by @mart-r in #49
- Improve some logging in MetaCAT when no suitable category is found by @mart-r in #50
- CU-8699upt9a Allow saving output onto disk when multiprocessing by @mart-r in #52
- CU-8699vkmu4: Allow load with merging config(s) by @mart-r in #53
- CU-8699vq0he: Improve addon access from CAT by @mart-r in #55
- CU-8699vnuwf Ignore hidden files when loading model packs by @mart-r in #54
- CU-8699zxxnt: Fix v2 tutorials link to point to correct version by @mart-r in #61
- Adding functionality for offline loading by @shubham-s-agarwal in #67
- Remove unnecessary method from tutorial example by @mart-r in #73
- CU-8698x63kt: Add v2 migration guide by @mart-r in #66
- CU-8699mrvup docs: update urls throughout to point to new cogstack-nlp repo by @alhendrickson in #71
- Update dependabot.yml to update github actions by @alhendrickson in #81
- Avoid v2 release workflow run in case of v1 release by @mart-r in #83
- CU-8699wc4zb Port offline BERT MetaCAT load to v2 by @mart-r in #85
- build: Update dependabot config by @alhendrickson in #91
- build: Update dependabot config. Add commit prefix by @alhendrickson in #93
- CU-869a6w9c7 Fix stats on 0 prec and 0 rec by @mart-r in #109
- CU-869a6v8qd: Fix tutorial links by @mart-r in #108
- Explicitly specify an empty HF cache during testing of offline load by @mart-r in #106
- CU-869a71q73: Rename multi-text method and deprecate old one by @mart-r in #110
- CU-869a2kpv0 Add method for model card load off disk by @mart-r in #111
- build: bump the actions-deps group with 6 updates by @dependabot[bot] in #94
- chore(medcat): CU-869a971xa: Update readme by @tomolopolis in #116
- CU-869a95nu1 Fix spacy model cleanup by @mart-r in #115
- CU-8699qzfdk Improve optional part checks by @mart-r in #113
- CU-869a7mjaa: Add simplified method of getting pipe from CAT object by @mart-r in #112
- chore(medcat): CU-869a98zwq: use old name by @tomolopolis in #118
- CU-869a9mten Improve duplicate name imports by @mart-r in #119
- CU-869a9q6rm: Include MetaCAT model cards in overall model card by @mart-r in #120
- CU-869a9w9v8: Allow a warning instead of a raised exception when doing supervised training by @mart-r in #121
- Medcat conversion model name hotfix by @mart-r in #122
New Contributors
- @shubham-s-agarwal made their first contribution in #67
- @dependabot[bot] made their first contribution in #94
Full Changelog: medcat/v2.0.0...medcat/v2.1.0
medcat/v2.0.0
We’re excited to announce the release of MedCAT v2. This is a major refactor that brings a more modular, flexible, and maintainable foundation for clinical NLP, while staying compatible with existing v1 models.
This release focuses on:
- Refactored structure for lower coupling and greater extensibility
- Modularity via optional install extras (install only what you need)
- Improved flexibility in tokenization, NER, and annotation pipelines
- Backwards compatibility for v1 models, with automatic conversion
✨ What’s New
- Decoupled from `spacy` → now possible to use lightweight regex tokenizer or other (custom) backends
- Optional extras: install support only for the components you need (`spacy`, `meta-cat`, `deid`, `rel-cat`, `dict-ner`)
- Training is now structured around dedicated classes for clearer workflows
- Tutorials and scripts have been rebuilt from the ground up for v2
- Added support for a supervised training web service (experimental, under development)
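To give a feel for what a lightweight regex tokenizer involves, here is a minimal standalone sketch. It is not MedCAT's tokenizer and does not use MedCAT's interfaces: the `Token` type, the pattern, and `regex_tokenize` are all hypothetical and only show the idea of splitting text into word and punctuation tokens with character offsets.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    text: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


# Runs of word characters, or any single non-space, non-word character.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def regex_tokenize(text: str) -> List[Token]:
    """Split text into tokens, keeping offsets so annotations can map back."""
    return [Token(m.group(), m.start(), m.end()) for m in _TOKEN_RE.finditer(text)]
```

Keeping the offsets is the main design choice here: an entity found on tokens can then be mapped back to a span of the original text.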
⚠️ Breaking Changes
- Saving/Loading:
  - Save method has a new name (`CAT.save_model_pack`)
  - v2 saves models in a new format (but still loads v1 models, with slower load times due to conversion)
- Training:
  - Training APIs now go through separate trainer classes
- Defaults:
  - Default install no longer includes spacy or advanced components (see migration guide for how to enable them)
For a complete list, see: BREAKING_CHANGES.md
📖 Migration Guide
If you’re upgrading from v1, please read the dedicated Migration Guide. It covers:
- Installation instructions
- Changes to saving/loading
- v1 model compatibility notes
- Updated tutorials and example scripts
- FAQ and troubleshooting
🔗 Useful Links
📦 PyPI
🛠️ Repository
Feedback
v2 is a big step forward, and we’d love your input!
Please open a GitHub issue or join the discussion forum for:
- Missing documentation
- Bugs or breaking behaviour
- Feedback on error/log messages
- Suggestions for future improvements
MedCAT v1.16.5
Mostly workflow changes to release #67.
What's Changed
- Adding functionality for offline loading by @shubham-s-agarwal in #67
- Fix typo in v1 production workflow by @mart-r in #80
Full Changelog: medcat/v1.16.0...medcat/v1.16.5
medcat/v2.0.0b4
There are a fair few fixes in this patch / beta release.
Most notably, it includes bug fixes and quality-of-life (QoL) improvements for multiprocessing, as well as other QoL changes.
What's Changed
- CU-8699twteb: Update docs links to point to up to date page by @mart-r in #48
- CU-8699rvhe9 Refer to PyPI medcat v2 by @mart-r in #46
- CU-8699td0xq: Move to v2 model pack by @mart-r in #47
- Fix license in pyproject.toml by @mart-r in #49
- Improve some logging in MetaCAT when no suitable category is found by @mart-r in #50
- CU-8699upt9a Allow saving output onto disk when multiprocessing by @mart-r in #52
- CU-8699vkmu4: Allow load with merging config(s) by @mart-r in #53
- CU-8699vq0he: Improve addon access from CAT by @mart-r in #55
- CU-8699vnuwf Ignore hidden files when loading model packs by @mart-r in #54
Full Changelog: medcat/v2.0.0b3...medcat/v2.0.0b4
medcat/v2.0.0b3
First official PyPI-available beta release of MedCAT 2.0.
Full Changelog: medcat/v2.0.0b2...medcat/v2.0.0b3
medcat/v2.0.0b2
Third attempt at a PyPI release for the 2.0.0 beta.
Full Changelog: medcat/v2.0.0b1...medcat/v2.0.0b2
medcat/v2.0.0b1
Second attempt at a 2.0 beta release.
What's Changed
Full Changelog: medcat/v2.0.0b0...medcat/v2.0.0b1
medcat/v2.0.0b0
First attempt at a 2.0 beta release.
What's Changed
- CU-8699nbgbh Test against changes by @mart-r in #26
- CU-8699rg5cc: Add new workflow to publish to PyPI using TPM by @mart-r in #40
Full Changelog: medcat/v0.13.4...medcat/v2.0.0b0
medcat/v0.13.5
This PR finally fixes the multiprocessing issue on Linux (#38)
Full Changelog: medcat/v0.13.4...medcat/v0.13.5
medcat/v0.13.4
Patch release to fix an issue with negative examples in supervised training data sets.