Machine Learning toolkit for Natural Language Processing.
Written for LxMLS - Lisbon Machine Learning Summer School
- Scientific Python and Mathematical background
- Linear Classifiers (Gradient Descent)
- Feed-forward models in deep learning (Backpropagation)
- Sequence models in deep learning
- Attention Models (Transformers)
- Multimodal Models
Note
Bear in mind that the main purpose of the toolkit is educational. You may resort to other toolboxes if you are looking for efficient implementations of the algorithms described.
Important
Use the student branch, not this one 🚨!
Download the code. If you are used to git, just clone the student branch. For example, from the command line run
git clone https://github.com/LxMLS/lxmls-toolkit.git lxmls-toolkit-student
cd lxmls-toolkit-student
git checkout student
Install uv
Linux and MacOS
curl -LsSf https://astral.sh/uv/install.sh | sh
Windows
Open Command Prompt (search for cmd) to run the following commands.
First, check if your system has git using
git --version
If git isn't installed, run the following command to install it
winget install Git.Git
Then, install uv using
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
If that errors out, try
winget install astral-sh.uv
Reference
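If you do not have the proper Python version, install it with
uv python install 3.12
If you have an Nvidia GPU, get the CUDA version with
nvidia-smi or nvcc --version.
nvidia-smi reports the driver version and the highest CUDA version that driver supports; nvcc --version reports the installed CUDA toolkit version, if the toolkit is present.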
Reference 
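The CUDA version found above decides which torch build to request in the next step. Below is a minimal sketch of that mapping for Linux and MacOS shells. The function name pick_torch_extra is hypothetical and not part of the toolkit, and the rule it applies (take the newest listed build that does not exceed the reported CUDA version, otherwise fall back to cpu) is an assumption rather than something this README states. Only the extra names cpu, cu118, cu124 and cu126 come from the toolkit.

```shell
# Hypothetical helper, not part of the toolkit: map a CUDA version such as
# "12.4" to one of the extras listed in this README.
# Assumption: the newest listed build that does not exceed the reported
# CUDA version is the right choice; anything older falls back to cpu.
pick_torch_extra() {
    version="$1"
    # No version given (for example, no Nvidia GPU): use the CPU build.
    if [ -z "$version" ]; then
        echo cpu
        return 0
    fi
    case "$version" in
        *.*)
            major="${version%%.*}"   # text before the first dot
            rest="${version#*.}"     # text after the first dot
            minor="${rest%%.*}"      # drop any patch component
            ;;
        *)
            major="$version"
            minor=0
            ;;
    esac
    # Encode as one integer so versions compare numerically (12.4 -> 1204).
    code=$((major * 100 + minor))
    if [ "$code" -ge 1206 ]; then
        echo cu126
    elif [ "$code" -ge 1204 ]; then
        echo cu124
    elif [ "$code" -ge 1108 ]; then
        echo cu118
    else
        echo cpu
    fi
}
```

With the function defined, pick_torch_extra 12.4 prints cu124, and the result can be passed to uv sync --extra. On a machine with an Nvidia driver the version could be read with nvidia-smi | sed -n 's/.*CUDA Version: \([0-9.]*\).*/\1/p', assuming the usual "CUDA Version: X.Y" header in the nvidia-smi output.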
Choose the torch index based on your system and set up the environment, picking exactly one of the extras listed in braces:
uv sync --extra {cpu, cu118, cu124, cu126}
For example, if you're on MacOS you'd use
uv sync --extra cpu
Activate the virtual environment with
Linux and MacOS
source ./.venv/bin/activate
Windows
.venv\Scripts\activate
Important
Remember to run scripts from the root directory lxmls-toolkit-student
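A quick pre-flight check can catch the two most common setup slips before running a script: being in the wrong directory and forgetting to activate the environment. The sketch below is for Linux and MacOS shells. The function name check_lxmls_env is hypothetical and not part of the toolkit, and it assumes that the repository root is the directory containing pyproject.toml (which uv sync relies on) and that activation sets the VIRTUAL_ENV variable, as the standard activate scripts do.

```shell
# Hypothetical helper, not part of the toolkit: verify the working directory
# and the virtual environment before running toolkit scripts.
check_lxmls_env() {
    status=0
    # Assumption: the repository root is where pyproject.toml lives.
    if [ ! -f pyproject.toml ]; then
        echo "Not in the repository root (no pyproject.toml here)." >&2
        status=1
    fi
    # The activate script exports VIRTUAL_ENV; empty or unset means inactive.
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "Virtual environment is not active." >&2
        status=1
    fi
    return "$status"
}
```

The function returns 0 only when both checks pass, so it can guard a command, for example check_lxmls_env && pytest -m "not gpu" -n auto.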
Note
The following instructions are for developers building the toolkit.
Install the ruff linter & ty type-checker with
uv sync --extra dev
To run all tests, install pytest
uv sync --extra test
and run
pytest -m "not gpu" -n auto
Run the GPU-intensive tests with a single worker using
pytest -m gpu -n 1