Official repository of "Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning".
[📖 Paper] [🤗 Models] [🤗 Datasets]
```shell
conda create -n mathpuma python=3.9 -y
conda activate mathpuma
pip install -r requirements.txt
```

The model weights for this project are hosted on Hugging Face.
| Model | Download |
|---|---|
| Math-PUMA_Qwen2VL-1.5B | 🤗 Hugging Face |
| Math-PUMA_Qwen2VL-7B | 🤗 Hugging Face |
| Math-PUMA_DeepSeek-Math-VL-7B | 🤗 Hugging Face |
The training data used for these models is also available on Hugging Face; you can find the dataset by visiting this link.
We leverage the fine-tuning code from two repositories:
In `./train/deepseek_math/train_script.py` or `./train/qwen2/train_script.py`:

- Set `USE_KL` to `"true"`, and set the KL hyperparameters `ALPHA_KL`, `LAMBDA_KL`, and `TEMP_KL`.
- Set `TRAINABLE_PARTS` to `"aligner, vision_tower_low, vision_tower_high"`.
- Set `DATA_PATH`; note that the data files must contain the keys `image_url_2`, `instruction_2`, and `output_2`.
- Run `./train/deepseek_math/train_script.py` or `./train/qwen2/train_script.py`.
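Since this stage requires every record to carry the paired-sample keys, a quick sanity check of the data file can save a failed run. The helper below is a hypothetical sketch, not part of the repo, and it assumes the data file is a JSON list of dicts; adapt it to the actual format expected by `DATA_PATH`.

```python
import json

# Keys that each record must contain for the KL-alignment stage,
# per the training instructions above.
REQUIRED_KEYS = {"image_url_2", "instruction_2", "output_2"}

def find_incomplete_records(path):
    """Return indices of records missing any of the required keys.

    Assumes the file is a JSON array of objects (an assumption about
    the data layout, not something the repo guarantees).
    """
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    return [i for i, rec in enumerate(records)
            if not REQUIRED_KEYS <= rec.keys()]
```

An empty result means every record has all three keys; otherwise the returned indices point at the offending records.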
In `./train/deepseek_math/train_script.py` or `./train/qwen2/train_script.py`:

- Set `USE_KL` to `"false"`.
- Set `TRAINABLE_PARTS` to `"all"`.
- Set `DATA_PATH`.
- Run `./train/deepseek_math/train_script.py` or `./train/qwen2/train_script.py`.
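The two training stages differ only in a handful of settings. As a side-by-side summary, here is a hypothetical sketch (the variable names `USE_KL` and `TRAINABLE_PARTS` come from the steps above; the stage labels and dict layout are illustrative, not from the repo):

```python
# Settings toggled between the two stages of Math-PUMA training.
# Stage 1 aligns the vision side with a KL objective; stage 2 is
# full fine-tuning with all parameters trainable.
STAGE_SETTINGS = {
    "stage1_alignment": {
        "USE_KL": "true",  # also set ALPHA_KL, LAMBDA_KL, TEMP_KL
        "TRAINABLE_PARTS": "aligner, vision_tower_low, vision_tower_high",
    },
    "stage2_full_finetune": {
        "USE_KL": "false",
        "TRAINABLE_PARTS": "all",
    },
}
```

Everything not listed here (e.g. `DATA_PATH`) is set the same way in both stages.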
Download the images of MathVerse, MathVista, and We-Math, and put them into `./eval/data/<benchmark>/images`.

In `./eval/evaluate/benchmark.py`:

- Set `benchmark` to one of `["mathverse", "mathvista", "wemath"]`.
- To evaluate the DeepSeek-Math-based MLLM, set `model_type` to `deepseek-vl`, set `is_customvlm` to `"false"`, and provide `model_path`; to evaluate the Qwen2-based MLLM or other customized MLLMs, set `is_customvlm` to `"true"` and provide `model_path`.
- Run `./eval/evaluate/benchmark.py`.
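The branching rule above can be expressed compactly. The helper below is a hypothetical sketch (not part of the repo); the setting names `benchmark`, `model_type`, `is_customvlm`, and `model_path` come from the steps above, while the `base_model` labels are assumptions for illustration:

```python
def benchmark_settings(benchmark, base_model, model_path):
    """Map a model family onto the benchmark.py settings described above."""
    if benchmark not in ["mathverse", "mathvista", "wemath"]:
        raise ValueError(f"unknown benchmark: {benchmark}")
    settings = {"benchmark": benchmark, "model_path": model_path}
    if base_model == "deepseek-math":
        # DeepSeek-Math-based MLLM uses the built-in deepseek-vl loader.
        settings.update(model_type="deepseek-vl", is_customvlm="false")
    else:
        # Qwen2-based or other customized MLLMs go through the custom path.
        settings.update(is_customvlm="true")
    return settings
```

For example, `benchmark_settings("mathverse", "qwen2", "<path>")` yields the custom-VLM configuration, while passing `"deepseek-math"` selects the `deepseek-vl` model type.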
If you find Math-PUMA useful for your research and applications, please kindly cite using this BibTeX:
```bibtex
@inproceedings{zhuang2025math,
  title={Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning},
  author={Zhuang, Wenwen and Huang, Xin and Zhang, Xiantao and Zeng, Jin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={24},
  pages={26183--26191},
  year={2025}
}
```