Commit dfdba50
merge ov evals (#144)
* chore: Update gpt_eval_model_name to "gpt-3.5-turbo" in mathvista.yaml
* Squashed commit of the following:
commit 994c9f97a2f8db3e9b7d7933d1e1680acde5b70b
Author: Yan Shu <[email protected]>
Date: Mon Jul 8 17:21:23 2024 +0800
Add files via upload
* Squashed commit of the following:
commit e31cd78
Author: Bo Li <[email protected]>
Date: Wed Jul 10 12:08:08 2024 +1000
chore: Update lmms_eval/models/vila.py and lmms_eval/tasks/__init__.py
commit 1d8c980
Author: kcz358 <[email protected]>
Date: Tue Jul 9 02:08:52 2024 +0000
Rename xcomposer 4KHD
commit 6da76f3
Author: Bo Li <[email protected]>
Date: Tue Jul 9 11:55:56 2024 +1000
Upgrade lmms-eval to version 0.2.1
commit cd18585
Author: Bo Li <[email protected]>
Date: Tue Jul 9 11:52:23 2024 +1000
Upgrade lmms-eval to support more models and evaluation tasks
commit 672d7e5
Author: Bo Li <[email protected]>
Date: Tue Jul 9 11:43:41 2024 +1000
feat: Add tie_weights parameter to Llava model initialization
commit 2037a86
Merge: e6844db a5c1869
Author: Bo Li <[email protected]>
Date: Tue Jul 9 11:37:12 2024 +1000
Fix gen kwargs image aspect ratio in internvl2
commit a5c1869
Merge: 2ebec77 557083a
Author: Li Bo <[email protected]>
Date: Tue Jul 9 09:15:56 2024 +0800
Merge pull request #137 from shuyansy/main
add MLVU task
commit 557083a
Author: Yan Shu <[email protected]>
Date: Mon Jul 8 16:56:50 2024 +0800
Add files via upload
commit 2ebec77
Merge: 211bfed b23d349
Author: Li Bo <[email protected]>
Date: Mon Jul 8 11:53:06 2024 +0800
Merge pull request #136 from Dousia/main
Add detailcaps
commit b23d349
Author: ByteDance <[email protected]>
Date: Sun Jul 7 23:24:19 2024 +0800
Add install capture_metric in env
commit c6e211d
Author: ByteDance <[email protected]>
Date: Sun Jul 7 23:04:13 2024 +0800
Add detailcaps
commit 211bfed
Merge: 7c208b7 79514ee
Author: Li Bo <[email protected]>
Date: Tue Jul 2 23:05:12 2024 +0800
Merge pull request #133 from EvolvingLMMs-Lab/dev/wild_vision
Add wild vision bench
commit 79514ee
Author: kcz358 <[email protected]>
Date: Mon Jul 1 15:10:02 2024 +0000
Fixing handling None filtered score
commit 725fac2
Author: kcz358 <[email protected]>
Date: Mon Jul 1 08:25:42 2024 +0000
Fixing dataset name
commit 8d963e1
Author: kcz358 <[email protected]>
Date: Mon Jul 1 08:24:51 2024 +0000
Fixing scoring logic
commit e2990d0
Author: kcz358 <[email protected]>
Date: Mon Jul 1 06:06:57 2024 +0000
Hardcode to keep image for wild vision
commit ed38173
Author: kcz358 <[email protected]>
Date: Mon Jul 1 06:06:38 2024 +0000
Add wild vision 0617
commit 7c208b7
Author: Li Bo <[email protected]>
Date: Mon Jul 1 11:53:31 2024 +0800
Update README.md
commit 39d40de
Merge: e19b43a ba7081c
Author: Li Bo <[email protected]>
Date: Mon Jul 1 11:47:09 2024 +0800
Merge pull request #129 from Dannoopsy/mmbench_ru
add task MMBench-ru
commit e19b43a
Merge: 11fd7e3 a0de897
Author: Li Bo <[email protected]>
Date: Mon Jul 1 11:46:58 2024 +0800
Merge pull request #128 from Dannoopsy/gqa-ru
add task gqa-ru
commit 11fd7e3
Merge: 383e7fe a752259
Author: Li Bo <[email protected]>
Date: Mon Jul 1 11:46:16 2024 +0800
Merge pull request #130 from lscpku/vitatecs
Add task VITATECS
commit a752259
Author: lscpku <[email protected]>
Date: Fri Jun 28 20:37:06 2024 +0800
create new task vitatecs
commit ba7081c
Author: Dannoopsy <[email protected]>
Date: Fri Jun 28 12:21:05 2024 +0300
change prompt to ru
commit 27ea9c0
Author: Dannoopsy <[email protected]>
Date: Thu Jun 27 17:17:29 2024 +0000
add mmbench_ru_dev
commit 383e7fe
Merge: 06fa000 ed2e7f7
Author: Li Bo <[email protected]>
Date: Fri Jun 28 00:14:10 2024 +0800
Merge pull request #126 from lorenzomammana/feature/external-package-integration
External package integration using plugins
commit ed2e7f7
Merge: 03947e1 06fa000
Author: Lorenzo Mammana <[email protected]>
Date: Thu Jun 27 15:38:10 2024 +0000
Merge branch 'main' into feature/external-package-integration
commit a0de897
Author: Dannoopsy <[email protected]>
Date: Tue Jun 25 11:11:37 2024 +0000
new task gqa-ru
commit 06fa000
Author: kcz358 <[email protected]>
Date: Tue Jun 25 06:41:13 2024 +0000
Fix vid mme post prompt issue
commit b388d79
Author: Li Bo <[email protected]>
Date: Sun Jun 23 22:31:16 2024 +0800
Update activitynetqa_generation.yaml
commit 8f9d620
Author: Li Bo <[email protected]>
Date: Sun Jun 23 14:02:25 2024 +0800
Update pyproject.toml
commit 6341b7c
Merge: fce85f1 903b042
Author: Li Bo <[email protected]>
Date: Sun Jun 23 14:02:02 2024 +0800
Merge pull request #125 from EvolvingLMMs-Lab/dev/interleave
[Model] aligned llava-interleave model results on video tasks
commit 903b042
Author: kcz358 <[email protected]>
Date: Sat Jun 22 12:07:13 2024 +0000
Remove unnecessary lines for video llava
commit d78ec86
Merge: ebe7217 fce85f1
Author: Li Bo <[email protected]>
Date: Sat Jun 22 13:57:31 2024 +0800
Merge branch 'main' into dev/interleave
commit ebe7217
Author: kcz358 <[email protected]>
Date: Sat Jun 22 02:57:08 2024 +0000
Delete unnecessary lines
commit 120c474
Author: kcz358 <[email protected]>
Date: Fri Jun 21 08:38:41 2024 +0000
Revise model registry for llava_hf and longva
commit 7d6201f
Author: kcz358 <[email protected]>
Date: Fri Jun 21 08:38:24 2024 +0000
Add longva
commit 12f4806
Author: kcz358 <[email protected]>
Date: Fri Jun 21 08:35:39 2024 +0000
Remove unnecessary lines since use batched visuals now in llava
commit 12cea76
Author: Bo Li <[email protected]>
Date: Thu Jun 20 18:15:32 2024 +0000
chore: Add loguru for logging in lmms_eval package
commit 03947e1
Author: Lorenzo Mammana <[email protected]>
Date: Wed Jun 5 13:40:41 2024 +0000
feat: Allow including external tasks from plugins
commit b80a91f
Author: Lorenzo Mammana <[email protected]>
Date: Wed Jun 5 13:04:55 2024 +0000
feat: Allow loading model configurations from other packages
commit 8ef2474
Author: Bo Li <[email protected]>
Date: Thu Jun 20 12:11:03 2024 +0000
chore: Remove unused models from lmms_eval package
commit af38885
Author: Bo Li <[email protected]>
Date: Thu Jun 20 12:07:09 2024 +0000
chore: Handle ImportError when importing models
Handle the ImportError exception when importing models in the lmms_eval package. This change adds a try-except block to catch the ImportError and print an error message indicating the failed import. This will help with troubleshooting and identifying any issues with the model imports.
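The try-except pattern this commit message describes could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from lmms_eval: the registry entries (`json`, `not_a_real_model_module`) and the helper name `import_available_models` are hypothetical, and the standard `logging` module stands in for whatever reporting mechanism the package actually uses.

```python
import importlib
import logging

logger = logging.getLogger(__name__)

# Hypothetical registry for illustration: model name -> module path.
# "json" stands in for a model module that imports cleanly;
# "not_a_real_model_module" stands in for one with a missing dependency.
AVAILABLE_MODELS = {
    "working_model": "json",
    "broken_model": "not_a_real_model_module",
}


def import_available_models(registry):
    """Import each registered module, reporting failures instead of crashing."""
    loaded = {}
    for name, module_path in registry.items():
        try:
            loaded[name] = importlib.import_module(module_path)
        except ImportError as exc:
            # Report which import failed so the problem is easy to locate.
            logger.error("Failed to import %s from %s: %s", name, module_path, exc)
    return loaded


LOADED_MODELS = import_available_models(AVAILABLE_MODELS)
```

With this shape, one model with an unmet dependency does not prevent the remaining models from being registered.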
commit fce85f1
Merge: dbe6329 d94f83c
Author: Li Bo <[email protected]>
Date: Thu Jun 20 20:02:12 2024 +0800
Merge pull request #120 from EvolvingLMMs-Lab/pufanyi/hf_dataset_docs
Add docs for datasets upload to HF
commit dbe6329
Author: choiszt <[email protected]>
Date: Thu Jun 20 15:14:21 2024 +0800
update ablation for videomme datasets
commit d94f83c
Author: Li Bo <[email protected]>
Date: Thu Jun 20 13:30:59 2024 +0800
Update README.md
commit cab8159
Author: Li Bo <[email protected]>
Date: Thu Jun 20 13:30:29 2024 +0800
Update README.md
commit 4587665
Author: kcz358 <[email protected]>
Date: Thu Jun 20 03:55:30 2024 +0000
Add llava_hf back to registry
commit 3463651
Author: kcz358 <[email protected]>
Date: Thu Jun 20 03:54:33 2024 +0000
Remove handling non-visual loop in llava
commit cb0d3f4
Author: Fanyi Pu <[email protected]>
Date: Thu Jun 20 02:11:18 2024 +0800
update readme
commit 813877b
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:37:52 2024 +0800
to sh script
commit a14684b
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:37:04 2024 +0800
lint
commit d0f8851
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:36:48 2024 +0800
small fix
commit 63748e9
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:36:43 2024 +0800
small fix
commit 7f1159a
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:35:05 2024 +0800
update preparation
commit 19f9bd6
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:23:24 2024 +0800
docs
commit ce6f889
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 15:04:16 2024 +0800
tutorial
commit f513c52
Author: Bo Li <[email protected]>
Date: Wed Jun 19 06:51:19 2024 +0000
chore: Update dependencies to fix potential risks and improve compatibility
commit efb5295
Author: kcz358 <[email protected]>
Date: Wed Jun 19 10:25:58 2024 +0800
Release llava-wilder
commit 742651f
Author: Fanyi Pu <[email protected]>
Date: Wed Jun 19 07:44:26 2024 +0800
feat: Add support for auto downloading tar format videos
commit 511b625
Merge: 22a4958 050b2c3
Author: Bo Li <[email protected]>
Date: Tue Jun 18 17:01:03 2024 +0000
Merge branch 'main' of https://github.com/EvolvingLMMs-Lab/lmms-eval
commit 050b2c3
Merge: 74facb4 ef30651
Author: Li Bo <[email protected]>
Date: Tue Jun 18 13:13:38 2024 +0800
Merge pull request #114 from zjysteven/add-tinyllava
add tinyllava
commit ef30651
Author: Jingyang Zhang <[email protected]>
Date: Mon Jun 17 17:57:02 2024 -0400
fix typo
commit 9bab677
Merge: dbfb238 74facb4
Author: Jingyang Zhang <[email protected]>
Date: Sun Jun 16 10:56:05 2024 -0400
Merge branch 'EvolvingLMMs-Lab:main' into add-tinyllava
commit 74facb4
Merge: 8ba192f d5df72d
Author: Li Bo <[email protected]>
Date: Sun Jun 16 17:59:19 2024 +0800
Merge pull request #118 from teowu/main
Fix the potential risk by PR #117
commit d5df72d
Merge: 5bf59ed 8ba192f
Author: Teo (Timothy) Wu Haoning <[email protected]>
Date: Sun Jun 16 15:32:13 2024 +0800
Merge branch 'EvolvingLMMs-Lab:main' into main
commit 5bf59ed
Author: teowu <[email protected]>
Date: Sun Jun 16 07:27:28 2024 +0000
fix #117, allow auto download with tar format videos
commit 98b3955
Merge: a056f11 be9dada
Author: teowu <[email protected]>
Date: Sun Jun 16 07:25:07 2024 +0000
Merge branch 'main' of https://github.com/teowu/lmms-eval into main
commit a056f11
Author: teowu <[email protected]>
Date: Sun Jun 16 07:23:54 2024 +0000
fix #117, allow auto download with tar format videos
commit 8ba192f
Merge: 7cc2890 be9dada
Author: Li Bo <[email protected]>
Date: Sat Jun 15 17:30:59 2024 +0800
Merge pull request #117 from teowu/main
LongVideoBench for LMMs-Eval
commit be9dada
Merge: 62ea8ce 7cc2890
Author: Teo (Timothy) Wu Haoning <[email protected]>
Date: Sat Jun 15 16:39:20 2024 +0800
Merge pull request #1 from EvolvingLMMs-Lab/main
Merge pull request #113 from teowu/main
commit 62ea8ce
Author: teowu <[email protected]>
Date: Sat Jun 15 08:30:11 2024 +0000
LongVideoBench support: image LMMs (idefics2, phi3) and video LMMs (LLaVA-Next-Video-34B)
commit 7cc2890
Merge: 4bc7224 ea14cd4
Author: Li Bo <[email protected]>
Date: Sat Jun 15 14:10:22 2024 +0800
Merge pull request #113 from teowu/main
Q-Bench, Q-Bench2, A-Bench
commit dbfb238
Author: Jingyang <[email protected]>
Date: Fri Jun 14 16:20:42 2024 -0400
add tinyllava
commit ea14cd4
Author: teowu <[email protected]>
Date: Fri Jun 14 15:01:52 2024 +0000
Add qbench, qbench2, abench; fix phi3v as its current implementation does not support multi-image
commit 4bc7224
Merge: 2797987 bf14cb8
Author: Li Bo <[email protected]>
Date: Fri Jun 14 02:14:43 2024 +0800
Merge pull request #111 from XinrunDu/main
add II-Bench
commit bf14cb8
Author: XinrunDu <[email protected]>
Date: Thu Jun 13 09:37:02 2024 +0000
fix dataset_path
commit 6248113
Author: XinrunDu <[email protected]>
Date: Thu Jun 13 09:32:06 2024 +0000
add II-Bench
commit 2797987
Merge: 63d82f1 66d4bb2
Author: Li Bo <[email protected]>
Date: Thu Jun 13 11:14:47 2024 +0800
Merge pull request #109 from EvolvingLMMs-Lab/pufanyi/update_version
[Small Update] Update the version of LMMs-Eval
commit 66d4bb2
Author: Fanyi Pu <[email protected]>
Date: Thu Jun 13 11:13:00 2024 +0800
update version
commit 63d82f1
Author: Li Bo <[email protected]>
Date: Thu Jun 13 11:04:32 2024 +0800
Update README.md
commit 44a3379
Merge: 5ed0035 0ce46d0
Author: Li Bo <[email protected]>
Date: Thu Jun 13 04:00:12 2024 +0800
Merge pull request #105 from tianyu-z/main
Include VCR
commit 0ce46d0
Author: Suyuchen <[email protected]>
Date: Wed Jun 12 15:56:34 2024 -0400
update README.md
commit 46a88d8
Merge: 47b13b9 5ed0035
Author: Suyuchen <[email protected]>
Date: Wed Jun 12 15:50:26 2024 -0400
merged readme.md
commit 47b13b9
Author: Suyuchen <[email protected]>
Date: Wed Jun 12 15:30:52 2024 -0400
update aggregation function for vcr_wiki
commit 5ed0035
Author: Li Bo <[email protected]>
Date: Thu Jun 13 03:21:42 2024 +0800
Update README.md
commit ed88068
Author: Li Bo <[email protected]>
Date: Thu Jun 13 03:13:59 2024 +0800
Update README.md
commit fea3806
Merge: d99a24a 05dc8e8
Author: Li Bo <[email protected]>
Date: Thu Jun 13 03:11:49 2024 +0800
Merge pull request #108 from EvolvingLMMs-Lab/internal_main_dev
[Upgrade to v0.2] Embracing Video Evaluations with LMMs-Eval
commit 05dc8e8
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:56:04 2024 +0000
chore: Update lmms-eval to support video evaluations for LLaVA models
commit cbeee20
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:50:30 2024 +0000
chore: Update lmms-eval to support video evaluations for LLaVA models
commit f00d549
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:46:33 2024 +0000
Update image alignment in README.md
commit 3415633
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:43:16 2024 +0000
Update llava conv_template in lmms_eval/models/llava.py
commit 50575a9
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:39:03 2024 +0000
chore: Update lmms-eval to support video evaluations for LLaVA models
commit c9b2252
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:33:48 2024 +0000
Bump version to 0.2.0.dev0
commit 465bd42
Merge: e43bd84 d99a24a
Author: Bo Li <[email protected]>
Date: Wed Jun 12 15:04:25 2024 +0000
Merge branch 'main' of https://github.com/EvolvingLMMs-Lab/lmms-eval into internal_main_dev
commit e43bd84
Author: Bo Li <[email protected]>
Date: Wed Jun 12 14:54:06 2024 +0000
chore: Remove unnecessary files and code related to live_bench and sft_eval tasks
commit d99a24a
Merge: 374590b a66003b
Author: Li Bo <[email protected]>
Date: Wed Jun 12 19:45:57 2024 +0800
Merge pull request #107 from AtsuMiyai/new_task/upd_update
update gpt-3.5-turbo version
commit a66003b
Author: AtsuMiyai <[email protected]>
Date: Wed Jun 12 17:05:17 2024 +0900
update gpt-3.5-turbo version
commit ee91f27
Author: AtsuMiyai <[email protected]>
Date: Wed Jun 12 16:50:53 2024 +0900
update gpt-3.5-turbo version
commit 326b969
Author: tianyu-z <[email protected]>
Date: Mon Jun 10 20:07:40 2024 -0400
include std and confidence interval
commit cd050d4
Author: Suyuchen <[email protected]>
Date: Mon Jun 10 18:49:47 2024 -0400
update vcr_wiki tasks in README.md
commit 205721e
Author: Suyuchen <[email protected]>
Date: Mon Jun 10 18:43:15 2024 -0400
update vcr_wiki tasks
commit db8e718
Author: tianyu-z <[email protected]>
Date: Mon Jun 10 16:13:58 2024 -0400
include the try-except logic for spacy
commit 427dabb
Author: Suyuchen <[email protected]>
Date: Mon Jun 10 15:51:05 2024 -0400
add crossed_text to vcr_wiki output
commit 043b483
Author: tianyu-z <[email protected]>
Date: Mon Jun 10 15:47:00 2024 -0400
switch logic
commit e1f04db
Author: tianyu-z <[email protected]>
Date: Mon Jun 10 02:38:21 2024 -0400
modify the form of VCR
commit 96e8d98
Author: tianyu-z <[email protected]>
Date: Mon Jun 10 00:10:30 2024 -0400
init include vcr
commit 374590b
Merge: 504685e cb3b9ce
Author: Kaichen Zhang - NTU <[email protected]>
Date: Fri Jun 7 20:25:48 2024 +0800
Merge pull request #101 from Gumpest/main
Update conbench in README
commit 504685e
Author: Li Bo <[email protected]>
Date: Thu Jun 6 15:42:15 2024 +0800
Update README.md
commit cb3b9ce
Merge: c9793b3 67b64ea
Author: Yuan Zhang <[email protected]>
Date: Thu Jun 6 11:22:24 2024 +0800
Merge branch 'EvolvingLMMs-Lab:main' into main
commit c9793b3
Author: Yuan Zhang <[email protected]>
Date: Thu Jun 6 11:21:05 2024 +0800
update README
commit 67b64ea
Merge: 8ee7848 5fd6845
Author: Li Bo <[email protected]>
Date: Wed Jun 5 23:12:58 2024 +0800
Merge pull request #100 from Gumpest/main
add Conbench
commit 5fd6845
Author: Yuan Zhang <[email protected]>
Date: Wed Jun 5 21:52:31 2024 +0800
add conbench
commit 8ee7848
Merge: 747e197 6fefaf7
Author: Li Bo <[email protected]>
Date: Tue Jun 4 17:09:33 2024 +0800
Merge pull request #95 from AtsuMiyai/new_task/upd
add MM-UPD
commit 747e197
Merge: 4854a34 0584307
Author: Li Bo <[email protected]>
Date: Tue Jun 4 17:09:04 2024 +0800
Merge pull request #97 from CaraJ7/update
Add MathVerse in README.md
commit 6fefaf7
Author: AtsuMiyai <[email protected]>
Date: Tue Jun 4 17:36:39 2024 +0900
update utils.py for leaderboard submission
commit 5f4fe36
Author: AtsuMiyai <[email protected]>
Date: Sun Jun 2 23:28:27 2024 +0900
slightly change query_prompt for the reproduction
commit 0584307
Author: CaraJ7 <[email protected]>
Date: Sun Jun 2 17:05:28 2024 +0800
Add MathVerse in README.md
commit 0581ab3
Author: AtsuMiyai <[email protected]>
Date: Fri May 31 16:09:45 2024 +0900
merge model_specific_prompt_kwargs and dataset_name into each task yaml
commit 4854a34
Author: Pu Fanyi <[email protected]>
Date: Sat May 4 19:23:39 2024 +0800
Group MMMU images into one image (#83)
* update
* update font
* Add matplotlib.font_manager import in utils.py
* Refactor font handling in add_order_label function in utils.py
* group mmmu
---------
Co-authored-by: Li Bo <[email protected]>
commit d224794
Author: AtsuMiyai <[email protected]>
Date: Wed May 29 15:15:59 2024 +0900
add upd
commit 453e793
Author: AtsuMiyai <[email protected]>
Date: Wed May 29 15:03:30 2024 +0900
add upd
commit 909edd6
Author: AtsuMiyai <[email protected]>
Date: Wed May 29 12:52:21 2024 +0900
add upd
commit 7c1ac97
Author: AtsuMiyai <[email protected]>
Date: Wed May 29 12:50:32 2024 +0900
add upd
commit 811301c
Author: AtsuMiyai <[email protected]>
Date: Wed May 29 12:46:58 2024 +0900
add upd
commit 71401ba
Author: AtsuMiyai <[email protected]>
Date: Wed May 29 12:41:21 2024 +0900
add upd
commit 24dc435
Author: Bo Li <[email protected]>
Date: Mon May 27 10:17:32 2024 +0000
fix compatibility issue of older version llava
commit 616edf4
Author: Bo Li <[email protected]>
Date: Mon May 27 09:32:26 2024 +0000
[Fix] import issues of multilingual llava and olympiadbench
commit 4c5a99e
Merge: 45c05b2 b05c3e2
Author: Li Bo <[email protected]>
Date: Mon May 27 14:19:53 2024 +0800
Merge pull request #87 from vfragoso/vifragos/phi3v
Adding microsoft/Phi-3-vision-128k-instruct model.
commit b05c3e2
Author: Victor Fragoso <[email protected]>
Date: Fri May 24 16:36:37 2024 +0000
Adding documentation of Phi3v class.
commit c200897
Author: Victor Fragoso <[email protected]>
Date: Fri May 24 16:25:02 2024 +0000
Adding prompt arguments for Phi3v on MathVista-TestMini
commit 7f9fb6b
Author: Victor Fragoso <[email protected]>
Date: Fri May 24 13:24:16 2024 +0000
Adding Phi3v model.
commit 45c05b2
Author: kcz358 <[email protected]>
Date: Thu May 23 03:47:36 2024 +0000
Set printing info for llava_hf to debug level
commit 53f013e
Author: kcz358 <[email protected]>
Date: Thu May 23 03:41:39 2024 +0000
Fix pope random name in pope full
commit 22520a9
Author: kcz358 <[email protected]>
Date: Thu May 23 03:41:14 2024 +0000
Add separated pope tasks by category
commit d1eefb1
Author: kcz358 <[email protected]>
Date: Thu May 9 08:36:02 2024 +0000
Update gitignore
commit b2b4dbd
Author: kcz358 <[email protected]>
Date: Mon May 20 07:45:11 2024 +0000
Comment out Spice in caption task so that don't need to download stanford nlp model
commit 662f05c
Author: kcz358 <[email protected]>
Date: Mon May 20 03:13:13 2024 +0000
Comment out parse result in xcomposer
commit 0932932
Author: kcz358 <[email protected]>
Date: Thu May 16 03:55:39 2024 +0000
Fix instructblip qformer size mismatch and multi-images problem
commit 557a6a3
Author: kcz358 <[email protected]>
Date: Thu May 16 03:11:41 2024 +0000
Remove redundant code in fuyu
commit 6aeb550
Author: kcz358 <[email protected]>
Date: Thu May 16 01:45:24 2024 +0000
Fix idefics2 llava in the wild bugs
commit aea80e6
Author: kcz358 <[email protected]>
Date: Wed May 15 11:07:35 2024 +0000
Better task list_with_num
commit 3c12a08
Author: Li Bo <[email protected]>
Date: Sat May 18 02:35:52 2024 +0800
Update LICENSE
commit 82317a6
Author: Li Bo <[email protected]>
Date: Sat May 18 02:29:09 2024 +0800
Update LICENSE
commit a8bba1c
Author: Li Bo <[email protected]>
Date: Sat May 18 02:28:03 2024 +0800
Create LICENSE
commit caa5893
Merge: c094448 423b006
Author: Li Bo <[email protected]>
Date: Mon May 13 11:45:26 2024 +0800
Merge pull request #73 from EvolvingLMMs-Lab/kc/qwen_vl_api
[Feat] Add qwen vl api
commit c094448
Author: kcz358 <[email protected]>
Date: Sat May 11 06:11:19 2024 +0000
Fix llava_hf image tokens number issue
commit 64f07e4
Author: kcz358 <[email protected]>
Date: Thu May 9 02:04:10 2024 +0000
Fix endless warning for llava_hf generation
commit 8aaa828
Author: Bo Li <[email protected]>
Date: Thu May 2 06:13:56 2024 +0000
Add model_name parameter to Llava constructor
commit 7847dc4
Author: kcz358 <[email protected]>
Date: Tue May 7 03:15:59 2024 +0000
Parse result for llava_hf 1.6
commit 3e56b4f
Author: kcz358 <[email protected]>
Date: Tue May 7 03:09:56 2024 +0000
Fix llava_hf generation for 1.6
commit fa3ff92
Author: kcz358 <[email protected]>
Date: Mon May 6 08:32:57 2024 +0000
Fix llava conv template for llama3
commit 423b006
Author: kcz358 <[email protected]>
Date: Sun May 5 07:54:52 2024 +0000
Add qwen vl api
commit b7fd7a9
Merge: 986139a c5a130b
Author: Li Bo <[email protected]>
Date: Sun May 5 13:19:48 2024 +0800
Merge pull request #59 from EvolvingLMMs-Lab/add_idefics2
add idefics2
commit 986139a
Merge: b46239c 8d3526c
Author: Li Bo <[email protected]>
Date: Fri May 3 01:18:18 2024 +0800
Merge pull request #36 from cocoshe/main
[Fix] repr llava doc
commit b46239c
Merge: bc69a74 373265f
Author: Li Bo <[email protected]>
Date: Fri May 3 01:17:34 2024 +0800
Merge pull request #56 from gagan3012/main
Multilingual LLava bench
commit bc69a74
Merge: eef3aeb 626e8a9
Author: Li Bo <[email protected]>
Date: Fri May 3 01:12:14 2024 +0800
Merge pull request #70 from hunterheiden/hsh/new_task/WebSRC
Bugfix: WebSRC should be token-level F1 NOT character-level
commit 626e8a9
Author: Hunter Heidenreich <[email protected]>
Date: Thu May 2 09:31:03 2024 -0400
Bugfix: WebSRC should be token-level F1 NOT character-level
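The difference between token-level and character-level F1 named in this bugfix can be sketched as follows. This is a minimal illustration, assuming plain whitespace tokenization and no answer normalization; the function names are hypothetical and the actual WebSRC scoring code may differ.

```python
from collections import Counter


def f1_score(pred_units, gold_units):
    """F1 over two sequences of units (tokens or characters), counting multiplicity."""
    overlap = sum((Counter(pred_units) & Counter(gold_units)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_units)
    recall = overlap / len(gold_units)
    return 2 * precision * recall / (precision + recall)


def token_f1(prediction, reference):
    # Token-level: compare whitespace-separated words.
    return f1_score(prediction.split(), reference.split())


def char_f1(prediction, reference):
    # Character-level: iterating over a string yields single characters,
    # which is what happens when the split() step is missing.
    return f1_score(list(prediction), list(reference))


token_score = token_f1("sign in", "sign up")  # one of two tokens matches
char_score = char_f1("sign in", "sign up")    # five of seven characters match
```

In this example the character-level score is higher than the token-level one, because answers that share letters get credit even when the words differ.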
commit eef3aeb
Merge: c4e9dd9 9bca441
Author: Li Bo <[email protected]>
Date: Thu May 2 14:38:17 2024 +0800
Merge pull request #69 from hunterheiden/hsh/new_task/WebSRC
[New Task] WebSRC (multimodal Q&A on web screenshots)
commit 9bca441
Author: Hunter Heidenreich <[email protected]>
Date: Wed May 1 11:07:29 2024 -0400
Add code to enable compilation of submission for WebSRC test split
commit 7687495
Author: Hunter Heidenreich <[email protected]>
Date: Wed May 1 10:47:32 2024 -0400
Draft and validate websrc eval on dev split
commit 4eebd3e
Author: Hunter Heidenreich <[email protected]>
Date: Wed May 1 10:46:54 2024 -0400
Update main README with new task names
commit 35fe80b
Author: Hunter Heidenreich <[email protected]>
Date: Wed May 1 10:46:20 2024 -0400
Draft README for WebSRC
commit 955bd06
Author: Hunter Heidenreich <[email protected]>
Date: Tue Apr 30 10:16:21 2024 -0400
Init webSRC
commit c4e9dd9
Merge: d8a3a99 319afcc
Author: Li Bo <[email protected]>
Date: Fri Apr 26 14:37:22 2024 +0800
Merge pull request #63 from hunterheiden/hsh/new_task/screenspot
New Task: ScreenSpot - Grounding (REC) and instruction generation (REG) on screens
commit 319afcc
Author: Hunter Heidenreich <[email protected]>
Date: Thu Apr 25 11:44:34 2024 -0400
slight update
commit 2f3811c
Author: Hunter Heidenreich <[email protected]>
Date: Thu Apr 25 11:41:04 2024 -0400
Add README file specific to ScreenSpot
commit 28962cb
Author: Hunter Heidenreich <[email protected]>
Date: Wed Apr 24 11:52:33 2024 -0400
Update README to reflect new tasks
commit e457cfb
Author: Hunter Heidenreich <[email protected]>
Date: Tue Apr 23 18:33:16 2024 -0400
Create ScreenSpot on clean branch
commit d8a3a99
Merge: 3dcd015 ed17129
Author: Li Bo <[email protected]>
Date: Tue Apr 23 10:34:03 2024 +0800
Merge pull request #61 from tupini07/patch-1
Fix typo in Qwen-VL that was causing "reference before assignment"
commit ed17129
Author: Andrea Tupini <[email protected]>
Date: Mon Apr 22 14:56:41 2024 -0600
refactor query construction for clarity
commit cd87420
Author: Andrea Tupini <[email protected]>
Date: Mon Apr 22 14:54:29 2024 -0600
convert contexts to list if necessary and remove unnecessary construction of `questions`
commit 8557367
Author: Andrea Tupini <[email protected]>
Date: Mon Apr 22 14:47:33 2024 -0600
Fix typo in qwen_vl that was causing "reference before assignment"
commit 3dcd015
Merge: 95df9fe 743673a
Author: Li Bo <[email protected]>
Date: Sat Apr 20 22:03:16 2024 +0800
Merge pull request #60 from CaraJ7/main
Add MathVerse
commit 743673a
Merge: c1a5472 95df9fe
Author: CaraJ7 <[email protected]>
Date: Sat Apr 20 21:49:02 2024 +0800
Merge branch 'main' of https://github.com/EvolvingLMMs-Lab/lmms-eval
commit c1a5472
Author: CaraJ7 <[email protected]>
Date: Sat Apr 20 21:45:34 2024 +0800
Add MathVerse
commit 373265f
Author: Gagan Bhatia <[email protected]>
Date: Fri Apr 12 17:21:39 2024 -0700
Add files via upload
commit d853051
Author: Gagan Bhatia <[email protected]>
Date: Fri Apr 12 17:19:49 2024 -0700
Create README.md
commit 22a4958
Author: Bo Li <[email protected]>
Date: Thu Apr 4 17:12:43 2024 +0000
[WIP] adding mmbench dev evaluation (#75)
* WIP
* Update GPT evaluation model name and sys prompt
* 🛠️ Scale accuracy to percentage
The accuracy value is now multiplied by 100 in the aggregation function to represent it as a percentage. Regarding the evaluation process, `math` module importation and refactoring reduce progress log verbosity by logging every 100 evaluations instead of 10. It prevents potential logging overflow. Handling of NaN values is added to ensure 'default_value' is set in case of missing data, avoiding errors in split, category, and l2-category assignments. Finally, reporting of categorical and l2-categorical accuracies is streamlined through a new `calculate_hit_rates` function, improving code readability and maintenance.
Issue refs: #1427, #1533
* Update GPT evaluation model name and API configuration
* Refactor MMBench_Evaluator class to handle missing columns
* Add print statements for detailed results in MMBench-CN(CC), MMBench-CN(Dev), and MMBench-EN(Dev) evaluations
* Refactor MMBench-CN and MMBench-EN evaluation functions
* 🔄 Refactor result processing and logging logic
- Simplified the result processing functions across different utility modules (`cc_utils.py`, `cn_utils.py`, `en_utils.py`) to unify the handling of multiple-choice options. Now, all options ("A" to "E") are dynamically added to the result data, and default to "nan" if not provided in the document.
- Removed redundant keys directly from the process results dict creation to avoid clutter and align with the new dynamic addition of options.
- In `mmbench_evals.py`, removed the unnecessary check for all splits being 'dev' and streamlined the evaluation loop by eliminating the progress bar (tqdm) for a cleaner log output.
- Commented-out code and verbose logging during evaluation, which may have interfered with performance, has been removed for a more efficient and less intrusive logging experience.
This cleanup reduces redundancy in the codebase and improves evaluation performance.
Refs #2045
---------
Co-authored-by: Bo Li <[email protected]>
(cherry picked from commit a19278c)
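Two of the changes described in this commit message, defaulting missing options "A" to "E" to "nan" and reporting per-category accuracy as a percentage, might be sketched like this. The function name `calculate_hit_rates` comes from the message, but its signature here, the record layout, and the helpers `collect_options` and `clean_field` are assumptions made for illustration.

```python
import math

OPTION_LABELS = ["A", "B", "C", "D", "E"]


def collect_options(doc):
    """Return every option A-E, using the string "nan" for options the doc lacks."""
    return {label: doc.get(label, "nan") for label in OPTION_LABELS}


def clean_field(value, default_value="default_value"):
    """Replace a missing or NaN field with a placeholder so grouping does not fail."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return default_value
    return value


def calculate_hit_rates(records, key):
    """Group records by `key` and return accuracy per group, scaled to 0-100."""
    totals, hits = {}, {}
    for record in records:
        group = clean_field(record.get(key))
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + int(record["hit"])
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


# Hypothetical records: two in the "ocr" category, one with a NaN category.
records = [
    {"category": "ocr", "hit": 1},
    {"category": "ocr", "hit": 0},
    {"category": float("nan"), "hit": 1},
]
rates = calculate_hit_rates(records, "category")
options = collect_options({"A": "cat", "B": "dog"})
```

The same helper can be called once per grouping key (for example a category and an l2-category column), which is one way a single function could replace duplicated reporting code.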
commit 8d3526c
Author: cocoshe <[email protected]>
Date: Thu Mar 28 13:38:36 2024 +0800
fix doc
* feat: Add LlavaOneVision model to available models
chore: Update sqlitedict dependency to version 2.1.0
* Revert "Squashed commit of the following:"
This reverts commit 11b00999df3c43cb225482e030b791b2d454124c.
* Refactor available models in lmms_eval
Remove duplicate entries for "llava_hf", "llava_onevision", and "longva" in the AVAILABLE_MODELS dictionary in lmms_eval/models/__init__.py.
* fix: Handle import errors in lmms_eval models/__init__.py
The code changes in this commit fix the handling of import errors in the lmms_eval/models/__init__.py file. Previously, when an import error occurred, the code simply ignored it. This commit updates the code to log an error message using the logger module when an import error occurs.
This commit also removes duplicate entries for "llava_hf", "llava_onevision", and "longva" in the AVAILABLE_MODELS dictionary.
Recent user commits:
- Refactor available models in lmms_eval
- Revert "Squashed commit of the following:"
- feat: Add LlavaOneVision model to available models
- chore: Update sqlitedict dependency to version 2.1.0
* fix: Handle import errors in lmms_eval models/__init__.py
* chore: Remove unused imports in lmms_eval/models/__init__.py and lmms_eval/tasks/vcr_wiki/utils.py
* Remove unused imports in lmms_eval/tasks/vcr_wiki/utils.py
* chore: Update lmms_eval/tasks/vcr_wiki/utils.py
This commit updates the `lmms_eval/tasks/vcr_wiki/utils.py` file. It removes unused imports and fixes the condition for loading Spacy models based on the `load_package` value in the config file. Additionally, it adds a debug log message when the Spacy models are not loaded due to `load_package` being set to False.
Remove unused imports in `lmms_eval/tasks/vcr_wiki/utils.py`
* feat: Add new subtasks to overall score calculation
The code changes in this commit add new subtasks to the overall score calculation in the `overall_score` function. The subtasks "ScanQA", "BLINK", "MathVerse", "SciVerse", and "Mantis" are included in the `categories` dictionary. This ensures that the scores for these subtasks are calculated and included in the evaluation results.
Remove unused imports and update subtask categories in `utils.py`
* feat: Add new subtasks to overall score calculation
* chore: Update lmms_eval/tasks/llava_interleave_bench/_default_template_interleave_yaml
Update the image aspect ratio in the default template for the llava_interleave_bench task. Change the value of "image_aspect_ratio" from "original" to "pad". This ensures that the generated images have a padded aspect ratio.
* if no response directly return 0
* Squashed commit of the following:
commit b2a009b
Author: Pu Fanyi <[email protected]>
Date: Mon Jul 15 19:12:25 2024 -0700
if no response directly return 0 (#142)
commit 5fc5f2f
Author: Kaichen Zhang - NTU <[email protected]>
Date: Tue Jul 16 10:12:11 2024 +0800
Add Muirbench (#143)
* handle gen kwargs in internvl2
* Add muirbench
* Add files via upload
(cherry picked from commit 557083a)
* update
---------
Co-authored-by: Fanyi Pu <[email protected]>
Co-authored-by: Yan Shu <[email protected]>
1 parent b2a009b commit dfdba50
File tree
21 files changed (+328, -107 lines)
- lmms_eval
  - models
  - tasks
    - llava_interleave_bench
    - mathvista
    - mlvu
    - muirbench
    - olympiadbench
    - vcr_wiki
    - vibe_eval
    - wild_vision_bench