Commit 1e3ddcb
ModernBERT bug fixes (#35404)
* bug fixes
* organize imports
* wrap cpu warning in reference_compile
* Avoid needing repad_logits_with_grad, always repad with grads when training
I'm not 100% sure that the conditional with "or labels is None" makes sense though; I'm not sure what the intention is there. Perhaps we can remove that?
* Revert "Avoid needing repad_logits_with_grad, always repad with grads when training"
This reverts commit cedcb4e.
* Fix grammar: keep -> keeps
* Propagate grammar fix with modular_model_converter
---------
Co-authored-by: Tom Aarsen <[email protected]>
Co-authored-by: Tom Aarsen <[email protected]>
1 parent e97d7a5 commit 1e3ddcb
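The reverted change in the commit message concerns when padded logits are rebuilt with gradients. Below is a minimal, framework-free sketch of that idea. It assumes the conditional has the form `repad_logits_with_grad or labels is None`, as the message suggests. The helper names `repad` and `needs_grad_on_repad` are hypothetical and are not taken from the ModernBERT source. In PyTorch, the branch that does not keep gradients would typically run under `torch.no_grad()`.

```python
def repad(unpadded, indices, batch, seqlen, fill=0.0):
    """Scatter per-token values back into a padded (batch, seqlen) layout.

    `indices` holds each token's position in the flattened
    (batch * seqlen) sequence; all other positions receive `fill`.
    """
    flat = [fill] * (batch * seqlen)
    for value, idx in zip(unpadded, indices):
        flat[idx] = value
    # Cut the flat list back into one row per sequence in the batch.
    return [flat[row * seqlen:(row + 1) * seqlen] for row in range(batch)]


def needs_grad_on_repad(repad_logits_with_grad, labels):
    """Assumed form of the conditional questioned in the commit message.

    Gradients are kept when the config flag is set, or when no labels
    are passed (so no loss is computed inside the model).
    """
    return repad_logits_with_grad or labels is None


# Two sequences of max length 3; real tokens sit at flat positions 0, 1, 3.
padded = repad([0.5, 0.7, 0.9], indices=[0, 1, 3], batch=2, seqlen=3)
print(padded)
print(needs_grad_on_repad(False, labels=None))
```

With labels present and the flag off, `needs_grad_on_repad` returns `False`, which corresponds to the case where repadding could skip gradient tracking.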
File tree
5 files changed: +53 -19 lines changed
- docs/source/en
  - model_doc
- src/transformers/models/modernbert
Lines changed: 5 additions & 0 deletions
Lines changed: 20 additions & 8 deletions