Feature request
The current implementation of VeRA cannot adapt layers of different shapes, because the basis matrices A and B are shared across all adapted layers and must have compatible shapes. One solution would be to create the largest required A and B matrices and slice them accordingly for each adapted layer.
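A rough sketch of the slicing idea, to make the proposal concrete. It uses NumPy as a stand-in for the actual PyTorch code, and the names `make_shared_bases` and `vera_delta` are hypothetical, not existing PEFT functions. It assumes A has shape (r, in_features), B has shape (out_features, r), and each layer keeps its own scaling vectors `lambda_d` (length r) and `lambda_b` (length out_features):

```python
import numpy as np


def make_shared_bases(layer_shapes, r, seed=0):
    """Create one pair of shared bases sized for the largest adapted layer.

    layer_shapes: list of (in_features, out_features) tuples (hypothetical input).
    """
    max_in = max(in_f for in_f, _ in layer_shapes)
    max_out = max(out_f for _, out_f in layer_shapes)
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((r, max_in))   # shared, frozen
    B = rng.standard_normal((max_out, r))  # shared, frozen
    return A, B


def vera_delta(x, A, B, lambda_d, lambda_b):
    """Low-rank update for one layer, using slices of the shared bases."""
    in_features = x.shape[-1]
    out_features = lambda_b.shape[0]
    A_s = A[:, :in_features]    # (r, in_features)
    B_s = B[:out_features, :]   # (out_features, r)
    hidden = (x @ A_s.T) * lambda_d       # (batch, r)
    return (hidden @ B_s.T) * lambda_b    # (batch, out_features)


# Two layers with different shapes share the same A and B.
r = 4
layer_shapes = [(16, 32), (64, 8)]
A, B = make_shared_bases(layer_shapes, r)

outputs = []
for in_f, out_f in layer_shapes:
    x = np.ones((2, in_f))
    lambda_d = np.full(r, 0.1)   # per-layer trainable vector
    lambda_b = np.zeros(out_f)   # per-layer trainable vector, zero-initialized here
    outputs.append(vera_delta(x, A, B, lambda_d, lambda_b))
```

Slicing from the top-left corner means every layer sees a consistent sub-block of the same random matrices, so only one A and one B need to be stored regardless of how many distinct layer shapes are adapted.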
Motivation
This constraint significantly limits where VeRA can be applied: a model whose target layers differ in shape cannot have all of those layers adapted at once.
Your contribution
I'll open a PR for this.