VeRA makes LoRA ~10x more parameter efficient while retaining the same performance.
It's somewhat like a recursive LoRA scheme: the LoRA A and B matrices are frozen, randomly initialized, and shared across layers, and only two small scaling vectors per layer are trained.
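
To make the idea concrete, here is a minimal pure-Python sketch of one adapted layer. It is not the authors' implementation. It assumes the update has the form b * (B @ (d * (A @ x))), with A and B frozen random matrices shared by every layer and d, b the only trainable vectors. The class name `VeRALayerSketch`, the helper names, the init scale of the random matrices, and the starting value `d_init=0.1` are all illustrative choices.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def random_matrix(rows, cols, rng):
    """Frozen random matrix; the 1/sqrt(cols) scale is an arbitrary choice."""
    scale = 1.0 / cols ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class VeRALayerSketch:
    """Hypothetical adapter: delta(x) = b * (B @ (d * (A @ x)))."""

    def __init__(self, shared_A, shared_B, d_init=0.1):
        self.A = shared_A  # shape (r, d_in), frozen, shared across layers
        self.B = shared_B  # shape (d_out, r), frozen, shared across layers
        self.d = [d_init] * len(shared_A)  # trainable, length r
        self.b = [0.0] * len(shared_B)     # trainable, length d_out; zero so delta starts at 0

    def delta(self, x):
        h = matvec(self.A, x)                        # project down to rank r
        h = [d_i * h_i for d_i, h_i in zip(self.d, h)]  # scale by d
        y = matvec(self.B, h)                        # project up to d_out
        return [b_i * y_i for b_i, y_i in zip(self.b, y)]  # scale by b

    def num_trainable(self):
        return len(self.d) + len(self.b)


def lora_trainable(d_in, d_out, r):
    """Trainable parameters of a plain LoRA adapter of rank r."""
    return r * (d_in + d_out)


rng = random.Random(0)
d_in, d_out, r = 8, 6, 4
A = random_matrix(r, d_in, rng)
B = random_matrix(d_out, r, rng)
layer = VeRALayerSketch(A, B)

print(layer.delta([1.0] * d_in))         # all zeros, because b starts at zero
print(layer.num_trainable())             # r + d_out trainable values
print(lora_trainable(d_in, d_out, r))    # r * (d_in + d_out) for plain LoRA
```

In this toy setup the layer trains r + d_out = 10 values versus r * (d_in + d_out) = 56 for LoRA at the same rank. The actual savings depend on the layer sizes and ranks chosen, so treat the numbers as illustrative only.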
Is the code available anywhere?