Ph.D. Student, Department of ECE
North Carolina State University
csavadi [at] ncsu [dot] edu
I am a third-year Ph.D. student at North Carolina State University, advised by Dr. Tianfu Wu. I am interested in continual learning, fine-tuning, and dynamic models.
Prior to starting my Ph.D., I worked with the Precision Sustainable Agriculture initiative at NC State, building computer vision and software solutions for problems in agriculture.
Before returning to academia, I worked as a Machine Learning Engineer at Persistent Systems, where I applied deep learning to medical imaging and large-scale document recognition. I also spent some time developing internal SDKs for the data science team and setting up MLOps frameworks.
Outside of research, I enjoy reading, playing football, listening to oldies rock music, and binge-watching TV shows.
While LoRA inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. WeGeFT addresses this limitation by generating the fine-tuning weights directly from the pretrained weights, unifying PEFT and ReFT and reducing the parameter count while maintaining performance. It employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or learned individually for different layers.
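As a rough illustration of the weight-generative idea, the sketch below wraps a frozen linear layer and produces its fine-tuning update as a low-rank function of the pretrained weight itself. The module name, rank, and initialization are assumptions for exposition, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeGeFTLinear(nn.Module):
    """Hypothetical sketch: a frozen pretrained linear layer whose fine-tuning
    update is generated from the pretrained weight by two low-rank maps (A, B)."""

    def __init__(self, pretrained: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = pretrained
        self.base.weight.requires_grad_(False)   # pretrained weights stay frozen
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)

        d_in = pretrained.in_features
        # Two learnable linear maps; in the shared variant these would be
        # reused across multiple layers of the pretrained model.
        self.A = nn.Parameter(torch.randn(d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, d_in))  # zero init: start exactly at the pretrained model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Generate the update from the pretrained weight W itself:
        #   delta_W = W @ A @ B  (a low-rank function of W)
        delta_w = self.base.weight @ self.A @ self.B
        return self.base(x) + F.linear(x, delta_w)

# Usage: wrap any linear layer of the pretrained model.
layer = WeGeFTLinear(nn.Linear(768, 768), rank=8)
out = layer(torch.randn(4, 768))
```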
We present a method to add lightweight, learnable parameters to Vision Transformers while leveraging their parameter-heavy but stable components. We show that the final linear projection layer in the multi-head self-attention (MHSA) block can serve as this lightweight module within a Mixture of Experts framework. While most prior methods that address this problem introduce learnable parameters at every layer, or choose where to do so heuristically, we determine this automatically with Neural Architecture Search. Specifically, we use Single-Path One-Shot (SPOS) NAS and propose a task-similarity-oriented sampling strategy that replaces uniform sampling, achieving better performance and efficiency.
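A toy sketch of the sampling idea, assuming a two-choice search space per transformer block (insert a lightweight expert or keep the block frozen); the similarity scores, sigmoid mapping, and temperature are illustrative, not the paper's exact formulation.

```python
import torch

def sample_path(task_similarity: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """task_similarity: [num_layers] scores of how close the new task is to the
    features of each pretrained block (made-up quantity for this sketch).
    Returns a binary mask: 1 = insert a lightweight expert at that layer."""
    # Dissimilar layers are sampled more often than similar ones, replacing
    # the uniform sampling used in standard SPOS supernet training.
    p_insert = torch.sigmoid(-task_similarity / temperature)
    return torch.bernoulli(p_insert)

# One SPOS-style training step samples a single path through the supernet:
similarity = torch.rand(12)        # e.g., 12 ViT blocks, fake similarity scores
path = sample_path(similarity)     # which blocks get learnable parameters this step
```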
Website inspirations: Tejas Gokhale and Gowthami Somepalli.