Chinmay Savadikar

Ph.D. Student, Department of ECE

North Carolina State University

csavadi [at] ncsu [dot] edu

I am a third-year Ph.D. student at North Carolina State University, advised by Dr. Tianfu Wu. I am interested in continual learning, fine-tuning, and dynamic models.

Prior to starting my Ph.D., I worked with the Precision Sustainable Agriculture initiative at NC State, building computer vision and software solutions for problems in agriculture.

Before returning to academia, I worked as a Machine Learning Engineer at Persistent Systems, where I focused on deep learning for medical imaging and large-scale document recognition. I also spent time developing internal SDKs for the data science team and setting up MLOps frameworks.

Outside of research, I enjoy reading, playing football, listening to oldies rock music, and binge-watching TV shows.

Publications

WeGeFT: Weight‑Generative Fine‑Tuning for Multi‑Faceted Efficient Adaptation of Large Models
ICML 2025
Chinmay Savadikar, Xi Song, Tianfu Wu
pdf web code

While LoRA inherently balances parameter, compute, and memory efficiency, many subsequent variants trade off compute and memory efficiency and/or performance to further reduce fine-tuning parameters. WeGeFT addresses this limitation: it unifies PEFT and ReFT and reduces the parameter count while maintaining performance by generating the fine-tuning weights directly from the pretrained weights. WeGeFT employs a simple low-rank formulation consisting of two linear layers, either shared across multiple layers of the pretrained model or learned individually for different layers.
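
As a rough, unofficial sketch of the idea: assuming the generated update takes the low-rank form delta_W = W A B (my notation, with W the frozen pretrained weight and A, B the two learned linear maps; not the paper's exact formulation), a single adapted linear layer might look like this.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WeGeFTLinear(nn.Module):
        """Illustrative sketch only, not the official implementation.

        The fine-tuning residual is generated from the frozen pretrained
        weight W via two low-rank linear maps A and B:
            delta_W = W @ A @ B, with A: (in, r) and B: (r, in).
        A and B could be shared across layers or learned per layer.
        """
        def __init__(self, pretrained: nn.Linear, rank: int = 8):
            super().__init__()
            in_features = pretrained.weight.shape[1]
            # Frozen pretrained weight W of shape (out, in).
            self.weight = nn.Parameter(pretrained.weight.detach(), requires_grad=False)
            self.bias = pretrained.bias
            # Zero-init A so the generated delta starts at zero (LoRA-style).
            self.A = nn.Parameter(torch.zeros(in_features, rank))
            self.B = nn.Parameter(torch.randn(rank, in_features) * 0.01)

        def forward(self, x):
            # Generate the fine-tuning weights directly from the pretrained weights.
            delta_w = self.weight @ self.A @ self.B
            return F.linear(x, self.weight + delta_w, self.bias)

Only A and B are trained here; which projections to wrap and whether A/B are shared are design choices the sketch leaves open.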

Continual Learning via Learning a Continual Memory in Vision Transformer
Preprint
Chinmay Savadikar, Michelle Dai, Tianfu Wu
pdf

We present a method for adding lightweight, learnable parameters to Vision Transformers while leveraging their parameter-heavy but stable components. We show that the final linear projection layer in the multi-head self-attention (MHSA) block can serve as this lightweight module within a Mixture-of-Experts framework. While most prior methods that address this problem either add learnable parameters at every layer or heuristically choose where to do so, we use Neural Architecture Search to make this choice automatically. Building on SPOS Neural Architecture Search, we propose a task-similarity-oriented sampling strategy that replaces uniform sampling and achieves better performance and efficiency.
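
A rough illustration of the core module (not the paper's code): the sketch below replaces the MHSA output projection with a small set of expert linear layers mixed by a learned router. The expert count and routing scheme here are assumptions, and the paper chooses which transformer blocks receive such a module via SPOS-style architecture search rather than by hand.

    import torch
    import torch.nn as nn

    class MoEProjection(nn.Module):
        """Sketch: Mixture-of-Experts over the MHSA output projection.

        Each expert is a linear layer; a simple token-wise router mixes
        their outputs. Shown in isolation for illustration only.
        """
        def __init__(self, dim: int, num_experts: int = 4):
            super().__init__()
            self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
            self.router = nn.Linear(dim, num_experts)

        def forward(self, x):                       # x: (batch, tokens, dim)
            gates = self.router(x).softmax(dim=-1)  # (batch, tokens, num_experts)
            outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (b, t, dim, E)
            return (outs * gates.unsqueeze(-2)).sum(dim=-1)           # weighted mix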

Website inspirations: Tejas Gokhale and Gowthami Somepalli.