About
Hi, I am Samyadeep Basu, a third-year CS PhD student at UMD, College Park (2022 - Present). I work with Soheil Feizi in the Center for Machine Learning. My research focuses on reliable deep learning: I build methods to understand models through the lens of data, and to control both generative and discriminative deep models using lightweight tweaks (aka model editing) or lightweight fine-tuning. Previously, I received my MS from UMD in 2020 and then spent close to two years at Microsoft AI in the ML rotation program. During my stint at Microsoft AI, I worked with the Language Science Team at Azure AI and MSAI, where I researched, developed, and deployed large-scale language models for various scenarios.
My latest CV can be accessed at: CV.
News
(September 2024): Two papers on mechanistic interpretability for vision models and MLLMs accepted to NeurIPS 2024! Two papers on VLM compositionality and VLM prompt tuning accepted to EMNLP 2024 (Main)!
(August 2024): Finished an exciting internship at Adobe. Also excited to have passed my PhD preliminary examination!
(May 2024): Submitted multiple papers to NeurIPS on: (i) Multimodal VQA models, (ii) compositionality in t2i models, (iii) internal mechanistic understanding of ViTs and (iv) a new framework for copyright attribution in t2i models!
(April 2024): Excited to be rejoining Adobe Research for a summer internship to work on small language models! Paper on diffusion model editing accepted to ICML 2024!
(Jan 2024 - May 2024): Started an internship at Microsoft Research working on interpretability for multimodal language models!
(January 2024): Paper on diffusion model interpretability accepted to ICLR 2024!
(December 2023): Paper on few-shot finetuning for vision models accepted to AAAI 2024!
(August-October 2023): Paper on surgical fine-tuning for “small” language models accepted to EMNLP 2023!
(July 2023): Excited to announce two new projects: (i) improving CLIP using knowledge distillation from diffusion models; (ii) a benchmark for text-guided image editing methods!
(June 2023): Started internship at Adobe Research! Working on interpretability + model editing for text-to-image generative models!
(Jan 2023): Paper on algorithm to design difficult few-shot tasks accepted at ICLR 2023!
(September 2022): Finished an amazing research internship at Microsoft Research working with Daniela Massiceti on few-shot learning!
(Feb 2022): Started my PhD to work on few-shot learning and model interpretability!
(October 2020 - Jan 2022): Break from grad school to work at Microsoft AI as an Applied Scientist!
(Feb 2021): Papers on influence functions at ICLR 2021 and ICML 2020!
Recent Preprints
Decomposing and Interpreting Image Representations via Text in ViTs Beyond CLIP
NeurIPS 2024
We interpret different components of vision transformers using text!
Short version appearing at ICML 2024 mechanistic interpretability workshop as a Spotlight talk!
Rethinking Artistic Copyright Infringements in the Era of Text-to-Image Generative Models
EvalEval @ NeurIPS 2024 (Oral)
A new framework for copyright infringement detection from text-to-image models!
Understanding Information Storage and Transfer in Multimodal Language Models
NeurIPS 2024
Causality-based approaches for model interpretability and editing methods, extended to multimodal language models (e.g., LLaVa)!
Also appearing at the ECCV 2024 KGM Workshop as an Oral Presentation!
Understanding and Mitigating Compositionality Issues in Text-to-Image Models
We find two sources of error which lead to erroneous compositionality in text-to-image models, and mitigate them with a strong baseline!
A Survey of Small Language Models
A comprehensive survey on how to create small language models! A collaboration across multiple universities and research labs!
Selected Publications
On Mechanistic Knowledge Localization in Text-to-Image Models
ICML 2024 Project Poster Code Website
We investigate the role of cross-attention layers in text-to-image models for knowledge storage!
Localizing and Editing Knowledge in Text-to-Image Models
ICLR 2024 Project Poster Code Website
We propose an interpretability framework + model editing method to ablate concepts from text-to-image models, fast!
Distilling Knowledge From Text-to-Image Models Improves Visio-Linguistic Reasoning in CLIP
EMNLP 2024 (Main)
We propose a knowledge-distillation technique to improve reasoning abilities in CLIP!
IntCoOp: Interpretability Aware Vision-Language Prompt Tuning
EMNLP 2024 (Main) Code
We propose a new prompt tuning method for VLMs that uses inductive biases from interpretable image-level attributes!
EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods
We propose a new comprehensive benchmark for evaluating diffusion-based editing methods!
On Surgical Finetuning for Language Encoders
EMNLP 2023 Code
A method to surgically finetune language encoders with a subset of layers, performing close to full finetuning!
Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning
AAAI 2024 Code
We propose two easy-to-implement strong baselines for PEFT which lead to SoTA on MD!
Hard Meta-Dataset++: Towards Understanding Few-shot Performance on Difficult Tasks
ICLR 2023 Code / Media Coverage
We propose a fast algorithm, FastDiffSel, which extracts difficult few-shot tasks from large vision datasets in a computationally efficient way!
Strategies to Improve Few-Shot Learning for Intent Classification and Slot Filling
NAACL 2022 (SUKI)
We propose empirical strategies to improve few-shot performance for joint intent classification and slot filling!
ICLR 2021
End-to-end analysis of influence functions in deep learning!
On Second-Order Group Influence Functions for Black-Box Predictions
ICML 2020
We propose second-order group influence functions, which are better suited to handle group effects!