Do Audio LLMs Listen or Read? Analyzing and Mitigating Paralinguistic Failures with VoxParadox
Jiacheng Pang*, Ashutosh Chaubey*, Mohammad Soleymani
ICML, 2026
Multimodal LLMs, Audio
Push, Pop, Parallelize: Stack-Augmented Linear Attention via the Delta Rule
Anh T Nguyen, Saleh Momeni, Ashutosh Chaubey, Changnan Xiao, Bing Liu
ICML, 2026
Multimodal LLMs
MoD-DPO: Towards Mitigating Cross-modal Hallucinations in Omni LLMs using Modality Decoupled Preference Optimization
Ashutosh Chaubey, Jiacheng Pang, Mohammad Soleymani
CVPR, 2026
Multimodal LLMs, Computer Vision, Reinforcement Learning, Audio
AVERE: Improving Audiovisual Emotion Reasoning with Preference Optimization
Ashutosh Chaubey, Jiacheng Pang, Maksim Siniukov, Mohammad Soleymani
ICLR, 2026
Multimodal LLMs, Computer Vision, Reinforcement Learning, Audio
LibreFace 2.0: Leveraging Large-Scale Synthetic Data for Fair and Generalizable Facial Analysis
Xulang Guan*, Ashutosh Chaubey*, Maksim Siniukov, Belle Hsieh, Zongjian Li, Mohammad Soleymani
FG, 2026 (Round 1)
Computer Vision, Face Analysis
Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning
Ashutosh Chaubey, Xulang Guan, Mohammad Soleymani
WACV, 2026 (Round 1)
Multimodal LLMs, Computer Vision, Face Analysis
Reasoning Improves Human Alignment in LLM Judgment and Choice
Ala N. Tak, Amin Banayeeanzade, Anahita Bolourani, Fatemeh Bahrani, Ashutosh Chaubey, Sai Praneeth Karimireddy, Norbert Schwarz, Jonathan Gratch
ICLR 2026 Workshop on Representational Alignment (Re^4-Align)
Multimodal LLMs