Ali Taghibakhshi

Senior Deep Learning Algorithm Engineer, NVIDIA
Email: a.t.bakhshi [at] gmail.com

I am a Senior Deep Learning Algorithm Engineer at NVIDIA working on efficient foundation models across language, reasoning, and biology. My current work focuses on improving the efficiency of large language models using model compression, structured pruning, distillation, quantization, hybrid Mamba-Transformer architectures, and elastic multi-budget inference.

I contribute to NVIDIA model families and tooling including Nemotron, NeMo Framework, Megatron-LM, Megatron Bridge, BioNeMo, ModelOpt, TensorRT-LLM, and vLLM. Recent work includes Nemotron Nano 2, Minitron-SSM, Nemotron Elastic, Evo 2, and EDEN.

Prior to joining NVIDIA full-time, I completed my Ph.D. in Mechanical Engineering at the University of Illinois Urbana-Champaign, advised by Matthew West. My Ph.D. work used graph neural networks and reinforcement learning to accelerate scientific computing methods such as algebraic multigrid and domain decomposition solvers.

Publications

[ Google Scholar ]
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs
A. Taghibakhshi, S. T. Sreenivas, S. Muralidharan, R. Cai, M. Chochowski, A. S. Mahabaleshwarkar, Y. Suhara, O. Olabiyi, D. Korzekwa, M. Patwary, M. Shoeybi, J. Kautz, B. Catanzaro, A. Aithal, N. Tajbakhsh, P. Molchanov
ICML 2026. [pdf]
Scaling Laws and Architectural Frontiers in Metagenomic Foundation Models
G. Munsamy, G. Ayres, J. Dona, C. Greco, D. Anderson, S. Sridhar, W. Chow, A. Kollasch, R. Pecoraro, T. Bohnuud, K. Kam, G. Minto-Cowcher, M. Leung, H. Sirelkhatim, J. St. John, A. Taghibakhshi, T. Shimko, J. Wilbur, T. Rvachov, S. Paliwal, et al.
ICML 2026. [paper]
Minitron-SSM: Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
A. Taghibakhshi, S. T. Sreenivas, S. Muralidharan, M. Chochowski, Y. Karnati, R. Joshi, A. S. Mahabaleshwarkar, Z. Chen, Y. Suhara, O. Olabiyi, D. Korzekwa, M. Patwary, M. Shoeybi, J. Kautz, B. Catanzaro, A. Aithal, N. Tajbakhsh, P. Molchanov
NeurIPS 2025. [pdf]
Genome Modelling and Design Across All Domains of Life with Evo 2
G. Brixi, M. G. Durrant, J. Ku, M. Naghipourfar, M. Poli, G. Sun, G. Brockman, D. Chang, A. Fanton, G. A. Gonzalez, S. H. King, D. B. Li, A. T. Merchant, E. Nguyen, C. Ricci-Tam, D. W. Romero, J. C. Schmok, A. Taghibakhshi, A. Vorontsov, B. Yang, et al.
Nature 2026. Core contributor; member of the core Evo 2 team. [paper]
Systems and Algorithms for Convolutional Multi-Hybrid Language Models at Scale
J. Ku, E. Nguyen, D. W. Romero, G. Brixi, B. Yang, A. Vorontsov, A. Taghibakhshi, A. X. Lu, D. P. Burke, G. Brockman, S. Massaroli, C. Ré, P. D. Hsu, B. L. Hie, S. Ermon, M. Poli
arXiv 2025. [pdf]
Designing AI-Programmable Therapeutics with the EDEN Family of Foundation Models
G. Munsamy, G. Ayres, C. Greco, K. Kam, G. Minto-Cowcher, J. St. John, T. Bohnuud, M. H. Bakalar, W. Chow, R. Pecoraro, M. D. T. Torres, A. Kollasch, M. Leung, H. Sirelkhatim, F. Farina, C. McGinnis, S. Sridhar, D. Anderson, F. Oteri, A. Taghibakhshi, et al.
bioRxiv 2026. Joint second author. [pdf]
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment
G. Shen, Z. Wang, O. Delalleau, J. Zeng, Y. Dong, D. Egert, S. Sun, J. Zhang, S. Jain, A. Taghibakhshi, M. Sanz Ausin, A. Aithal, O. Kuchaiev
COLM 2024. [pdf]
Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models
R. Jena, A. Taghibakhshi, S. Jain, G. Shen, N. Tajbakhsh, A. Vahdat
WACV 2025. [pdf]
MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods
A. Taghibakhshi, N. Nytko, T. U. Zaman, S. MacLachlan, L. N. Olson, M. West
ICML 2023. [paper]
Optimized Sparse Matrix Operations for Reverse Mode Automatic Differentiation
N. Nytko, A. Taghibakhshi, T. U. Zaman, S. MacLachlan, L. N. Olson, M. West
SIAM Journal on Scientific Computing 2025. [paper]
Generalizing Lloyd's Algorithm for Graph Clustering
T. Zaman, N. Nytko, A. Taghibakhshi, S. MacLachlan, L. Olson, M. West
SIAM Journal on Scientific Computing 2024. [paper]
Generalizing Reduction-Based Algebraic Multigrid
T. Zaman, N. Nytko, A. Taghibakhshi, S. MacLachlan, L. Olson, M. West
Numerical Linear Algebra with Applications 2024. [paper]
Learning Interface Conditions in Domain Decomposition Solvers
A. Taghibakhshi, N. Nytko, T. U. Zaman, S. MacLachlan, L. Olson, M. West
NeurIPS 2022. [paper]
Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning
A. Taghibakhshi, S. MacLachlan, L. Olson, M. West
NeurIPS 2021. [paper]

Technical Reports

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
NVIDIA, 2026. [pdf | webpage]
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
NVIDIA, 2026. [pdf]
NVIDIA Nemotron 3: Efficient and Open Intelligence
NVIDIA, 2025. [pdf]
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
NVIDIA, 2025. [pdf]
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
NVIDIA, 2025. [pdf]
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models
NVIDIA, 2025. [pdf | webpage]

Selected Experience

NVIDIA, Senior Deep Learning Algorithm Engineer
Aug. 2023 - Present. Core Nemotron compression work across Nemotron Nano 2, Minitron-SSM, and Nemotron Elastic; NeMo and Megatron distributed-training stack contributions; Evo 2 and EDEN generative Bio-ML.
NVIDIA, Deep Learning Algorithms Intern
May 2022 - Aug. 2022. Hierarchical graph neural network with cross-attention for billion-edge cross-device user matching, improving on the state of the art by 5%.
John Deere, Machine Learning Intern
May 2020 - May 2022. Reinforcement-learning and computer-vision methods for autonomous mower docking, parking assist, planting, and scene reconstruction.

Mentorship

NVIDIA

  • Rohit Jena (University of Pennsylvania)
  • Aditya Vavre (University of Texas at Austin)

Education

University of Illinois Urbana-Champaign
Ph.D., Mechanical Engineering, 2023. Advisor: Matthew West.

Sharif University of Technology
B.Sc., Mechanical Engineering, 2019.