I am a Senior Deep Learning Algorithm Engineer at NVIDIA working on efficient foundation models across language, reasoning, and biology. My current work focuses on improving the efficiency of large language models using model compression, structured pruning, distillation, quantization, hybrid Mamba-Transformer architectures, and elastic multi-budget inference.
I contribute to NVIDIA model families and tooling including Nemotron, NeMo Framework, Megatron-LM, Megatron Bridge, BioNeMo, ModelOpt, TensorRT-LLM, and vLLM. Recent work includes Nemotron Nano 2, Minitron-SSM, Nemotron Elastic, Evo 2, and EDEN.
Prior to joining NVIDIA full-time, I completed my Ph.D. in Mechanical Engineering at the University of Illinois Urbana-Champaign, advised by Matthew West. My Ph.D. work used graph neural networks and reinforcement learning to accelerate scientific computing methods such as algebraic multigrid and domain decomposition solvers.
Publications
[ Google Scholar ]
Nemotron Elastic: Towards Efficient Many-in-One Reasoning LLMs A. Taghibakhshi, S. T. Sreenivas, S. Muralidharan, R. Cai, M. Chochowski, A. S. Mahabaleshwarkar, Y. Suhara, O. Olabiyi, D. Korzekwa, M. Patwary, M. Shoeybi, J. Kautz, B. Catanzaro, A. Aithal, N. Tajbakhsh, P. Molchanov ICML 2026. [pdf]
Scaling Laws and Architectural Frontiers in Metagenomic Foundation Models G. Munsamy, G. Ayres, J. Dona, C. Greco, D. Anderson, S. Sridhar, W. Chow, A. Kollasch, R. Pecoraro, T. Bohnuud, K. Kam, G. Minto-Cowcher, M. Leung, H. Sirelkhatim, J. St. John, A. Taghibakhshi, T. Shimko, J. Wilbur, T. Rvachov, S. Paliwal, et al. ICML 2026. [paper]
Minitron-SSM: Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning A. Taghibakhshi, S. T. Sreenivas, S. Muralidharan, M. Chochowski, Y. Karnati, R. Joshi, A. S. Mahabaleshwarkar, Z. Chen, Y. Suhara, O. Olabiyi, D. Korzekwa, M. Patwary, M. Shoeybi, J. Kautz, B. Catanzaro, A. Aithal, N. Tajbakhsh, P. Molchanov NeurIPS 2025. [pdf]
Genome Modelling and Design Across All Domains of Life with Evo 2 G. Brixi, M. G. Durrant, J. Ku, M. Naghipourfar, M. Poli, G. Sun, G. Brockman, D. Chang, A. Fanton, G. A. Gonzalez, S. H. King, D. B. Li, A. T. Merchant, E. Nguyen, C. Ricci-Tam, D. W. Romero, J. C. Schmok, A. Taghibakhshi, A. Vorontsov, B. Yang, et al. Nature 2026. Part of the Core Evo 2 team; core contributor. [paper]
Systems and Algorithms for Convolutional Multi-Hybrid Language Models at Scale J. Ku, E. Nguyen, D. W. Romero, G. Brixi, B. Yang, A. Vorontsov, A. Taghibakhshi, A. X. Lu, D. P. Burke, G. Brockman, S. Massaroli, C. Ré, P. D. Hsu, B. L. Hie, S. Ermon, M. Poli arXiv 2025. [pdf]
Designing AI-Programmable Therapeutics with the EDEN Family of Foundation Models G. Munsamy, G. Ayres, C. Greco, K. Kam, G. Minto-Cowcher, J. St. John, T. Bohnuud, M. H. Bakalar, W. Chow, R. Pecoraro, M. D. T. Torres, A. Kollasch, M. Leung, H. Sirelkhatim, F. Farina, C. McGinnis, S. Sridhar, D. Anderson, F. Oteri, A. Taghibakhshi, et al. bioRxiv 2026. Joint second author. [pdf]
NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment G. Shen, Z. Wang, O. Delalleau, J. Zeng, Y. Dong, D. Egert, S. Sun, J. Zhang, S. Jain, A. Taghibakhshi, M. Sanz Ausin, A. Aithal, O. Kuchaiev COLM 2024. [pdf]
Elucidating Optimal Reward-Diversity Tradeoffs in Text-to-Image Diffusion Models R. Jena, A. Taghibakhshi, S. Jain, G. Shen, N. Tajbakhsh, A. Vahdat WACV 2025. [pdf]
MG-GNN: Multigrid Graph Neural Networks for Learning Multilevel Domain Decomposition Methods A. Taghibakhshi, N. Nytko, T. U. Zaman, S. MacLachlan, L. N. Olson, M. West ICML 2023. [paper]
Optimized Sparse Matrix Operations for Reverse Mode Automatic Differentiation N. Nytko, A. Taghibakhshi, T. U. Zaman, S. MacLachlan, L. N. Olson, M. West SIAM Journal on Scientific Computing 2025. [paper]
Generalizing Lloyd's Algorithm for Graph Clustering T. Zaman, N. Nytko, A. Taghibakhshi, S. MacLachlan, L. Olson, M. West SIAM Journal on Scientific Computing 2024. [paper]
Generalizing Reduction-Based Algebraic Multigrid T. Zaman, N. Nytko, A. Taghibakhshi, S. MacLachlan, L. Olson, M. West Numerical Linear Algebra with Applications 2024. [paper]
Learning Interface Conditions in Domain Decomposition Solvers A. Taghibakhshi, N. Nytko, T. U. Zaman, S. MacLachlan, L. Olson, M. West NeurIPS 2022. [paper]
Optimization-Based Algebraic Multigrid Coarsening Using Reinforcement Learning A. Taghibakhshi, S. MacLachlan, L. Olson, M. West NeurIPS 2021. [paper]
Technical Reports
Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning NVIDIA, 2026. [pdf | webpage]
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning NVIDIA, 2026. [pdf]
NVIDIA Nemotron 3: Efficient and Open Intelligence NVIDIA, 2025. [pdf]
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning NVIDIA, 2025. [pdf]
NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model NVIDIA, 2025. [pdf]
Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models NVIDIA, 2025. [pdf | webpage]
Selected Experience
NVIDIA, Senior Deep Learning Algorithm Engineer Aug. 2023 - Present. Core Nemotron compression work across Nemotron Nano 2, Minitron-SSM, and Nemotron Elastic; NeMo and Megatron distributed-training stack contributions; Evo 2 and EDEN generative Bio-ML.
NVIDIA, Deep Learning Algorithms Intern May 2022 - Aug. 2022. Hierarchical graph neural network with cross-attention for billion-edge cross-device user matching, improving state of the art by 5%.
John Deere, Machine Learning Intern May 2020 - May 2022. Reinforcement-learning and computer-vision methods for autonomous mower docking, parking assist, planting, and scene reconstruction.
Mentorship
NVIDIA
- Rohit Jena (University of Pennsylvania)
- Aditya Vavre (University of Texas at Austin)
Education
University of Illinois Urbana-Champaign
Ph.D., Mechanical Engineering, 2023. Advisor: Matthew West.
Sharif University of Technology
B.Sc., Mechanical Engineering, 2019.