Brief Bio

I am Mayank Singh, an Assistant Professor in the Department of Computer Science and Engineering at IIT Gandhinagar. I lead the Lingo Research Group, focusing on Indic Large Language Models (LLMs), Multimodal AI, Scholarly AI, Interpretable and Explainable AI, and Code-Mixed NLP.

Our group contributed Hindi and Gujarati corpora to BLOOM (BigScience 2022), spearheaded the Ganga LLM initiative — India’s first from-scratch small LLMs for edge deployment (30k+ downloads) — and now co-leads data curation and evaluation for the ₹175+ Cr EKA Project (India’s largest 120B+ parameter LLM). We have released major datasets and tools such as Triveni, SangrahATox, UnityAI-Guard, and Eka-Eval (35+ benchmarks).

Beyond LLMs, we have developed widely used tools and datasets for scholarly AI (OCR++, TabLeX, LineEX, NLPExplorer, CL Scholar) and are recognized for pioneering code-mixed NLP (PHINC, HINGE, MUTANT, CoMI-LINGUA) and organizing global shared tasks (SemEval, WMT MixMT).

I have led large-scale research programs (MoE Centre of Excellence on AI for Sustainable Cities, FIST grant) and co-founded the IndoML Symposium and India’s first GenAI Summer School. I have mentored 100+ BTech students, many of whom have co-authored A*/A papers and joined top PhD programs or research labs worldwide.


Internships and Project Opportunities

Internships: I am always keen to work with motivated students and offer both remote and on-campus internships. However, on-campus internships for external students are offered only through IITGN’s Summer Research Internship Program (SRIP).

Project Opportunities: Please check Sponsored Projects for open positions and follow our group on X and LinkedIn for updates.


The best way to contact me is by email at singh.mayank@iitgn.ac.in.