Chinmay Karkar

researcher · pre-training & post-training of language models

currently at Microsoft Research India · previously Lossfunk & Athena Agents (RL)

about

I work on the science and craft of training language models — what makes them learn, what makes them stable over long horizons, and how to tell whether they’re actually getting better.

At Microsoft Research India I’m focused on reinforcement learning for LLMs and agentic verifier modules for multi-step reasoning evaluation. Before that, I worked on probabilistic forecasting at Lossfunk, model merging & RL post-training at Athena Agents, and a fine-tuning / inference pipeline at Kotoba Research.

selected work

2026

Microsoft Research India — Research Intern

Reinforcement learning for LLMs with attention to long-horizon training dynamics and stability. Agentic verifier modules for robust multi-step reasoning evaluation.

2025

Lossfunk — Research Intern

First-author paper on LLM forecasting behavior, accepted at the AIR-FM Workshop, AAAI 2026. Evaluation pipelines for probabilistic forecasting: calibration, Brier score, ECE, and pass@k. arXiv:2511.18394 ↗

2025

Athena Agents — Research Intern

Built Aryabhatta 1.0, a domain-adapted LLM for JEE Main math: 90.2% accuracy, +35.5 pp over baseline. Combined model merging (SLERP, TIES) with GRPO / REINFORCE post-training.

2024

Swades AI — Applied AI Intern

Computer vision pipelines for segmentation and keypoint estimation; document-extraction chunking; containerized deployment that cut per-page processing from 2 min to 3 s.

2024

Kotoba Research — Research Intern

Fine-tuning and inference pipeline; QAT and LoRA benchmarking across standardized evaluations.

projects

elsewhere