Hi, I’m Akhil. I’m an AI Engineer and MS student in Applied Machine Learning at the University of Maryland. Before UMD, I was an AI Engineer at Atrium on a Pfizer R&D project, where I co-authored a peer-reviewed publication in Clinical Trials, and I spent three years as an ML Engineer at Tezo in India.
My work focuses on LLM inference: custom Triton kernels, paged KV-cache, speculative decoding, and distributed orchestration with NVIDIA Dynamo.
MS Applied Machine Learning
News
- Triton kernels: from vector add through FlashAttention-2 on H100 (16.9× speedup @ 8K context). Builds · Repo
- coding-agent: a LangGraph CLI agent with repo-scoped tools and self-hosted vLLM inference. Builds · Repo
- mini-vllm: paged KV + continuous batching for Qwen2.5-7B on H100 (2.6× throughput, 3.5× capacity, 160× lower max TTFT). Builds · Repo
- GPU Fundamentals & LLM Inference: 15k+ views on LinkedIn. Post
- Co-authored a peer-reviewed paper in Clinical Trials (SAGE) with Pfizer R&D. Paper
Pages
- Blog: Inference, training, and building AI systems from scratch.
- Builds: coding-agent, mini-vllm, Triton kernels, agent framework.
- Experience: Atrium (Pfizer) and Tezo, 3+ years in production ML.