AI Engineer — Agents, Large-Scale Training, Inference
Excited about building agents that can plan, reason, and act across complex multi-step tasks, large-scale training techniques that unlock new capabilities, and making inference faster at scale.
Currently: MS in Applied Machine Learning at University of Maryland, College Park
Previously: AI Engineer at Atrium (client: Pfizer) · 3 years building ML systems at Tezo
Co-authored a peer-reviewed publication in Clinical Trials (SAGE). Automated SAP generation for Pfizer, cutting drafting time by 60%. Built an LLM-as-a-Judge system with 82% precision for hallucination detection.
Built a RAG-powered chatbot that let employees search across 1,000+ internal documents. Reduced document lookup time by 61%. Also trained fraud detection models that improved recall by 12%.
Designed and implemented a modular AI agent framework in Python, without relying on LangChain, CrewAI, or any existing agent library. It features a think-act reasoning loop that lets LLMs autonomously chain tool calls across multiple steps to solve complex tasks.
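A minimal sketch of that think-act loop, with the LLM call stubbed out so the example runs standalone. `call_llm`, the tool registry, and the message format here are illustrative assumptions, not the framework's real API:

```python
# Think-act loop sketch. `call_llm` is a stand-in for a real model call;
# here it returns a canned plan so the example executes end to end.
import json

TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

def call_llm(history):
    # Stub: a real implementation would send `history` to a model and
    # parse its reply. This canned plan chains two tool calls, then stops.
    plan = [
        {"thought": "Add the numbers first.", "action": "add", "args": [2, 3]},
        {"thought": "Multiply the sum by 10.", "action": "multiply", "args": [5, 10]},
        {"thought": "Done.", "action": "finish", "args": [50]},
    ]
    # Pick the next step based on how many tool results we have so far.
    return plan[sum(1 for m in history if m["role"] == "tool")]

def run_agent(task, max_steps=8):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = call_llm(history)                        # think
        if step["action"] == "finish":
            return step["args"][0]
        result = TOOLS[step["action"]](*step["args"])   # act
        history.append({"role": "tool",
                        "content": json.dumps({"action": step["action"],
                                               "result": result})})  # observe
    raise RuntimeError("step budget exhausted")

print(run_agent("compute (2 + 3) * 10"))  # 50
```

The loop's only contract is the message list: the model sees every prior observation, so each tool result can inform the next action.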
Commercial inference engines like vLLM and TGI are powerful but opaque. Built a full inference engine from scratch to understand every layer of the serving stack — from kernel-level attention to request scheduling.
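One scheduling idea that makes engines like vLLM fast is continuous batching: new requests join the running batch between decode steps instead of waiting for the whole batch to drain. A toy sketch, with the forward pass stubbed to one token per request per step (all names are illustrative):

```python
# Toy continuous-batching scheduler: admit waiting requests whenever a
# batch slot frees up, rather than only at batch boundaries.
from collections import deque

def decode_step(req):
    # Stand-in for one forward pass emitting one token for this request.
    req["generated"] += 1
    return req["generated"] >= req["max_tokens"]  # True when finished

def serve(requests, max_batch=4):
    waiting = deque(requests)
    running, finished = [], []
    while waiting or running:
        # The "continuous" part: top up the batch before every step.
        while waiting and len(running) < max_batch:
            running.append(waiting.popleft())
        for req in list(running):
            if decode_step(req):
                running.remove(req)
                finished.append(req["id"])
    return finished

reqs = [{"id": i, "generated": 0, "max_tokens": t}
        for i, t in enumerate([2, 1, 3])]
print(serve(reqs, max_batch=2))  # [1, 0, 2]
```

Short requests exit early and their slots are reused immediately, which is why throughput improves over static batching.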
A deep dive into how LLMs reason over long-horizon tasks, the mechanics behind context length, and why smart agents with surgical retrieval beat brute-force long context windows.
Build intuition for LLM inference from first principles: GPU architecture, the roofline model, memory estimation, and latency.
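The memory-estimation part boils down to arithmetic like the following back-of-the-envelope KV cache calculation (the 7B-class shapes and fp16 dtype are illustrative assumptions):

```python
# KV cache size = 2 (K and V) x layers x kv_heads x head_dim
#                 x sequence length x batch x bytes per element.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Illustrative 7B-class shapes: 32 layers, 32 KV heads, head_dim 128,
# one 4096-token sequence in fp16.
gib = kv_cache_bytes(32, 32, 128, 4096, 1) / 2**30
print(gib)  # 2.0 GiB
```

Numbers like this make it obvious why long contexts and large batches are memory-bound long before they are compute-bound.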
A deep dive into self-hosting LLMs using vLLM on RunPod. Covers PagedAttention, continuous batching, and cost analysis.
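The core idea behind PagedAttention can be sketched as a tiny block allocator: KV cache lives in fixed-size physical blocks, and each sequence keeps a page-table-like mapping from logical positions to blocks. A toy sketch of the bookkeeping, not vLLM's actual API:

```python
# Toy PagedAttention-style allocator: sequences grow one token at a time
# and claim a new physical block only when they cross a block boundary,
# exactly like demand paging in virtual memory.
class BlockAllocator:
    def __init__(self, num_blocks, block_size):
        self.block_size = block_size
        self.free = list(range(num_blocks))
        self.tables = {}  # seq_id -> list of physical block ids

    def append_token(self, seq_id, pos):
        table = self.tables.setdefault(seq_id, [])
        if pos % self.block_size == 0:       # crossed a block boundary
            table.append(self.free.pop())    # claim a free physical block
        return table[pos // self.block_size]

    def free_seq(self, seq_id):
        # Finished sequences return their blocks for immediate reuse.
        self.free.extend(self.tables.pop(seq_id, []))

alloc = BlockAllocator(num_blocks=4, block_size=2)
for pos in range(4):                 # a 4-token sequence uses 2 blocks
    alloc.append_token(seq_id=0, pos=pos)
print(len(alloc.free))               # 2 blocks still free
alloc.free_seq(0)
print(len(alloc.free))               # 4 — all blocks reclaimed
```

Because no sequence pre-reserves its maximum length, fragmentation stays near zero and far more concurrent sequences fit in the same VRAM.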