AI Engineer — Agents, Post-Training, Inference
Excited about building agents that can plan, reason, and act across complex multi-step tasks; post-training techniques that unlock new capabilities; and making inference faster at scale.
Currently: MS in Applied Machine Learning at University of Maryland, College Park
Previously: AI Engineer at Atrium (client: Pfizer) · 3 years building ML systems at Tezo
Worked at Atrium with Pfizer's AI team to automate how statisticians write analysis plans. Built a RAG pipeline that cut drafting time by 60%, and an LLM-as-a-Judge system that catches hallucinations with 80% precision.
Built a RAG-powered chatbot that let employees search across 10,000+ internal documents. Reduced document lookup time by 60%. Also trained fraud detection models that improved recall by 15%.
Designed and implemented a modular AI agent framework in Python from scratch, with no dependency on LangChain, CrewAI, or any existing agent library. It features a think-act reasoning loop that lets an LLM autonomously chain tool calls across multiple steps to solve complex tasks.
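The core think-act loop can be sketched in a few lines. This is a minimal illustration, not the framework's actual API: the tool registry, the stubbed model, and all names here are hypothetical stand-ins for an LLM that emits tool calls.

```python
# Minimal think-act loop sketch (hypothetical names, not the real framework).
# A stand-in "model" decides between calling a tool and answering.

# Tool registry: name -> callable. Illustrative tools only.
TOOLS = {
    "add": lambda a, b: a + b,
    "multiply": lambda a, b: a * b,
}

def stub_model(history):
    """Stand-in for an LLM: first requests a tool call, then answers."""
    tool_steps = [s for s in history if s["role"] == "tool"]
    if not tool_steps:
        return {"action": "add", "args": {"a": 2, "b": 3}}
    return {"final_answer": f"The result is {tool_steps[-1]['content']}"}

def run_agent(task, model, max_steps=5):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        decision = model(history)          # think: choose the next action
        if "final_answer" in decision:
            return decision["final_answer"]
        tool = TOOLS[decision["action"]]   # act: execute the chosen tool
        result = tool(**decision["args"])
        history.append({"role": "tool", "content": result})
    return "max steps reached"

print(run_agent("what is 2 + 3?", stub_model))  # → The result is 5
```

The loop terminates either when the model returns a final answer or when the step budget runs out, which is what lets it chain an arbitrary number of tool calls.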
Implemented the complete post-training pipeline to turn a base LLM into a reasoning model. Covers inference-time scaling, self-refinement, and reinforcement learning with verifiable rewards (GRPO). No TRL, no alignment libraries. Inspired by DeepSeek-R1.
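The key idea GRPO adds over PPO-style RLHF is that it needs no learned critic: advantages are computed relative to a group of sampled completions for the same prompt. A minimal sketch of that group-relative advantage, assuming a verifiable 0/1 reward from an answer checker:

```python
# Group-relative advantage as used in GRPO (sketch, illustrative numbers).
# For one prompt, sample a group of completions, score each with a
# verifiable reward, then normalize within the group:
#   A_i = (r_i - mean(r)) / (std(r) + eps)
import statistics

def group_advantages(rewards, eps=1e-6):
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Example: 4 sampled answers to one math problem, graded correct/incorrect.
rewards = [1.0, 0.0, 0.0, 1.0]
advs = group_advantages(rewards)
# Correct samples get positive advantage, incorrect ones negative,
# so the policy gradient pushes probability toward verified answers.
```

These advantages then weight the token log-probabilities in the clipped policy-gradient objective, with a KL penalty against the reference model.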
A deep dive into how LLMs reason over long-horizon tasks, the mechanics behind context length, and why smart agents with surgical retrieval beat brute-force long context windows.
Build intuition for LLM inference from first principles: GPU architecture, the roofline model, memory estimation, and latency.
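One memory estimate worth having at your fingertips is KV-cache size, which often dominates serving memory at long context. A back-of-envelope version, with an illustrative (roughly Llama-2-7B-like) model shape:

```python
# Back-of-envelope KV-cache size (model shape below is illustrative).
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch,
                   dtype_bytes=2):
    # factor of 2 = one K tensor and one V tensor per layer
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

gb = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128,
                    seq_len=4096, batch=8, dtype_bytes=2) / 1e9
# ≈ 17.2 GB of KV cache for 8 concurrent 4k-token sequences in fp16
```

Numbers like this make it obvious why techniques such as grouped-query attention (fewer KV heads) and paged KV memory matter so much for throughput.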
A deep dive into self-hosting LLMs using vLLM on RunPod. Covers PagedAttention, continuous batching, and cost analysis.
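The throughput win from continuous batching over static batching can be shown with a toy scheduler. This is an illustration of the scheduling idea only, not vLLM's implementation; the per-request decode lengths are made up:

```python
# Toy comparison: continuous vs. static batching (illustrative numbers).
# Continuous batching admits a waiting request the moment a slot frees up;
# static batching waits for the whole batch to finish first.
from collections import deque

def continuous_batching_steps(request_lengths, max_batch=2):
    """Total decode steps to finish all requests with immediate slot reuse."""
    waiting = deque(request_lengths)
    running = []                      # remaining steps per in-flight request
    steps = 0
    while waiting or running:
        while waiting and len(running) < max_batch:
            running.append(waiting.popleft())   # admit into a free slot
        steps += 1                              # one decode step for the batch
        running = [r - 1 for r in running if r > 1]
    return steps

def static_batching_steps(request_lengths, max_batch=2):
    """Each batch runs until its longest member finishes."""
    return sum(max(request_lengths[i:i + max_batch])
               for i in range(0, len(request_lengths), max_batch))

reqs = [8, 2, 2, 2]   # decode steps each request needs
# continuous: 8 steps; static: 10 steps, because the short requests
# can't start until the long one in their batch is done
```

Pair this with PagedAttention, which makes that slot reuse cheap by allocating KV memory in fixed-size blocks instead of contiguous per-sequence buffers.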