Selected research and applied AI projects. Click any card to explore the full case study.
Advanced Retrieval-Augmented Generation system with multi-stage retrieval, re-ranking, and query decomposition for enterprise conversational AI.
Enterprise conversational AI systems struggle with hallucination, outdated knowledge, and an inability to reason over large proprietary corpora in real time.
Core architecture powering the Amazon Q Chat Engine, serving millions of enterprise users with accurate, grounded conversational AI.
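A minimal sketch of the retrieve-then-re-rank pattern behind this card, with toy scoring functions standing in for the production dense retriever and cross-encoder; the corpus, function names, and scores are illustrative, not the Amazon Q internals.

```python
# Illustrative two-stage retrieval: a cheap recall pass over the whole corpus,
# then a more expensive re-ranking pass over the shortlist. All names are
# hypothetical stand-ins for production components.
from collections import Counter
from typing import List, Tuple

CORPUS = [
    "Q Chat grounds answers in enterprise documents.",
    "Re-ranking orders candidates by relevance to the query.",
    "Query decomposition splits a complex question into sub-queries.",
]

def lexical_score(query: str, doc: str) -> float:
    """First-stage recall: term-overlap score (stand-in for BM25 / dense retrieval)."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return float(sum((q & d).values()))

def rerank_score(query: str, doc: str) -> float:
    """Second-stage precision: a toy 'cross-encoder' that also rewards concise passages."""
    return lexical_score(query, doc) + 1.0 / (1 + len(doc.split()))

def retrieve(query: str, k_recall: int = 3, k_final: int = 2) -> List[Tuple[float, str]]:
    # Stage 1: cheap scoring over every document, keep the top candidates.
    candidates = sorted(CORPUS, key=lambda d: lexical_score(query, d), reverse=True)[:k_recall]
    # Stage 2: re-rank only the small candidate set.
    reranked = sorted(((rerank_score(query, d), d) for d in candidates), reverse=True)
    return reranked[:k_final]

if __name__ == "__main__":
    for score, doc in retrieve("how does re-ranking order candidates"):
        print(f"{score:.2f}  {doc}")
```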
Multi-step agentic reasoning framework with structured function calling, tree-of-thought planning, and constrained decoding for enterprise tool orchestration.
Enterprise LLMs must decompose complex user requests into sequences of API calls, database queries, and tool invocations — maintaining coherence across multi-hop reasoning chains while handling errors gracefully and respecting authorization boundaries.
Core orchestration layer for enterprise agentic AI, enabling complex multi-tool workflows with reliable structured outputs across diverse enterprise knowledge sources and APIs.
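A minimal sketch of schema-validated function calling with an authorization check, assuming a hypothetical `search_tickets` tool; planning and constrained decoding in the real orchestrator are out of scope here.

```python
# Illustrative tool-dispatch step: the model's (mocked) output is a JSON
# function call that is validated against a registered schema and a role-based
# authorization check before execution. Tool names and the auth model are
# hypothetical.
import json

TOOLS = {
    "search_tickets": {
        "required": {"query"},
        "allowed_roles": {"support", "admin"},
        "fn": lambda query: f"2 tickets matched '{query}'",
    },
}

def execute_call(raw_model_output: str, caller_role: str) -> str:
    try:
        call = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return "error: output was not valid JSON"

    spec = TOOLS.get(call.get("tool"))
    if spec is None:
        return f"error: unknown tool {call.get('tool')!r}"
    missing = spec["required"] - set(call.get("arguments", {}))
    if missing:
        return f"error: missing arguments {sorted(missing)}"
    if caller_role not in spec["allowed_roles"]:
        return "error: caller not authorized for this tool"
    return spec["fn"](**call["arguments"])

if __name__ == "__main__":
    model_output = '{"tool": "search_tickets", "arguments": {"query": "VPN outage"}}'
    print(execute_call(model_output, caller_role="support"))
```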
End-to-end alignment pipeline combining DPO/KTO preference optimization, constitutional AI guardrails, automated red-teaming, and PII-aware decoding for enterprise LLM deployment.
Deploying LLMs in enterprise environments demands rigorous alignment: balancing helpfulness with safety, preventing PII leakage, mitigating hallucination, and ensuring compliance with corporate policy, where a single violation can have serious consequences.
Safety and alignment infrastructure for trusted enterprise LLM deployment, ensuring compliant and reliable AI interactions at scale across regulated industries.
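For reference, the DPO objective on a single preference pair fits in a few lines; the log-probabilities below are made-up numbers, and the KTO, guardrail, and red-teaming components are not shown.

```python
# A minimal sketch of the DPO loss on one preference pair, using summed token
# log-probabilities that would normally come from the policy and the frozen
# reference model. Inputs are illustrative values.
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """-log sigmoid(beta * [(pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)])"""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# The policy prefers the chosen response slightly more than the reference does,
# so the loss sits just below log(2).
print(round(dpo_loss(-12.0, -15.0, -12.5, -14.8), 4))
```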
Production LLM serving stack combining speculative decoding, KV-cache compression, quantization-aware fine-tuning, and continuous batching for enterprise-scale inference.
Serving large language models at enterprise scale demands sub-second latency, high throughput, and cost efficiency — while preserving output quality across diverse workloads with millions of concurrent users.
Powers high-throughput, low-latency LLM serving infrastructure for millions of concurrent enterprise users, reducing inference cost while maintaining quality guarantees.
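A simplified sketch of the speculative-decoding accept/reject rule, with toy draft and target distributions standing in for real models; the fallback sampling below is simplified relative to the full residual-distribution correction.

```python
# Simplified speculative decoding: a cheap draft model proposes a block of
# tokens, and each proposal is accepted with probability
# min(1, p_target / p_draft); on the first rejection we fall back to the
# target distribution (simplified). The "models" are fixed toy distributions.
import random

random.seed(0)

def draft_dist(_ctx):   # cheap proposal distribution (toy)
    return {"the": 0.4, "cat": 0.3, "sat": 0.2, "mat": 0.1}

def target_dist(_ctx):  # expensive target distribution (toy)
    return {"the": 0.25, "cat": 0.25, "sat": 0.25, "mat": 0.25}

def sample(dist):
    return random.choices(list(dist), weights=list(dist.values()), k=1)[0]

def speculative_step(ctx, block_size=4):
    """Propose block_size draft tokens; keep an accepted prefix, then patch one token."""
    accepted = []
    for _ in range(block_size):
        q, p = draft_dist(ctx + accepted), target_dist(ctx + accepted)
        tok = sample(q)
        if random.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)          # target agrees often enough: keep the draft token
        else:
            accepted.append(sample(p))    # simplified fallback: resample from the target
            break
    return accepted

print(speculative_step(["the"]))
```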
State-of-the-art semantic segmentation for AR/VR eye tracking using Swin Vision Transformers, achieving 0.96 mIoU with on-device deployment.
AR/VR devices require pixel-precise eye region segmentation at real-time speeds on power-constrained hardware, with robustness to extreme lighting and motion.
Deployed in Meta Reality Labs AR/VR pipeline, enabling precise gaze tracking and foveated rendering for next-generation headsets.
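For context, mIoU is the metric quoted above; a minimal computation on made-up labels (the class names are illustrative, not the deployed model's label set):

```python
# Mean intersection-over-union for a small multi-class segmentation example.
def mean_iou(pred, target, num_classes):
    """Average per-class IoU, skipping classes absent from both masks."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# 0 = background, 1 = sclera, 2 = iris, 3 = pupil (labels are illustrative)
pred   = [0, 0, 1, 1, 2, 2, 3, 3]
target = [0, 0, 1, 2, 2, 2, 3, 3]
print(round(mean_iou(pred, target, num_classes=4), 3))
```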
Cross-modal knowledge transfer framework using CLIP and Vision Transformers for zero-shot Visual Question Answering and Image Captioning.
Bridging vision and language modalities for VQA and captioning requires massive paired datasets. Enabling zero-shot transfer to new domains without task-specific fine-tuning remains an open problem.
Framework adopted across Meta product surfaces for visual understanding tasks, reducing annotation cost and enabling rapid domain expansion.
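A schematic of CLIP-style zero-shot classification: embed the image and a set of text prompts into a shared space and rank prompts by cosine similarity. The vectors below are hand-made stand-ins for the real encoders, not any Meta-internal model.

```python
# Toy zero-shot scoring: cosine similarity between one image embedding and
# several text-prompt embeddings; the highest-scoring prompt is the label.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

image_embedding = [0.9, 0.1, 0.2]          # pretend output of the image encoder
text_embeddings = {                         # pretend outputs of the text encoder
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a cat": [0.1, 0.9, 0.3],
    "a photo of a car": [0.2, 0.1, 0.9],
}

scores = {caption: cosine(image_embedding, emb) for caption, emb in text_embeddings.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))        # zero-shot label = highest-similarity prompt
```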
Cross-lingual language understanding using self-supervised transformers for underrepresented languages with minimal labeled data.
Most NLP advances concentrate on high-resource languages. Extending LLM capabilities to hundreds of low-resource languages requires novel transfer learning and data augmentation strategies.
Enables language understanding for underrepresented populations, supporting equitable AI deployment across global markets.
AI-driven pathology analysis system for tumor detection and localization using hierarchical CNNs and self-supervised learning on whole-slide images.
Manual pathology review is slow and error-prone. Whole-slide images are gigapixel-scale, requiring architectures that handle extreme resolution while maintaining fine-grained localization.
Accelerates pathology workflows and provides decision support for clinicians, reducing diagnostic turnaround time.
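A sketch of the standard tiling step used to make gigapixel whole-slide images tractable, so each patch can be scored independently and aggregated for slide-level localization; the dimensions and patch size are arbitrary examples, not the production pipeline's settings.

```python
# Enumerate patch coordinates over a whole-slide image on a regular grid.
from typing import Iterator, Tuple

def tile_coordinates(width: int, height: int, patch: int = 512,
                     stride: int = 512) -> Iterator[Tuple[int, int]]:
    """Yield top-left (x, y) corners of patches covering the slide."""
    for y in range(0, height - patch + 1, stride):
        for x in range(0, width - patch + 1, stride):
            yield x, y

# A 100k x 80k pixel slide at 512x512 patches -> roughly 30k patches to score.
coords = list(tile_coordinates(100_000, 80_000))
print(len(coords), coords[:3])
```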
Adversarial AI and causal inference framework for unbiased disease prediction, combining GANs with Double ML for robust diagnostic models.
Clinical prediction models inherit biases from training data — demographic, socioeconomic, and selection biases — leading to disparate outcomes across patient populations.
Advances equitable healthcare AI by ensuring diagnostic models perform fairly across all patient populations.
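A bare-bones illustration of the Double ML partialling-out idea on synthetic data, with ordinary least squares standing in for the flexible nuisance models and the GAN component not shown.

```python
# Residualize the outcome and the treatment on the covariates, then regress
# residual on residual to recover the treatment effect. Data is synthetic with
# a known effect of 2.0.
import numpy as np

rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=(n, 3))                                          # covariates
d = x @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)              # treatment depends on x
y = 2.0 * d + x @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)    # true effect = 2.0

def residualize(target, covariates):
    """Remove the part of `target` explained (linearly) by `covariates`."""
    coef, *_ = np.linalg.lstsq(covariates, target, rcond=None)
    return target - covariates @ coef

y_res, d_res = residualize(y, x), residualize(d, x)
effect = (d_res @ y_res) / (d_res @ d_res)      # residual-on-residual slope
print(round(float(effect), 3))                  # close to the true effect, 2.0
```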
Privacy-preserving synthetic medical data using diffusion models and convolutional GANs with formal differential privacy guarantees.
Healthcare AI research is bottlenecked by data access — patient privacy regulations (HIPAA) prevent sharing real medical records, limiting model development and reproducibility.
Unlocks healthcare AI research by providing shareable, privacy-safe synthetic datasets — cited 140+ times and adopted by research groups globally.
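A toy version of the per-example clipping and Gaussian-noise step behind differentially private training (DP-SGD style); calibrating the noise to a formal (epsilon, delta) budget, and the diffusion/GAN generators themselves, are omitted.

```python
# Clip each example's gradient to a fixed L2 norm, sum, add Gaussian noise,
# then average. Gradients here are made-up vectors.
import math
import random

random.seed(0)

def clip(grad, max_norm=1.0):
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def private_mean_gradient(per_example_grads, max_norm=1.0, sigma=0.8):
    """Clipped sum plus Gaussian noise, divided by the batch size."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    noisy = [s + random.gauss(0.0, sigma * max_norm) for s in summed]
    return [v / len(per_example_grads) for v in noisy]

grads = [[0.3, -2.1], [1.5, 0.4], [-0.2, 0.9]]
print(private_mean_gradient(grads))
```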
Production recommender system using Transformer-XL and meta-learning for temporal-aware personalization at billion-user scale.
User preferences evolve over time and new users lack interaction history. Traditional collaborative filtering fails to capture temporal dynamics and suffers from cold-start problems.
Serving billions of daily predictions in Meta's recommendation surfaces, directly impacting user engagement and content discovery.
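One simple way to illustrate the temporal-awareness idea: recency-weighted scoring over a user's interaction history. The half-life and item names are invented, and this is not the production Transformer-XL ranker.

```python
# Exponential-decay scoring: recent interactions contribute more to an item's
# score, so the ranking tracks drifting preferences.
import math
import time
from collections import defaultdict

HALF_LIFE_DAYS = 7.0

def temporal_scores(interactions, now=None):
    """interactions: list of (item_id, unix_timestamp); recent events weigh more."""
    now = now or time.time()
    scores = defaultdict(float)
    for item, ts in interactions:
        age_days = (now - ts) / 86_400
        scores[item] += 0.5 ** (age_days / HALF_LIFE_DAYS)   # halves every HALF_LIFE_DAYS
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

now = time.time()
history = [("reel_a", now - 1 * 86_400), ("reel_b", now - 30 * 86_400),
           ("reel_a", now - 2 * 86_400)]
print(temporal_scores(history, now=now))
```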