AI Agent Infrastructure / Evaluation / Observability
I build the infrastructure that makes AI agents reliable, observable, and enterprise-ready.
Staff Software Engineer with Apple-scale platform experience, founder roots, and a current focus on the reliability stack for agents, humanoids, and space-grade autonomy.
Role
Staff Software Engineer
AI agent infrastructure
Systems
Evaluation + observability
Reliable enterprise agents
Scale
Apple-scale platforms
Media, ads, commerce web systems
Founder
300+ builds / 100M+ txns
Founder and CTO roots
Current build surface
Agent Reliability Stack
The stack behind my thesis: build agents with modern frameworks, then make them inspectable, governed, and measurable.
Operating loop
Working stack
Runtime
Agent interface
Trust layer
Infra + evals
Data plane
AI systems thesis
The future belongs to autonomous systems people can inspect, trust, and improve.
My work sits at the intersection of agents, humanoids, and space: three domains where autonomy is only valuable when it is observable, governed, and resilient.
RK-01 / RELIABILITY CORE
Autonomy becomes useful when it can be inspected.
Ideas orbit a central operating thesis instead of sitting in equal boxes.
Agents
Designing harnesses, memory, context engineering, orchestration, and runtime supervision for long-horizon autonomous work.
Humanoids
Thinking about the reliability layer for embodied systems: tool policy, world-state memory, fleet telemetry, and human override.
Space
Studying mission-grade autonomy patterns where delayed feedback, resilience, and auditability become existential system requirements.
The arc
Founder roots, Apple-scale platforms, then the post-AI world.
My journey connects founder speed, platform-scale engineering, and the technical pattern that matters now: making autonomous work reliable at scale.
ARC-01 / 2008 - 2012
YEH Technologies, instantPay
Founder roots
Built 300+ websites and enterprise apps, then architected fintech infrastructure processing 100M+ transactions per year.
ARC-02 / 2012 - 2017
Genpact, Flipkart
Platform builder
Moved from mobile and digital transformation into high-scale commerce systems used by thousands of operators and millions of customers.
ARC-03 / 2017 - 2023
Apple Ads, Media Products
Apple scale
Led web systems across Search Ads, News Ads, Apple Music, Apple TV, Podcasts, and Books.
ARC-04 / 2023 - now
AI Agent Infrastructure
Post-AI world
Architecting evaluation, observability, guardrails, memory, and supervision patterns for production AI agents.
Profile highlights
Real proof points behind the AI systems thesis.
These proof points come from founder execution, fintech scale, enterprise systems, Flipkart commerce, Apple platforms, AI infrastructure, and mentorship.
FOUNDER ROOTS / YEH TECHNOLOGIES
Built 300+ websites and enterprise apps for global clients.
entrepreneurship
The early chapter was hands-on founder work: selling, designing, shipping, and learning how real customers judge software.
FINTECH SCALE / INSTANTPAY
Architected infrastructure processing 100M+ annual transactions.
CTO experience
Led the technology foundation for a multi-service fintech platform where reliability, throughput, and operator trust mattered every day.
ENTERPRISE + COMMERCE / GENPACT, FLIPKART
Built mobile, healthcare, and ecommerce systems used by thousands.
platform builder
Moved from enterprise transformation into Flipkart-scale commerce, including operator workflows and customer-facing post-order experiences.
APPLE SCALE / ADS AND MEDIA PRODUCTS
Led web systems across Apple Ads, Music, TV, Podcasts, Books, and editorial launches.
Staff engineer trajectory
This chapter proves product judgment at global scale: high-visibility consumer experiences, business tools, and cross-functional platform delivery.
AI INFRASTRUCTURE / NOW
Focused on agent observability, evaluation, guardrails, and enterprise-ready autonomy.
post-AI world
The current chapter connects everything before it: systems thinking, product taste, scale, and the reliability stack for AI agents.
MENTORSHIP / HUMAN IMPACT
Mentored 40+ engineers and professionals through the AI shift.
impact
The mission goes beyond technical credibility: it is helping younger technologists see a bigger path for themselves.
What I build
A selected portfolio of systems thinking.
The work is framed around transferable architecture patterns: traces, evals, supervision, memory, policy, and high-scale product engineering.
Flagship case study
Agent Observability & Evaluation
A practical walkthrough of how to make agents measurable: traces, task success, hallucination checks, tool precision, and human escalation.
Architecture thesis
Deep Agent Runtime Patterns
Planner-executor-reviewer loops, durable memory, tool sandboxes, policy admission, and replayable execution graphs.
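As a rough illustration of how a planner, executor, and reviewer can share one policy-gated, replayable log, here is a minimal sketch. Every name in it (Step, RunLog, ALLOWED_TOOLS, plan, review, run_agent) is hypothetical, the planner is a fixed stand-in, and the "tools" are plain functions; it is not a description of any particular framework.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names throughout; this is an illustrative sketch only.

@dataclass
class Step:
    tool: str
    arg: str

@dataclass
class RunLog:
    """Append-only event list that can be replayed or inspected later."""
    events: list = field(default_factory=list)

    def record(self, kind: str, **data) -> None:
        self.events.append({"kind": kind, **data})

# Policy admission: only tools on this allowlist may run (assumed policy).
ALLOWED_TOOLS = {"search", "summarize"}

# Stand-in tools; a real runtime would sandbox these.
TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"results for {q}",
    "summarize": lambda q: f"summary of {q}",
}

def plan(goal: str) -> list[Step]:
    # Stand-in planner: returns a fixed plan that includes one disallowed tool.
    return [Step("search", goal), Step("delete_files", "/tmp"), Step("summarize", goal)]

def review(output: str) -> bool:
    # Stand-in reviewer: approves any non-empty output.
    return bool(output.strip())

def run_agent(goal: str) -> RunLog:
    log = RunLog()
    for step in plan(goal):
        if step.tool not in ALLOWED_TOOLS:
            # Denied steps are logged, not silently dropped.
            log.record("denied", tool=step.tool)
            continue
        output = TOOLS[step.tool](step.arg)
        log.record("executed", tool=step.tool, output=output, approved=review(output))
    return log

log = run_agent("agent evals")
for event in log.events:
    print(event)
```

The design choice worth noting is that denials are recorded in the same log as executions, so a replay shows what the agent tried to do as well as what it did.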
Pre-AI credibility
Apple-Scale Web Platforms
The product-scale foundation behind the AI chapter: media, ads, commerce, platform systems, and operator workflows.
Mission control
Reliable autonomy needs a control surface.
Agent fleets, guardrails, eval loops, human escalation, and future humanoid or space systems all need the same operating grammar: know what happened, why it happened, and when a human should intervene.
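One way to read that operating grammar in code is a trace event that carries the "what" and the "why", plus a rule for the "when". The sketch below is illustrative: TraceEvent, needs_human, and the 0.7 confidence threshold are assumptions for the example, not a prescribed policy.

```python
from dataclasses import dataclass

# Hypothetical schema and threshold, for illustration only.

@dataclass
class TraceEvent:
    action: str         # what happened
    reason: str         # why the agent says it acted
    confidence: float   # agent-reported confidence in [0, 1]
    irreversible: bool  # whether the action can be undone

def needs_human(event: TraceEvent, min_confidence: float = 0.7) -> bool:
    """Escalate when the action cannot be undone or confidence is low."""
    return event.irreversible or event.confidence < min_confidence

routine = TraceEvent("draft reply", "ticket matched a known FAQ", 0.92, False)
risky = TraceEvent("issue refund", "customer reported a double charge", 0.95, True)
unsure = TraceEvent("close ticket", "no response for 7 days", 0.40, False)

for event in (routine, risky, unsure):
    print(event.action, "->", "escalate" if needs_human(event) else "proceed")
```

A real control surface would likely combine more signals, but even this small rule makes the intervention criterion explicit and testable.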
RK-MISSION-CONTROL / HUD KIT
Autonomy Operations Interface
MISSION QUEUE
AGENT FLEET
Reliable agent swarm
Long-horizon agents with trace replay, tool policy, evaluation loops, and escalation channels.
LIVE TELEMETRY
DIAGNOSTICS
GUARDRAIL MATRIX
Speaking
Conference-ready talks for the agent infrastructure era.
These topics are designed for AI agent conferences, labs, startup summits, universities, and engineering leadership communities.
TALK-01
Production AI Is Nothing Like Demo AI
AI agent conferences, engineering leadership summits, startup operator events
A field guide for agent reliability, observability, escalation, and executive trust.
TALK-02
The New Staff Engineer: Architect, Evaluator, Operator
Labs, engineering orgs, universities, founder communities
How senior technologists create leverage when AI writes more of the code.
TALK-03
From India Founder to Apple Staff Engineer
Universities, developer communities, early-career technologists
A practical reinvention story for young technologists building ambitious careers.
Thought leadership
Essays on agents, autonomy, and human ambition.
The writing hub gives founders, builders, labs, and conference organizers a clear view into how I think about the future.
ESSAY-01
Why Agent Observability Is the Bottleneck for Enterprise AI
essay direction
The move from demos to deployed agents depends on traces, evals, cost visibility, and supervision.
ESSAY-02
The Evaluation Problem for Long-Horizon Agents
essay direction
Task success, tool precision, hallucination rate, and execution quality need to become first-class metrics.
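To make those metrics concrete, here is a small sketch that aggregates them over a batch of runs. The definitions are assumptions chosen for the example: task success is the fraction of runs that succeeded, tool precision is correct tool calls over all tool calls, and hallucination rate is unsupported claims over all claims. RunRecord and summarize are hypothetical names, and the sample numbers are made up.

```python
from dataclasses import dataclass

# Hypothetical record shape and metric definitions, for illustration only.

@dataclass
class RunRecord:
    succeeded: bool
    tool_calls: int
    correct_tool_calls: int
    claims: int
    unsupported_claims: int

def summarize(runs: list[RunRecord]) -> dict[str, float]:
    total_calls = sum(r.tool_calls for r in runs)
    total_claims = sum(r.claims for r in runs)
    return {
        # Fraction of runs that completed the task.
        "task_success": sum(r.succeeded for r in runs) / len(runs) if runs else 0.0,
        # Correct tool calls over all tool calls, pooled across runs.
        "tool_precision": sum(r.correct_tool_calls for r in runs) / total_calls if total_calls else 0.0,
        # Unsupported claims over all claims, pooled across runs.
        "hallucination_rate": sum(r.unsupported_claims for r in runs) / total_claims if total_claims else 0.0,
    }

# Made-up sample data.
runs = [
    RunRecord(succeeded=True, tool_calls=4, correct_tool_calls=3, claims=10, unsupported_claims=1),
    RunRecord(succeeded=False, tool_calls=4, correct_tool_calls=3, claims=10, unsupported_claims=3),
]
metrics = summarize(runs)
print(metrics)
```

Pooling counts across runs, rather than averaging per-run ratios, keeps long runs from being underweighted; either choice is defensible as long as it is stated.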
ESSAY-03
Agents, Humanoids, and Space: The Coming Autonomy Stack
essay direction
A long-term view of the reliability systems that will connect digital, embodied, and mission-grade autonomy.
Impact
Signals that speak to executives, labs, and the next generation.
This is the credibility layer: shipped scale, Staff-level judgment, founder range, and mentorship.
14+ years building software at scale
Staff software engineer focused on AI agent infrastructure
300+ products built as founder and CTO
100M+ annual transactions in early fintech infrastructure
40+ technologists mentored through the AI shift
OPEN TO / SENIOR AI LEADERSHIP, LABS, AND SPEAKING
Bring me into the room where reliable autonomy is being shaped.
Best fit: senior AI infrastructure leadership, AI agent conference speaking, founder and lab advisory, and mentorship for ambitious young technologists.