Production-grade blueprint for deploying and operating scalable ML inference workloads on Kubernetes and OpenShift, leveraging NVIDIA Triton Inference Server.
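To give a flavour of what such a blueprint contains, here is a minimal sketch of a Kubernetes Deployment for Triton. The image tag, model repository path, replica count, and resource request are illustrative assumptions, not values from the blueprint itself; the three ports are Triton's defaults for HTTP, gRPC, and Prometheus metrics.

```yaml
# Illustrative sketch only: image tag, model path, and resources are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: triton-inference
spec:
  replicas: 2
  selector:
    matchLabels:
      app: triton-inference
  template:
    metadata:
      labels:
        app: triton-inference
    spec:
      containers:
        - name: triton
          image: nvcr.io/nvidia/tritonserver:24.05-py3
          args: ["tritonserver", "--model-repository=/models"]
          ports:
            - containerPort: 8000  # HTTP
            - containerPort: 8001  # gRPC
            - containerPort: 8002  # Prometheus metrics
          resources:
            limits:
              nvidia.com/gpu: 1
```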
Building reliable AI systems that move from experiments to production.
I design, build, and deploy production-grade ML applications and inference systems, and architect scalable MLOps platforms with a focus on security and operational excellence.
Foundation in building structured systems, maintainable codebases, APIs, and delivery workflows.
Experience in experimentation, modelling, feature design, and turning business problems into measurable ML outcomes.
Current focus on deploying, scaling, monitoring, and governing ML systems in production environments.
Selected work
CI/CD design for training, validation, model registry, deployment, and monitoring.
LLM application with retrieval, filtering, observability, and secure logging controls.
Technical articles
System Design for Beginners
System design basics: what a beginner should know.
Prometheus on OpenShift for Production ML Monitoring
A practical starter guide to monitoring ML workloads with Prometheus on OpenShift, covering metrics collection, inference reliability, and observability basics.
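As a taste of the metrics-collection side, the sketch below renders a counter in the Prometheus text exposition format that a service's /metrics endpoint serves for scraping. It is a stdlib-only illustration; the metric and label names are assumptions, not taken from the article.

```python
# Minimal sketch of the Prometheus text exposition format.
# Metric and label names here are illustrative assumptions.

def render_counter(name: str, help_text: str, samples: dict) -> str:
    """Render one counter family in Prometheus text exposition format.

    `samples` maps a tuple of (label_name, label_value) pairs to a float value.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"

metrics = render_counter(
    "inference_requests_total",
    "Total inference requests received",
    {
        (("model", "resnet50"), ("status", "ok")): 1412.0,
        (("model", "resnet50"), ("status", "error")): 3.0,
    },
)
print(metrics)
```

In practice the official `prometheus_client` library would maintain these counters for you; the point here is just what the scraped payload looks like.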
How I build production AI systems
My focus is not only on building models, but on designing reliable, scalable, and observable AI systems that operate effectively in real-world environments.