Janani SV
AI Engineer · MLOps · ML Systems
Software Engineer → Data Scientist → AI Engineer

Building reliable AI systems that move from experiments to production.

I design, build, and deploy production-grade ML applications and inference systems, and architect scalable MLOps platforms with a focus on security and operational excellence.

8+
Production AI initiatives
CI/CD
For model and platform delivery
Focus
Inference, observability, MLOps
Featured Focus
End-to-End AI Engineering Stack
Active
Data
Ingestion, ETL pipelines, batch and streaming data flows, validation
Features
Feature pipelines, reusable transformations, offline and online consistency
Model Lifecycle
Training workflows, experiment tracking, model registry, versioning
Pipelines
GitLab CI/CD, orchestration, validation, promotion and rollback
Inference
Triton, APIs, model delivery, scaling, serving patterns
Evaluation
Offline metrics, validation checks, A/B testing and shadow deployments
Operations
Metrics, drift detection, logging, observability and reliability
Governance
Secure delivery, access control, auditability and compliance
Platform
Kubernetes, OpenShift, Helm, Terraform and runtime operations
Software Engineer - Fullstack

Foundation in building structured systems, maintainable codebases, APIs, and delivery workflows.

Data Scientist

Experience in experimentation, modelling, feature design, and turning business problems into measurable ML outcomes.

AI Engineer with MLOps

Current focus on deploying, scaling, monitoring, and governing ML systems in production environments.

Projects

Selected work

NVIDIA Triton Inference Platform

Production-grade blueprint for deploying and operating scalable ML inference workloads on Kubernetes and OpenShift with NVIDIA Triton Inference Server.

Kubernetes · OpenShift · Helm · Observability
View case study →
Enterprise MLOps Pipeline

CI/CD design for training, validation, model registry, deployment, and monitoring.

GitLab · MLflow · SageMaker · Terraform
View case study →
Secure RAG Application

LLM application with retrieval, filtering, observability, and secure logging controls.

Python · Vector DB · Guardrails · Monitoring
View case study →
Writing

Technical articles

View all →
System Design

System Design for Beginners

The system design fundamentals I wish someone explained to me when I was starting out — no fluff, just the mental models that actually matter.

Read article →
Prometheus

Prometheus on OpenShift for Production ML Monitoring

The monitoring setup I couldn't find a good guide for — Prometheus + Thanos on OpenShift for ML inference workloads. Metrics collection, reliability tracking, and observability from scratch.

Read article →
MLOps

MLOps Architecture

How I approach production ML inference on Kubernetes — model storage decisions, OCI-baked deployments, zero-downtime rollouts, and the Helm chart patterns that make life easier for app teams.

Read article →
MLOps

Model Drift — A Complete Guide

Everything I've learned about model drift after watching production models quietly degrade — detection methods, the stats behind them, and knowing when to retrain vs. when to wait.

Read article →
AI

AI Agents — What They Are and How to Use Them

What I learned after a year of building with AI agents — how they actually differ from chatbots, the patterns that work, and where I use them in my own workflow.

Read article →
MLOps

How Big Companies Do Drift Monitoring

I dug through engineering blogs from Uber, Netflix, LinkedIn, Airbnb, Meta, and Google to figure out how they actually handle drift. Six companies, six architectures — all sourced.

Read article →
LLM

LLM Fundamentals — Deep Dive

My working notes on how LLMs actually work — from attention mechanics and tokenization to RLHF, LoRA, quantisation, and scaling laws. Written while trying to explain these things precisely.

Read article →
Architecture Highlights

How I build production AI systems

I focus not only on building models but also on designing reliable, scalable, and observable AI systems that hold up in real-world environments.

End-to-end MLOps pipelines
Scalable inference with Triton
Model versioning & OCI artifacts
Monitoring, drift & observability