Vikas Kumar Singh

AI Systems Architect & Engineering Leader

Bangalore, India  ·  Actively seeking senior AI roles in the UAE & globally  ·  Open to relocation

Business Impact at a Glance

$2B Projected drug supply waste avoidance (10-yr pharma programme)
$50M+ Monthly CAPEX decisions informed
28 Engineers led across cross-org programmes
22× Query latency improvement (45s → <2s)
65% LLM inference cost reduction
$1M+ Annual system overhead eliminated

Profile

AI Systems Architect specialising in enterprise AI platform design, probabilistic forecasting, and GenAI system delivery for regulated, high-stakes industries - pharma supply chain, large-scale capital allocation, and commercial operations at Fortune 500 scale.

Built and shipped end-to-end production systems where errors have irreversible consequences: Bayesian MCMC inference for $2B drug supply planning, geospatial ML informing $50M+ monthly CAPEX decisions, and multi-agent GenAI platforms serving enterprise leadership at sub-100ms latency. Delivered all of this in a forward-deployed, customer-facing capacity - owning architecture, stakeholder relationships, and business outcomes end-to-end, not just the engineering.

Led cross-functional engineering programmes of 4–28 engineers across Fortune 500 engagements at ZS Associates and Tredence Analytics - working directly with CXO and VP-level customer leadership to shape technical direction, then delivering the systems that backed those decisions.

Core Expertise

GenAI & LLM Systems

  • Multi-Agent Orchestration
  • RAG Architecture & Retrieval Optimisation
  • LLM-as-Judge Evaluation Frameworks
  • Token Budgeting & Cost Governance
  • Prompt Engineering & Output Validation

Data Platforms

  • Databricks Lakehouse (Medallion Architecture)
  • Unity Catalog & Delta Lake
  • PySpark - billion-row pipelines
  • AWS Glue, Redshift
  • Enterprise Data Modelling & OLAP

Engineering & MLOps

  • Async FastAPI (concurrency, batching, pooling)
  • Docker, Kubernetes
  • MLflow - model lifecycle & governance
  • CI/CD & Automated Compliance Validation
  • Regulated Systems: HIPAA / GDPR / Solvency II

Selected Case Studies

Production systems designed under real-world scale, governance, and cost constraints.

$100K/yr saved  ·  40 FTE automated  ·  1,000+ concurrent users  ·  65% inference cost reduction

Multi-Agent GenAI Analytics Orchestration

Role: Principal AI Architect

Orchestrated the delivery of a production-grade GenAI platform for executive analytics at a large-scale eCommerce client. Engineered a custom async routing layer to bypass framework overhead, automated 60 weekly reports to replace 40 FTE of manual analysis, and slashed LLM inference costs by 65% through semantic caching and dynamic routing.

$50M+/month CAPEX informed  ·  70% faster approvals  ·  22× latency improvement  ·  15% accuracy lift vs 3rd-party

Geospatial ML Site Selection & Forecasting

Role: Lead AI Architect

Standardised new-store site selection for a $10B+ retailer by designing a Databricks Lakehouse-based geospatial ML engine. Replaced intuition-driven real estate decisions with a defensible forecasting system that shortened approval cycles from 3 weeks to 5 days, informing over $50M in monthly capital allocation.

Professional Experience

Tredence Analytics - Data Science Manager & Principal AI Architect

Bangalore, India  ·  Nov 2024 – Present

  • Designed and owned a multi-modal analytics platform (FastAPI + Databricks) for a large eCommerce client: architected for 1,000+ concurrent users at <100ms latency; replaced 40 FTE of manual reporting with agentic workflow orchestration (60 reports/week automated). Economic impact: $100K/year in operational overhead eliminated.
  • Implemented cost-governance layer (token budgeting, semantic caching, dynamic LLM routing): cut inference costs 65% ($8 → $2.50/request) while maintaining output quality; scaled architecture to 10K concurrent users.
  • Owned 1TB+ geospatial Lakehouse: migrated from schema-on-read Hive to star-schema Delta + Unity Catalog; designed dynamic partitioning for 200M geospatial features. Query latency: 45s → <2s (22× improvement). Site-selection approval cycles: 3 weeks → 5 days (70% acceleration). Direct business impact: $50M+ CAPEX allocation decisions informed monthly.
  • Led cloud migration from legacy on-prem ETL to AWS Glue + Databricks. Designed compute governance framework (auto-scaling, workload isolation, cost allocation by business unit). Result: 30% cloud cost reduction and $1M/year in system maintenance overhead eliminated.
  • Served as SME and implementation architect for GenAI/Analytics solutions - driving platform readiness, team enablement, and phased solution rollout across client engagements.

ZS Associates - Business Technology Solutions Consultant, AI Systems

Pune, India  ·  Apr 2021 – May 2024

  • Architected demand forecasting engine for a $10B+ pharmaceutical client: chose Bayesian MCMC over standard econometrics for native uncertainty quantification and faster convergence on irregular clinical patterns. Owned end-to-end: model R&D, Spark feature pipeline (500M+ SKU-day pairs), inference orchestration, MLOps. Economic outcome: a projected $2B in supply chain waste avoided over 5 years.
  • Implemented Spark-based next-best-action recommendation engine for MSL engagement, interfacing Veeva CRM with MLflow model monitoring and feature drift response.
  • Scaled AI delivery organisation: built MLOps platform (MLflow versioning, automated model validation, A/B testing, audit logging) enabling 28 distributed engineers to ship 6 concurrent enterprise projects without workflow bottlenecks. Built compliance automation layer ensuring HIPAA/GDPR validation on every model commit.

Collabera Technologies - Data Engineer

Pune, India  ·  Sep 2020 – Apr 2021

  • Architected HIPAA-compliant Real-World Data (RWD) platform: PII detection, field-level encryption, audit logging. Processed 500M+ healthcare records/day while maintaining <10MB per-patient data footprint. Integrated Komodo data for HCP scoring.
  • Eliminated BI tool vendor dependency ($2M+ contract cycle): engineered custom Python analytics layer with sub-second response on aggregate queries across 100M+ patient records.

L&T Infotech - Engineer

Pune, India  ·  Sep 2016 – May 2020

  • Owned ML-driven insurance risk platform: API-integrated model serving at sub-50ms inference SLA, multi-armed bandit A/B testing framework for 10+ concurrent models, feature store for 500+ features with drift detection.
  • Engineered GDPR/Solvency II compliant Spark ETL: immutable transaction logs, cryptographic data lineage, automated compliance validation on every pipeline run.

Certifications

  • Generative AI Solutions Architect (2025)
  • Databricks Certified GenAI Engineer (2025)
  • AI/ML for Geodata Analysis - ISRO
  • Master's in Applied Data Science - WorldQuant University
  • Certified NLP Developer - Vskills
  • Certified Python Developer - Vskills
  • Certified Tableau Author - Tableau
  • Math for Machine Learning - Amazon

Education

  • B.Tech, Electronics Engineering
    BVDU College of Engineering, Pune (2016)
  • Master of Science, Applied Data Science
    WorldQuant University (2021)
  • Diploma, Network Security
    BVDU College of Engineering (2015)