Shikhar.dev

Hi, I'm Shikhar Johri, a Machine Learning Engineer based in New York. 📍

about me

A Machine Learning Engineer building personalization & LLM systems 📍

I build production-grade ML systems across personalization, ranking, retrieval, and LLM-powered experiences. At Peloton, I work on agentic AI workflows, recommendation quality, and system reliability at scale. Previously, I focused on data quality, analytics automation, and NLP systems at Barclays and Tata Consultancy Services.

Experience

  1. Peloton Interactive, Inc.

    Machine Learning Engineer

      • Owned and productionized an agentic AI instructor using LangGraph, enabling multi-step, guardrailed LLM workflows for personalized training plan generation.
      • Shipped onboarding personalization using GPT-4 and XGBoost to capture user mood and intent, improving early-session engagement and reducing cold-start friction.
      • Improved ranking and retrieval quality via transformer-based candidate generation combined with Neo4j graph signals, driving measurable MAP@K gains.
      • Led a cross-service redesign of plan and insight invalidation pipelines (Kafka, DynamoDB, gRPC), improving data correctness, system reliability, and long-term maintainability.
      • Raised team velocity by standardizing evaluation, monitoring, and error-handling across personalization services, while also driving cloud cost savings through infra cleanup.
  2. Barclays

    Data Scientist Co-op

      • Automated quality validation for 10K+ datasets, improving efficiency.
      • Developed a LLaMA-based metadata analysis tool, yielding $100K in savings.
      • Streamlined the Monthly Business Review process, reducing analyst effort by a week.
  3. Tata Consultancy Services

    AI Engineer

      • Accelerated blood sample analysis to 800ms using YOLOv4, saving $400K annually.
      • Developed catheter inspection system with Meta Detectron2, increasing precision by 25%.
      • Built tools to convert 2D CT scans into 3D visualizations, enhancing diagnostics.
  4. Data Engineer

      • Deployed a 40TB AWS data lake with ETL pipelines for Genentech.
      • Developed a sales analysis tool generating $2M+ in revenue within one quarter.
      • Reduced storage costs by 90% through tiered data distribution.
      • Led research in NLP-based report summarization and employee recommender systems.
  5. Thapar University

    Data Science Research Intern

      • Published Adaptive Ensemble Model for COVID-19 diagnosis with 98% accuracy.
      • Discovered genetic biomarkers for Multiple Sclerosis using feature analysis.
      • Introduced a TensoPIT Bloom Filter with Kalman forecasting for real-time data caching.
  6. NE Big Data Hub

    Graduate Student Assistant

      • Led the COVID Information Commons team on pandemic policy response inference.
      • Developed dashboards and visualizations to advance data science research.

Publications

Selected papers and publications

Two-Stage Semantic GAN for Image Super-Resolution

Computer Vision and Image Understanding, 2025

Designed a GAN-based framework for semantic-aware image super-resolution.

TENSOPIT: Tensor-Structured Bloom Filters for Real-Time Data Caching

Under Review, 2024

Introduced a tensor-based Bloom Filter with Kalman forecasting for adaptive caching.

Machine Learning Framework for COVID-19 Auto-Detection

International Journal of Imaging Systems & Technology, 2021

Proposed an adaptive ensemble learning framework for automated COVID-19 diagnosis from medical imaging.

Serum and CSF Cytokine Biomarkers for Multiple Sclerosis Diagnosis

Mediators of Inflammation, 2020

Identified predictive biomarker signatures using statistical modeling and feature analysis.

Portfolio

Each project is a unique piece of development 🧩

Algorithmic ML Trading Strategy

  • Backtested a model achieving a 29.48% ROI for ETF stock prediction.

Credit Card Fraud Detection

  • Explainable AI-based model using AutoML and NLP techniques.

GenAIJudge

  • GPT-4 powered policy evaluation tool for sustainability efforts.

MTA-Commute-Pal

  • Optimized travel with predictive modeling using NYC MTA data.

Contact

Don't be shy! Hit me up!

New York, NY

Built with ❤️ & ☕️ by Shikhar Johri