Shreyash Pandey // Software Development Engineer 2 at IBM

I build AI systems, automation workflows, and backend products that survive production

My recent work sits where applied AI meets engineering discipline: locator auto-healing, semantic retrieval, internal copilots, multilingual model training, and automation pipelines with real operational guardrails.

  • 5K+ test locators auto-healed
  • 3-tier ML plus VLM recovery design
  • Hackathon-winning semantic search
  • Open-source model training from scratch

Current proof surface

The strongest signal in the portfolio is not a single project. It is the combination of AI recovery design, service reliability, and hands-on model building shipped across work and independent systems.

  • Recovery ladder (3 tiers): CSS selectors first, then embeddings, then IBM Granite 3.3 VLM when ambiguity stays high.
  • Runtime scale (4K-5K rpm): SAT runtime decomposed into four services with sub-second latency and release-grade operational constraints.
  • Model track (125M): Phoenix 125M plus multilingual pretraining work built with custom tokenizers and training pipelines.
5K+ test locators auto-healed
4 microservices in the SAT runtime
83% Chrome accuracy in fallback mode
70% fewer internal support tickets
99.99% availability supported by prediction
1st place in India at TechInterrupt

What I optimize for

The common thread across my work is reliability under ambiguity. I like systems that need to reason, recover, and still stay operationally legible.

Start with the cheapest reliable fallback

I design systems to attempt deterministic recovery first, then graduate into ML and model-based fallbacks only when they are justified.
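
The ordering described here can be sketched as a loop over tiers sorted by cost, where each tier must clear its own confidence threshold before its answer is accepted. This is a minimal illustration and not the production design: the tier functions, locators, confidence scores, and thresholds below are all hypothetical placeholders.

```python
from typing import Callable, Optional, Sequence, Tuple

# A candidate is (replacement locator, confidence between 0.0 and 1.0).
Candidate = Tuple[str, float]
# A tier takes the broken locator and returns a candidate, or None if it has no answer.
Tier = Callable[[str], Optional[Candidate]]


def recover(
    broken: str, ladder: Sequence[Tuple[str, Tier, float]]
) -> Optional[Tuple[str, Candidate]]:
    """Try tiers in order of cost; accept the first candidate that clears its threshold."""
    for name, tier, threshold in ladder:
        candidate = tier(broken)
        if candidate is not None and candidate[1] >= threshold:
            return name, candidate
    return None  # every tier was unsure: surface the failure instead of guessing


# Hypothetical stand-ins for the three tiers (assumptions, not real implementations).
def selector_tier(broken: str) -> Optional[Candidate]:
    known_renames = {"#submit-btn": "#submit-button"}
    fixed = known_renames.get(broken)
    return (fixed, 1.0) if fixed else None


def embedding_tier(broken: str) -> Optional[Candidate]:
    return ("button[data-test='login']", 0.62)  # pretend similarity score


def vlm_tier(broken: str) -> Optional[Candidate]:
    return ("//button[text()='Log in']", 0.91)  # pretend model confidence


LADDER = [
    ("selector", selector_tier, 0.99),
    ("embedding", embedding_tier, 0.80),
    ("vlm", vlm_tier, 0.85),
]

result_cheap = recover("#submit-btn", LADDER)   # resolved by the deterministic tier
result_escalated = recover("#login", LADDER)    # embedding score too low, escalates to the VLM tier
```

Returning the tier name alongside the candidate keeps the recovery path observable, so it is possible to track how often the expensive tier is actually needed.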

Treat evaluation as part of the product

When the system contains AI, the measurement loop is not optional. I care about observable accuracy, drift, error budgets, and failure analysis.
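
As a rough illustration of what an error budget means in practice, the arithmetic can be sketched in a few lines. The function names and the 30-day window are assumptions made for this example; they are not taken from any system described on this page.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for a given availability target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)


def budget_remaining(slo: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of a request-based error budget still unspent (negative when overspent)."""
    allowed_failures = total_requests * (1.0 - slo)
    return 1.0 - failed_requests / allowed_failures


downtime_budget = error_budget_minutes(0.9999)          # 99.99% over 30 days
remaining = budget_remaining(0.999, 1_000_000, 250)     # 250 failures against 1,000 allowed
```

Under these assumptions a 99.99% target over 30 days leaves roughly 4.3 minutes of downtime, which is why measurement has to be continuous and cannot wait for a periodic review.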

Systems I have shipped

2024 - Present

Software Development Engineer 2

IBM Software Labs · Bengaluru, India

I design and ship reliability-heavy AI capabilities inside browser automation and testing products. The work combines embeddings, vision-language models, service decomposition, and a lot of operational discipline.

2023 - 2024

Software Engineer 2

Software AG (now IBM) · Bengaluru, India

This phase pushed me deeper into AI product work: semantic retrieval, internal copilots, and prediction systems grounded in practical product needs rather than demos.

2022 - 2023

Software Engineer

Software AG · Bengaluru, India

I worked on enterprise integration platform capabilities across Java, Spring Boot, and REST APIs, building the foundation that still shapes how I reason about production systems.

Current projects

2026 · Decoder-only language model

Phoenix 125M

A LLaMA-style 125M parameter model trained from scratch on a single RTX 3080 Ti with a custom tokenizer, data pipeline, and training loop.

PyTorch · Transformers · Tokenization
View model card

2026 · Multilingual language models

Sweta-Hi and Sweta-Kn

Hindi and Kannada pretraining efforts built on a LLaMA-style architecture with custom tokenizers and an end-to-end multilingual data pipeline.

Multilingual NLP · Data engineering · Custom tokenizers
View model card

2026 · Fine-tuning · Text-to-SQL

SQLForge: Mistral 7B QLoRA

A 4-bit QLoRA fine-tune that turns Mistral 7B v0.3 into a reliable text-to-SQL model. It was trained on the same 12 GB GPU used for Phoenix 125M, with 3.75 GB of VRAM headroom budgeted up front. The evaluation was rebuilt to be schema-aware after the first WikiSQL run showed the metric was lying.

QLoRA · Fine-tuning · bitsandbytes
View model on HuggingFace
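
One common way a text-to-SQL metric can mislead is string comparison: two queries can differ textually and still return the same rows, while a query that references a column missing from the schema should simply count as wrong. The sketch below illustrates that idea with execution-based comparison against an in-memory SQLite table. It is an assumption about the general technique, not the actual SQLForge evaluation; the table, rows, queries, and function names are hypothetical.

```python
import sqlite3


def exact_match(predicted: str, reference: str) -> bool:
    """String comparison after collapsing whitespace."""
    return " ".join(predicted.split()) == " ".join(reference.split())


def execution_match(conn: sqlite3.Connection, predicted: str, reference: str) -> bool:
    """Run both queries against the schema and compare result rows, ignoring row order."""
    try:
        got = conn.execute(predicted).fetchall()
    except sqlite3.Error:
        return False  # a query that does not run against the schema counts as wrong
    expected = conn.execute(reference).fetchall()
    return sorted(got, key=repr) == sorted(expected, key=repr)


# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Asha", "Red", 31), ("Ravi", "Blue", 18), ("Meera", "Red", 24)],
)

reference = "SELECT name FROM players WHERE team = 'Red' AND points > 20"
predicted = "SELECT name FROM players WHERE points > 20 AND team = 'Red'"
wrong_column = "SELECT name FROM players WHERE squad = 'Red'"

string_says = exact_match(predicted, reference)               # penalizes a correct query
execution_says = execution_match(conn, predicted, reference)  # credits it
schema_check = execution_match(conn, wrong_column, reference) # rejects the unknown column
```

Execution-based scoring has its own blind spots (a wrong query can return the right rows on a small table), so it works best alongside larger or adversarial test data.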

See all builds

Explore the full project portfolio

Looking for my next role

AI Systems Engineer based in Bengaluru, India

I am most interested in roles where AI systems, backend engineering, and reliability work intersect. That usually means agent infrastructure, evaluation-heavy product work, automation platforms, or developer tooling.