Applied AI Engineer & Full-Stack Developer building offline-first AI systems, agentic workflows and high-performance RAG pipelines in Mumbai.
I'm an Applied AI Engineer based in Mumbai, working at the intersection of machine learning systems and scalable software architecture. My focus is building AI that works reliably in the real world — including in environments with no internet.
I transitioned from enterprise .NET development into AI engineering, which gives me an unusual edge: I understand both the infrastructure side (databases, APIs, deployment) and the ML side (embeddings, vector search, LLM orchestration).
Currently I specialise in deploying local LLMs via Ollama, building RAG pipelines with ChromaDB and SentenceTransformers, and engineering multilingual OCR systems using PaddleOCR — all optimised for CPU-only, air-gapped environments.
A fully offline Document Intelligence platform for secure natural language querying of private PDF repositories. Built with a custom RAG pipeline — no data ever leaves your machine.
Engineered citation-first retrieval with relevance percentage scores, stateful multi-turn dialogue via ASP.NET Core Session management, and automated resource-kill fail-safes for Ollama model switching.
The core challenge was running everything on a local machine without a GPU. I benchmarked multiple LLMs before settling on a model optimised for low-spec systems, one that runs on any machine with at least 8 GB of RAM. The resource-kill fail-safes let the app switch Ollama models without memory leaks or process lock-ups.
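The citation-first retrieval with relevance percentages described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual code: the `Chunk` class, the hand-written vectors, and the choice to map cosine similarity directly onto a 0 to 100 scale are all assumptions. In the real pipeline the vectors would come from an embedding model and the search from a vector store.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    """A retrievable passage plus the metadata needed to cite it (hypothetical shape)."""
    source: str          # PDF file name
    page: int            # page the passage came from
    text: str
    vector: list[float]  # embedding; hand-written here for illustration


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def relevance_percent(similarity):
    """Assumed mapping: clamp similarity to [0, 1] and express it as a percentage."""
    return round(max(0.0, min(1.0, similarity)) * 100, 1)


def retrieve_with_citations(query_vector, chunks, top_k=2):
    """Return the top_k chunks, each carrying its citation and relevance score."""
    scored = [(cosine_similarity(query_vector, c.vector), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {
            "source": c.source,
            "page": c.page,
            "relevance": relevance_percent(score),
            "text": c.text,
        }
        for score, c in scored[:top_k]
    ]
```

Returning the source and page alongside every passage means the answer-generation step can only quote material it is able to cite.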
Cross-platform text extraction system supporting nearly all Indian languages. Optimised for CPU-only inference, generating clean structured JSON output for downstream pipelines.
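As a rough sketch of what "clean structured JSON output" could look like, the snippet below turns a list of OCR detections into a JSON document. The input shape (four-point box, text, confidence), the field names, and the confidence threshold are assumptions made for illustration; they are not the project's schema or the OCR engine's native result format.

```python
import json


def build_ocr_json(file_name, detections, min_confidence=0.5):
    """Convert (box, text, confidence) detections into a structured JSON string.

    `box` is assumed to be four [x, y] corner points.
    """
    lines = []
    for box, text, confidence in detections:
        if confidence < min_confidence:
            continue  # drop low-confidence noise
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        lines.append(
            {
                "text": text.strip(),
                "confidence": round(confidence, 3),
                # axis-aligned bounding box: [left, top, right, bottom]
                "bbox": [min(xs), min(ys), max(xs), max(ys)],
            }
        )
    # Approximate reading order: top-to-bottom, then left-to-right.
    lines.sort(key=lambda line: (line["bbox"][1], line["bbox"][0]))
    document = {"file": file_name, "line_count": len(lines), "lines": lines}
    # ensure_ascii=False keeps Indic scripts readable instead of \u escapes.
    return json.dumps(document, ensure_ascii=False, indent=2)
```

Passing `ensure_ascii=False` matters for multilingual output: without it, Devanagari or Tamil text would be written as escape sequences, which is valid JSON but hard to inspect.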
A Python tool to bulk-download historical stock data from the National Stock Exchange of India. Automates data collection for any NSE-listed equity, enabling efficient backtesting and quantitative analysis workflows.
NSE actively blocks bots from scraping its website. I got past the bot-detection layer with a combination of spoofed browser headers, session rotation to mimic real user behaviour, and randomised request delays to avoid rate-limit triggers, ultimately reverse-engineering the site's request flow to extract data reliably.
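The randomised request delays mentioned above can be illustrated with a small pacing helper. This sketch covers only the delay logic; the function names, the 2 to 5 second window, and the injected `fetch` and `sleep` callables are hypothetical and are not taken from the tool itself.

```python
import random
import time


def randomised_delays(count, base_seconds=2.0, jitter_seconds=3.0, rng=None):
    """Return `count` delays, each in [base_seconds, base_seconds + jitter_seconds]."""
    if rng is None:
        rng = random.Random()
    return [base_seconds + rng.uniform(0.0, jitter_seconds) for _ in range(count)]


def fetch_all(symbols, fetch, sleep=time.sleep, rng=None):
    """Call `fetch(symbol)` for each symbol, pausing a random interval after each call.

    `fetch` is whatever function performs the download; `sleep` is injectable
    so the pacing can be tested without actually waiting.
    """
    results = {}
    delays = randomised_delays(len(symbols), rng=rng)
    for symbol, delay in zip(symbols, delays):
        results[symbol] = fetch(symbol)
        sleep(delay)  # irregular gaps avoid a fixed, machine-like request rhythm
    return results
```

Injecting `sleep` and `rng` keeps the pacing deterministic under test while production calls fall back to `time.sleep` and a fresh random generator.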
Full-stack CRUD application for student administration with an integrated GPT-powered chatbot for automated administrative assistance. Built with C#, Bootstrap, jQuery, and PostgreSQL.
Open to full-time roles, freelance projects, and interesting collaborations in AI engineering and full-stack development.