open to full-time roles — USA, will relocate

Dev Parikh.

Software Engineer — Backend & Applied AI · Qualcomm

I build production AI infrastructure — distributed Python microservices, RAG pipelines, LLM evaluation frameworks, and real-time model monitoring on AWS and Kubernetes. 4+ years across backend and applied AI. Published in IEEE Access.

Qualcomm · Current role
IEEE Access · Published 2025
M.S. Data Science · UMBC · Dec 2025
// selected work
001 · 2025 · RAG / pgvector

Context QA — RAG Question Answering

document Q&A with grounded answers
Problem

Teams can't trust LLM answers over their own documents without grounding — and naive RAG gets slow as corpora grow.

Approach

Hybrid search (BM25 + dense embeddings) over pgvector with embedding-cache reuse; JWT auth and Stripe usage gating; provisioned on GCP with Terraform.

Result

p95 retrieval under 300ms on 5K+ page corpora — 100 concurrent users on a single Cloud Run instance.

Python · FastAPI · pgvector · GCP · Terraform · Stripe
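
A minimal sketch of one common way to merge BM25 and dense-embedding result lists, reciprocal rank fusion. The project summary does not say which fusion method Context QA uses, so the function name, the constant `k=60`, and the sample document ids below are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in; scores are
    summed across lists. Ties are broken by document id for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical top-3 results from each retriever (best match first).
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Rank-based fusion avoids having to normalize BM25 scores against cosine similarities, which sit on different scales.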
002 · 2025 · LLM agents / AWS

Automated PR Review System

LLM code review on every pull request
private
Problem

Manual PR review burned hours per pull request — and security issues still slipped into Python and Terraform codebases.

Approach

AWS Bedrock with tool-calling behind API Gateway and Lambda, triggered by GitHub webhooks; Checkov and Semgrep run static security analysis alongside the LLM pass.

Result

100+ PRs reviewed at ~5 min average — 300+ quality and security issues caught before merge.

Python · AWS Bedrock · Lambda · Checkov · Semgrep
003 · 2024

Travel Mitra — LangChain Agent

AI-powered hotel search assistant built with Airflow for data ingestion, FastAPI + LangChain + Gemini for the chatbot, and PostgreSQL + pgvector for review embeddings and hybrid retrieval over guest reviews.

004 · 2025 · Research · IEEE Access

Multimodal Emotion Recognition

Transformer + CNN-LSTM fusion reaching ~97% on the DEAP benchmark, +4 points over prior baseline — published in IEEE Access (2025).

private · paper
// project archive
16 entries
Talkwise - A scalable chat platform · WebSocket chat: TypeORM, JWT, OpenAPI codegen · '25
Real-Time KPI Dashboards · private · live metrics for 30+ business clients · '23
Paywright - Microservice Scaffolding CLI · Microservice setup: full day → few seconds via "npx create-paywright-app" · '23
Asset Management Platform · private · media delivery for 100K+ users · '23
Soccer Player Valuation · 37.4% lower forecast error · '24
LAPD Crime Analysis · hotspot + trend analysis for LA · '24
Airline Performance Analysis · delay prediction across carriers · '24
Portfolio Engine · private · this site, CMS-driven content · '25
LLM Fine-Tuning - Phi-3 QLoRA · QLoRA fine-tune on Dolly 15k, MLflow tracking + ONNX serving · '25
Real-Time Face Detection · multi-face tracking on live video · '24
Telemetrix - Streaming Telemetry · private · device telemetry → Kafka → Spark structured streaming · '25
Sensorimotor BCI Game · private · real-time EEG-controlled obstacle game · '25
RAG Research Chatbot · private · PDF + URL Q&A with cited answers · '24
Charming Stones - eCommerce · private · full-stack jewelry storefront with auth + payments · '24
Brain2Action · '25
Stripe Payment Microservice · production-ready payments scaffold: webhooks, CQRS · '25
// experience
Jun 2025 — Present
CURRENT

Software Engineer — Backend & Applied AI

Qualcomm
USA · Baltimore, MD (remote)
+ Cut enterprise documentation retrieval from hours to seconds: Dockerized FastAPI platform on AWS with Kubernetes-orchestrated EC2 auto-scaling, serving 50K+ documents through a Pinecone-backed RAG pipeline
+ Eliminated manual health monitoring across 20+ edge AI deployments with SQS-backed log ingestion and automated drift detection, processing thousands of inference logs daily
+ Owned GitHub Actions CI/CD quality gating for LLM releases: 500+ regression tests with LLM-as-judge scoring persisted to PostgreSQL, blocking ~12 regressions per quarter before production rollout
+ Designed fleet observability for 10K+ heterogeneous hardware devices: Node.js Kafka consumers into MongoDB time-series collections, visualized in real time on Grafana
Oct 2024 — May 2025

Graduate Research Assistant

UMBC
Baltimore, MD
+ Engineered a real-time EEG brain-computer interface in PyTorch: ~85% accuracy across 4 motor-imagery classes at ~150ms end-to-end latency
+ Developed a multimodal transformer + CNN-LSTM fusion model reaching ~97% on the DEAP benchmark, +4 points over prior baseline; published in IEEE Access (2025)
Jan 2021 — Dec 2023

Software Engineer

Mindtree
India
+ Led a junior team building a Django asset-management platform for a video streaming service: S3 multipart uploads, Celery-scheduled migrations, REST/GraphQL delivery for 100K+ users
+ Delivered real-time KPI dashboards in React for 30+ business clients, powered by a Node.js SSE backend and visualized with Chart.js
+ Reduced microservice setup from a full day to under an hour with a Node.js CLI scaffolding Express routes, Swagger/OpenAPI contracts, and Docker configs
// skills
all visible, no clicking required
Languages
Python · JavaScript · TypeScript · HTML/CSS · SQL
Backend
FastAPI · Django · Node.js · Express · Redis · Kafka · Celery · GraphQL
AI / ML
RAG · LangChain · LangGraph · FAISS · pgvector · PyTorch · LoRA
Cloud & DevOps
AWS · Docker · Kubernetes · Terraform · Git · GitHub Actions · Prometheus
Databases
PostgreSQL · MySQL · MongoDB
Frontend
React.js · Next.js · Redux Toolkit
Testing & Observability
PyTest · Jest · Postman · MLflow · Evidently AI
// education & credentials
Graduated Dec 2025

M.S. Data Science

University of Maryland, Baltimore County
Baltimore, MD
Graduated May 2023

B.E. Computer Engineering

Gujarat Technological University
Ahmedabad, India
Publication · IEEE Access · 2025

Multimodal Emotion Recognition — Transformer + CNN-LSTM Fusion

~97% accuracy on the DEAP benchmark, +4 points over the prior baseline.

D. Parikh, et al. · IEEE Access · 2025
// beyond the work

Who's behind the commits?

I came into machine learning from full-stack engineering, which shapes how I build AI systems. I treat models as production dependencies: versioned, evaluated, monitored, and improved with evidence instead of vibes. I read the eval results before celebrating the demo, and I write documentation because future me is usually the next engineer on call.

What I enjoy most is building the layer between an impressive prototype and a system people can actually use: clean APIs, reliable pipelines, retrieval quality, observability, and thoughtful user experience. I learn fastest by shipping projects slightly beyond my comfort zone, debugging the hard parts, and turning those lessons into better systems.

Currently: LLM serving & evaluation infrastructure at Qualcomm
Learning: inference optimization (quantization, speculative decoding)
Off the clock: soccer, FIFA, day hikes, and over-engineering my home server
// contact

Let's talk.

The fastest way to reach me is email — I reply within a day. Open to full-time AI/ML and software engineering roles anywhere in the US.

+1 (667) 433-9710
available — can start immediately