Senior AI/ML Engineer

Aditya Mahakali

Building production AI systems with reliable backends and governed AI.

Generative AI, Machine Learning, AI Systems, Enterprise Delivery

Backend architecture, secure APIs, and retrieval-first grounded GenAI experiences.

My Skills & Expertise

Machine learning and full-stack software development.

Machine Learning, Deep Learning, Generative AI, NLP, Computer Vision, AI Search, Retrieval (RAG), Agents, Knowledge Graphs, Embeddings, Re-ranking
Spring Boot, Django, FastAPI, Flask, NodeJS, SQL, NoSQL, GraphQL, Elastic, Linux, Git, Docker, OpenShift, Ansible

Experience

Software Development Intern

Hughes Systique Corporation, Gurugram

Jan 2023 - Aug 2023

  1. Learning Phase

    Full-stack software development

    Hands-on progression through backend, frontend, delivery, and app security.

    Spring Boot -> Angular -> Docker -> Security
  2. Pilot Project

    Built BugPilot

    Built a company-internal bug tracker to report, triage, and track issues across projects; adopted by multiple internal teams.

    Spring Boot, React, Docker

    Adopted by 3 internal teams

AI/ML Engineer

IBM, Bangalore

Aug 2023 - Apr 2026

  1. Start

    Conversational RAG (Banking MVP)

    Built a conversational RAG assistant with query modification + hybrid retrieval and integrated it into a React UI with deep linking.

    watsonx.ai, Watson Assistant, React, Flask, Docker, Milvus

    85% first-call resolution
  2. Search

    FDA product search + assistant (Life Sciences)

    Summarized FDA documents to create high-signal metadata and improved retrieval quality for a large US retailer.

    Solr, RAG, summarization

    Relevance +25%
  3. Scale

    Technical QA assistant over 100k+ documents

    Built RAG APIs over a 100k+ corpus with hybrid dense/sparse/BM25 retrieval, domain query modification, and tuned embeddings.

    Elasticsearch, dense/sparse search, custom embeddings

    56% -> 77% answer accuracy
  4. Graphs

    Knowledge-graph RAG (VKG)

    Automated ontology creation and built NL2Cypher retrieval pipelines for Neo4j-backed RAG.

    Neo4j, NL2Cypher, knowledge graphs

Independent Consulting AI Engineer

Independent

Apr 2026 - Present

  1. Consulting

    Independent AI Consulting

    Building custom AI solutions, RAG systems, and agentic workflows for enterprise clients.

    LLMs, RAG, Agents, Full-Stack AI

Featured Open Source

Veridex

Modular, probabilistic, and research-grounded AI content detection.

Python, AI Detection, Probabilistic, Multi-modal
Detects AI-generated text, image, and audio.
Uses confidence scores instead of binary output.
Designed for research and production workflows.

A production-ready library for detecting AI-generated content across text, image, and audio with confidence estimates and interpretable signals.

Featured Projects

Enterprise RAG

IBM - ElasticSQL

Hybrid retrieval that combines vectors with SQL-like querying to support networking workflows.

Elasticsearch, Hybrid retrieval, RAG APIs, Guardrails

Highlights

  • Unified dense + structured retrieval for operational queries.
  • Designed for reliability under enterprise constraints.

Architecture

  • Ingestion: document normalization, chunking, and metadata enrichment.
  • Indexing: hybrid dense/sparse signals + structured fields for filtering.
  • Retrieval: query parsing -> hybrid retrieval -> ranking -> context assembly.
  • Generation: answer synthesis with citations and guardrails.
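
The "hybrid retrieval -> ranking" step above can be illustrated with reciprocal rank fusion, one common way to merge a dense and a sparse result list. This is a minimal sketch under stated assumptions, not the project's actual ranking logic: the function name, the document IDs, and the constant k=60 are all illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense (vector) and a sparse (BM25) retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Documents that rank well in both lists (here doc_b and doc_a) rise to the top, which is why rank-based fusion is often chosen when dense and sparse scores are not directly comparable.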

Model + quality work

  • Retrieval tuning: hybrid weighting, filters, and ranking features based on evaluation outcomes.
  • Prompting: citation-first answering and refusal patterns for low-confidence contexts.
  • Embeddings: model selection + chunk sizing tuned to networking-style documents.

Evaluation

  • Offline: retrieval recall/precision on curated query sets.
  • Online: user feedback loops and failure-mode review (hallucination, mismatch).
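
Offline retrieval evaluation of the kind listed above can be sketched as precision@k and recall@k over a curated query set. The queries, document IDs, and k value below are hypothetical and only show the shape of the computation.

```python
def precision_recall_at_k(retrieved, relevant, k):
    """Return (precision@k, recall@k) for one query."""
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical curated query set: query -> (retrieved IDs, relevant ID set).
eval_set = {
    "reset router password": (["d1", "d7", "d3", "d9"], {"d1", "d3"}),
    "vlan trunk config": (["d4", "d2", "d8", "d5"], {"d2", "d6"}),
}

results = [precision_recall_at_k(r, rel, k=3) for r, rel in eval_set.values()]
mean_precision = sum(p for p, _ in results) / len(results)
mean_recall = sum(r for _, r in results) / len(results)
```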

Deployment

  • API-first backend with observability hooks (latency, retrieval hit-rate).
  • Security: RBAC-ready integration and input sanitization patterns.

Full stack

Elasticsearch, Hybrid retrieval, RAG APIs, Guardrails

Conversational AI

IBM - Banking Virtual Assistant

Conversational RAG assistant with query modification, hybrid search, and UI integration.

watsonx.ai, Watson Assistant, React, Flask

Highlights

  • Intent-based flows with grounded answers for self-service.
  • Deep linking from answers into product/knowledge pages.

Architecture

  • Channel/UI -> assistant orchestration -> retrieval service -> LLM answer synthesis.
  • Chunking: parent-child strategy for better grounding on long docs.
  • Retrieval: hybrid semantic + lexical with query rewriting and filters.
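
The parent-child chunking strategy above can be sketched as follows: small child chunks are matched at search time, and each match is expanded to its larger parent before being passed to the LLM. The chunk sizes (in words), class name, and function names are assumptions for illustration, not the production values.

```python
from dataclasses import dataclass


@dataclass
class ChildChunk:
    parent_id: int
    text: str


def build_parent_child_chunks(document, parent_size=200, child_size=50):
    """Split text into large parent chunks and small child chunks.

    Children are what gets embedded and searched; each child keeps the
    index of its parent so the larger context can be returned later.
    """
    words = document.split()
    parents, children = [], []
    for start in range(0, len(words), parent_size):
        parent_words = words[start:start + parent_size]
        parent_id = len(parents)
        parents.append(" ".join(parent_words))
        for c_start in range(0, len(parent_words), child_size):
            child_text = " ".join(parent_words[c_start:c_start + child_size])
            children.append(ChildChunk(parent_id, child_text))
    return parents, children


def expand_to_parents(matched_children, parents):
    """Map matched child chunks to de-duplicated parent texts, in match order."""
    seen, context = set(), []
    for child in matched_children:
        if child.parent_id not in seen:
            seen.add(child.parent_id)
            context.append(parents[child.parent_id])
    return context


doc = " ".join(f"w{i}" for i in range(450))
parents, children = build_parent_child_chunks(doc)
# Pretend retrieval matched these three children (two share a parent).
context = expand_to_parents([children[5], children[4], children[8]], parents)
```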

Model + quality work

  • Conversation quality: prompt patterns for tool use, grounded answers, and consistent tone.
  • Retrieval quality: parent-child chunking + query rewriting heuristics for intent-specific recall.
  • Safety: prompt-injection aware prompting + constrained tool behavior.

Evaluation

  • Conversation-level QA: groundedness checks and citation coverage.
  • Task success tracking via intent completion and fallback rates.

Deployment

  • Containerized services for backend + UI; environment-specific configs.
  • Safety: prompt-injection aware prompting + restricted tools/actions.

Full stack

watsonx.ai, Watson Assistant, React, Flask, Docker

Search & Summarization

IBM - Product Search + Assistant (FDA Docs)

Search + assistant over FDA documents using LLM summarization to enrich metadata and improve retrieval.

Solr, RAG, Summarization, Metadata indexing

Highlights

  • Metadata summarization to improve searchability and ranking signals.
  • Assistant layer on top of search for guided drug/product queries.

Architecture

  • Ingestion: parse FDA documents -> sectioning -> metadata extraction.
  • Summarization: generate structured product summaries as searchable metadata.
  • Search: rank using lexical + metadata fields; assistant uses search as tool.

Model + quality work

  • Summarization: structured prompting + schema validation to keep metadata consistent.
  • Search tuning: field boosts and ranking rules leveraging summary metadata.
  • Risk control: safer phrasing + guardrails for regulated/medical content.
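
The "structured prompting + schema validation" idea above can be sketched as a check that rejects any LLM summary that does not match an expected shape before it is indexed. The field names in this schema are hypothetical; the real metadata schema is not described in this page.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
SUMMARY_SCHEMA = {
    "product_name": str,
    "active_ingredients": list,
    "indications": str,
}


def validate_summary(raw_output):
    """Parse LLM output as JSON and check it against SUMMARY_SCHEMA.

    Returns (record, errors); record is None when validation fails.
    """
    try:
        record = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return None, ["top-level value must be an object"]
    errors = []
    for name, expected in SUMMARY_SCHEMA.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            errors.append(f"wrong type for {name}")
    extra = sorted(set(record) - set(SUMMARY_SCHEMA))
    errors.extend(f"unexpected field: {name}" for name in extra)
    return (record, []) if not errors else (None, errors)


good = '{"product_name": "ExampleDrug", "active_ingredients": ["x"], "indications": "y"}'
bad = '{"product_name": "ExampleDrug", "dosage": "10mg"}'
ok_record, ok_errors = validate_summary(good)
bad_record, bad_errors = validate_summary(bad)
```

Tracking the share of outputs that fail this check over time is one simple way to monitor schema adherence.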

Evaluation

  • Search relevance evaluation on representative queries (A/B on metadata).
  • Assistant evaluation: answer grounding and incorrect-drug risk checks.

Deployment

  • Search service + assistant APIs; guarded outputs for regulated content.
  • Monitoring: drift checks on summarization schema adherence.

Full stack

Solr, RAG, Summarization, Metadata indexing

NL2SQL

IBM - Retail NL2SQL

Natural language to SQL system with metadata automation, disambiguation, RBAC, and injection defenses.

FastAPI, watsonx.ai, RBAC, SQL validation

Highlights

  • Business users query data safely without writing SQL.
  • Guardrails for RBAC + prompt-injection resistance.

Architecture

  • Metadata dictionary: automated schema understanding + synonyms.
  • Pipeline: NL -> intent/slots -> SQL draft -> validation -> execution -> explanation.
  • Safety: RBAC enforcement + allowlisted tables/columns + query constraints.
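
The validation step in the pipeline above (allowlisted tables plus query constraints) can be sketched with a few conservative checks on the SQL draft. This is a simplified, regex-based illustration covering tables only; the table names and limit are hypothetical, and a production validator would more likely rely on a real SQL parser and database-level permissions.

```python
import re

# Hypothetical allowlist of tables the current role may read.
ALLOWED_TABLES = {"sales", "stores"}
FORBIDDEN = re.compile(r"\b(insert|update|delete|drop|alter|create|grant)\b", re.I)
TABLE_REF = re.compile(r"\b(?:from|join)\s+([a-z_][a-z0-9_]*)", re.I)


def validate_sql(sql, max_limit=1000):
    """Return a list of violations; an empty list means the draft may run."""
    problems = []
    statement = sql.strip().rstrip(";")
    if ";" in statement:
        problems.append("multiple statements")
    if not statement.lower().startswith("select"):
        problems.append("only SELECT is allowed")
    if FORBIDDEN.search(statement):
        problems.append("forbidden keyword")
    for table in TABLE_REF.findall(statement):
        if table.lower() not in ALLOWED_TABLES:
            problems.append(f"table not allowlisted: {table}")
    limit = re.search(r"\blimit\s+(\d+)", statement, re.I)
    if not limit or int(limit.group(1)) > max_limit:
        problems.append(f"missing or oversized LIMIT (max {max_limit})")
    return problems


safe = validate_sql("SELECT region, SUM(amount) FROM sales GROUP BY region LIMIT 100")
unsafe = validate_sql("SELECT * FROM users LIMIT 10; DROP TABLE users")
```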

Model + quality work

  • Schema linking: metadata dictionary + synonyms to reduce ambiguity and improve grounding.
  • Accuracy tuning: curated few-shot examples + error-driven prompt iteration.
  • Safety: allowlists, query validation, and injection defenses before execution.

Evaluation

  • SQL accuracy: execution match + result correctness on benchmark queries.
  • Security tests: injection attempts, privilege escalation, schema leakage.
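
Execution match, as used in the SQL accuracy evaluation above, compares the rows returned by a generated query with those of a reference query. The sketch below uses an in-memory SQLite database as a stand-in; the table, data, and benchmark pairs are hypothetical.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries return the same rows, ignoring row order."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a query that fails to run counts as a miss
    gold = conn.execute(gold_sql).fetchall()
    return sorted(predicted) == sorted(gold)


# Hypothetical in-memory fixture standing in for the retail database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10), ("south", 20), ("north", 5)],
)

benchmark = [
    # (predicted SQL, gold SQL)
    ("SELECT region, SUM(amount) FROM sales GROUP BY region",
     "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"),
    ("SELECT COUNT(*) FROM sales WHERE amount > 10",
     "SELECT COUNT(*) FROM sales WHERE amount >= 10"),
    ("SELECT amount FROM orders", "SELECT amount FROM sales"),
]
matches = [execution_match(conn, pred, gold) for pred, gold in benchmark]
accuracy = sum(matches) / len(benchmark)
```

Comparing results rather than SQL text lets differently phrased but equivalent queries count as correct, while off-by-one filters and references to missing tables are still caught.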

Deployment

  • Microservice API with audit logs for generated SQL and user identity.
  • Latency tuning: caching metadata and reusing compiled prompts/templates.

Full stack

FastAPI, watsonx.ai, RBAC, SQL validation, Guardrails

Medical RAG

Personal - MedBot HyDe

HyDE-style retrieval to improve medical QA by generating hypothetical documents for better recall.

RAG, HyDE, Embeddings, Reranking

Highlights

  • Improves retrieval coverage for sparse or ambiguous queries.
  • Designed to be testable with offline evaluation sets.

Architecture

  • Query -> hypothetical doc generation -> embedding -> retrieval -> rerank -> answer.
  • Strict separation between retrieval augmentation and final answer generation.
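
The HyDE flow above (query -> hypothetical document -> embedding -> retrieval) can be sketched with toy components. Everything here is a stand-in: the bag-of-words "embedding" replaces a real embedding model, the canned generator replaces an LLM call, and the two-document corpus is invented for illustration. Reranking and answer generation are omitted.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a real system would use an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def generate_hypothetical_doc(query):
    """Stand-in for an LLM call that drafts a plausible answer passage."""
    canned = {
        "why do I feel dizzy when standing up": (
            "orthostatic hypotension is a drop in blood pressure "
            "on standing that causes dizziness"
        ),
    }
    return canned.get(query, query)


def hyde_retrieve(query, corpus, top_k=1):
    """Embed the hypothetical document, not the raw query, then rank the corpus."""
    hypo_vec = embed(generate_hypothetical_doc(query))
    ranked = sorted(corpus, key=lambda doc: cosine(hypo_vec, embed(doc)), reverse=True)
    return ranked[:top_k]


corpus = [
    "orthostatic hypotension causes a drop in blood pressure after standing",
    "standing desks can reduce back pain when used with regular breaks",
]
query = "why do I feel dizzy when standing up"
baseline = sorted(corpus, key=lambda d: cosine(embed(query), embed(d)), reverse=True)[:1]
hyde = hyde_retrieve(query, corpus)
```

In this toy setup the raw query shares more surface words with the unrelated standing-desk passage, while the hypothetical document shares vocabulary with the relevant one. The hypothetical text is used only for retrieval and never shown as an answer, which matches the separation described above.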

Model + quality work

  • HyDE prompting: controlled hypothetical generation to boost retrieval recall without hallucinated final answers.
  • Embedding + rerank: model selection and reranking strategy tuned for medical relevance.
  • Safety: refusal patterns and disclaimers to avoid medical advice overreach.

Evaluation

  • Compare baseline RAG vs HyDE on retrieval metrics and answer quality.
  • Failure analysis: misleading hypotheticals and unsafe medical outputs.

Deployment

  • Batch evaluation harness + lightweight API wrapper for demos.
  • Safety: disclaimers and refusal behavior for medical advice.

Full stack

RAG, HyDE, Embeddings, Reranking

Agents

Personal - Daily Paper Summarizer

Agentic pipeline to ingest new papers and produce daily structured summaries and takeaways.

Agents, LLMs, Pipelines, Scheduling

Highlights

  • End-to-end automation: fetch -> filter -> summarize -> publish.
  • Separation of agents: retrieval, summarization, quality checks.

Architecture

  • Source ingestion -> relevance filter -> multi-agent summarization -> final editor pass.
  • Tooling: per-agent prompts and deterministic output schemas.
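
The stages above (relevance filter -> summarization -> final editor pass) can be sketched as plain functions that pass typed records between them. The agents here are deterministic stand-ins for LLM calls, and the dataclass fields, keyword filter, and sample papers are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    title: str
    abstract: str


@dataclass
class Summary:
    title: str
    takeaway: str
    citations: list = field(default_factory=list)


def relevance_filter(papers, keywords):
    """Keep papers whose abstract mentions any keyword (case-insensitive)."""
    return [p for p in papers if any(k in p.abstract.lower() for k in keywords)]


def summarizer_agent(paper):
    """Stand-in for an LLM agent; it keeps the abstract's first sentence."""
    first_sentence = paper.abstract.split(". ")[0].rstrip(".") + "."
    return Summary(paper.title, first_sentence, citations=[paper.title])


def editor_agent(summaries):
    """Final pass: drop summaries without citations and sort by title."""
    return sorted((s for s in summaries if s.citations), key=lambda s: s.title)


def run_pipeline(papers, keywords):
    relevant = relevance_filter(papers, keywords)
    return editor_agent([summarizer_agent(p) for p in relevant])


papers = [
    Paper("Sparse Retrieval Revisited", "We study retrieval with sparse vectors. Results vary."),
    Paper("Protein Folding Notes", "We survey folding methods."),
    Paper("Agent Planning", "Agents plan better with retrieval feedback."),
]
digest = run_pipeline(papers, keywords=["retrieval"])
```

Fixing the record types between stages is one way to keep each agent's output schema deterministic, so a malformed summary fails early instead of reaching the published digest.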

Model + quality work

  • Quality: agent role prompts + schema enforcement to reduce drift and improve consistency.
  • Cost/latency: step budgeting and retry strategy for flaky sources/tools.
  • Faithfulness: citation-first summaries and checks for unsupported claims.

Evaluation

  • Quality rubric: coverage, faithfulness, and novelty of takeaways.
  • Latency/cost tracking per agent step to optimize throughput.

Deployment

  • Scheduled runs + artifact storage for summaries and references.
  • Retry strategy for flaky sources and tool failures.
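
A retry strategy for flaky sources, as mentioned above, is often implemented with exponential backoff. The sketch below is one such approach under assumed parameters (three attempts, 0.5 s base delay); the flaky fetch function is a hypothetical stand-in for a real paper source.

```python
import time


def retry(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func until it succeeds or attempts run out.

    Waits base_delay * 2**n between tries (exponential backoff). The sleep
    function is injectable so tests can skip real waiting.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Hypothetical flaky source that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("source unavailable")
    return ["paper-1", "paper-2"]


delays = []
fetched = retry(flaky_fetch, sleep=delays.append)
```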

Full stack

Agents, LLMs, Pipelines, Scheduling

Certifications & Badges

Education

Academic grounding in computer science, mathematics, and applied ML.

JNU, MCA, GATE CS, UGC NET

MCA, Computer Science and Applications

Jawaharlal Nehru University (2021–2023) · 7.65/9 CGPA

BSc, Computer Science

Central University of Rajasthan (2018–2021) · 7.74/10 CGPA

Qualified: GATE CS, UGC NET (CS)

Interests

Outside work: strategy, teaching, and building systems end-to-end.

Chess

Strategy, calculation, and calm decision-making.

Teaching

Explaining complex systems with clarity and structure.

Coding

Shipping end-to-end builds: API, retrieval, and UI.

  • I enjoy mentoring, reviewing designs, and improving team delivery quality.
  • I prefer systems that are observable, secure-by-default, and built for real users.
  • I like turning research ideas into working products with clear metrics.

Contact

Open to collaborations, product builds, and AI engineering roles.

© 2026 Aditya Mahakali. All rights reserved.