Open to: AI Engineer  ·  Gen AI Engineer  ·  Prompt Engineer

Karthik Vadla

I build production-ready LLM systems — from RAG pipelines to NL2SQL backends — that solve real problems, ship clean APIs, and actually work at scale.

2 Production AI systems
80% Accuracy on NL2SQL
<2s RAG retrieval latency
1K+ Document chunks indexed
Hyderabad, India
+91 8106910205
vadlakarthik9876@gmail.com
B.Tech CS — 2024

I'm a Generative AI Engineer with hands-on experience shipping LLM-powered backends — not just running notebooks. My work spans RAG architectures, NL2SQL systems, semantic search pipelines, and production-grade FastAPI applications built around real reliability constraints.

I'm drawn to the hard parts: getting retrieval latency under 2 seconds, bringing NL2SQL accuracy to 80% on structured test cases, and making LLM outputs trustworthy. I use Python, FastAPI, LangChain, Pinecone, and the major LLM APIs (Gemini, OpenAI, Groq) daily, and I understand the full stack from vector embeddings to REST contract design.

Backed by certifications in Prompt Engineering and Gen AI from Cognitive Class and Tata. Looking for a team where I can go deeper on LLM system design and own real production impact.

Education
B.Tech — Computer Science
Kasireddy Narayana Reddy College of Engineering & Research
2020 – 2024  ·  CGPA 7.34
Certifications
Google Cloud Skills Boost  ·  Aug 2025
Tata Group  ·  Sep 2025
Cognitive Class  ·  2026
Cognitive Class  ·  Feb 2025
Hugging Face  ·  Feb 2025
LLM / Gen AI Stack
LangChain RAG Prompt Engineering Embeddings Vector Search Agentic AI NL2SQL Groq
LLM APIs & Models
Google Gemini OpenAI GPT Claude API Hugging Face Sentence Transformers Vanna 2.0
Languages & Frameworks
Python FastAPI SQL JavaScript RESTful API Design Gradio
ML / Data
TensorFlow PyTorch Scikit-Learn Pandas NumPy Plotly
Infrastructure & Tools
Pinecone SQLite AWS Git / GitHub CI/CD Jupyter VS Code
Concepts
Neural Networks NLP Supervised Learning SDLC Agile Secure Coding
Q&A Retrieval-Augmented Generation (RAG) System
Python FastAPI Pinecone Sentence Transformers Groq LLM Gradio
Problem
Keyword-based search fails on large document sets — users get no answers, or wrong ones. The challenge: build semantic retrieval that's fast, accurate, and cites its sources.
Approach
Designed a modular RAG pipeline on FastAPI + Pinecone, indexing 1,000+ document chunks (300–500 tokens each). Implemented top-3 semantic retrieval with Sentence Transformers embeddings, attaching a confidence score and source attribution to each hit before passing the retrieved context to a Groq-hosted LLM for generation.
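A minimal sketch of the top-3 retrieval step described above. It is not the project's code: the hand-written 3-dimensional vectors stand in for Sentence Transformers embeddings, the in-memory list stands in for a Pinecone index, and the `Chunk` class, the `retrieve` function, and the source labels are hypothetical names chosen for illustration. Cosine similarity is used as the confidence score here, which is an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str          # hypothetical label used for source attribution
    vector: list[float]  # toy embedding; a real pipeline would use a model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, k=3, min_score=0.0):
    """Return the k most similar chunks as (score, source, text) tuples."""
    scored = [(cosine(query_vec, c.vector), c.source, c.text) for c in chunks]
    # Drop hits below the confidence floor, then rank best-first.
    scored = [s for s in scored if s[0] >= min_score]
    scored.sort(key=lambda s: s[0], reverse=True)
    return scored[:k]


chunks = [
    Chunk("Refunds are processed within 5 days.", "policy.pdf#p2", [1.0, 0.0, 0.0]),
    Chunk("Shipping takes 3 to 7 days.", "faq.md#shipping", [0.0, 1.0, 0.0]),
    Chunk("Refund requests need an order ID.", "policy.pdf#p3", [0.9, 0.1, 0.0]),
    Chunk("Office hours are 9 to 5.", "contact.md", [0.0, 0.0, 1.0]),
]

# Query vector pointing in the "refund" direction of the toy space.
hits = retrieve([1.0, 0.0, 0.0], chunks, k=3)
for score, source, text in hits:
    print(f"{score:.2f}  {source}  {text}")
```

Raising `min_score` above zero would filter out low-confidence hits before they reach the generation prompt, at the cost of sometimes returning fewer than three chunks.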
Results
<2s semantic retrieval latency
1,000+ document chunks indexed
25 test queries with validated confidence scoring
AI-Powered NL2SQL System
Python FastAPI Vanna 2.0 Gemini API SQLite Plotly
Problem
Non-technical users can't query databases directly. Writing SQL requires expertise; dashboards are rigid. The goal: let anyone ask questions in plain English and get real data back instantly.
Approach
Built FastAPI endpoints that pipe natural language through Vanna 2.0 + Gemini API to generate SQL supporting filters, aggregations, and joins. Added SQL validation, structured output formatting, and Plotly-based visualization for immediate insight delivery on a 200+ record SQLite database.
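One way the SQL validation step could look, sketched with the standard-library `sqlite3` module only. The `validate_sql` function, the keyword list, and the `orders` table are assumptions for illustration and not the project's actual checks. The guard accepts a single read-only statement and asks SQLite to compile it with `EXPLAIN`, which surfaces syntax and schema errors without running the query.

```python
import re
import sqlite3

# Crude keyword filter (an assumption); it will also reject harmless
# queries whose string literals happen to contain one of these words.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|attach|pragma)\b", re.IGNORECASE
)


def validate_sql(conn: sqlite3.Connection, sql: str) -> tuple[bool, str]:
    """Accept a single read-only SELECT that SQLite can compile."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False, "multiple statements are not allowed"
    if not statement.lower().startswith(("select", "with")):
        return False, "only SELECT queries are allowed"
    if FORBIDDEN.search(statement):
        return False, "query contains a write or schema keyword"
    try:
        # EXPLAIN compiles the statement without executing it.
        conn.execute(f"EXPLAIN {statement}")
    except sqlite3.Error as exc:
        return False, f"invalid SQL: {exc}"
    return True, "ok"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")

ok_result = validate_sql(conn, "SELECT region, SUM(amount) FROM orders GROUP BY region;")
bad_write = validate_sql(conn, "DROP TABLE orders")
bad_table = validate_sql(conn, "SELECT * FROM missing_table")
print(ok_result, bad_write, bad_table, sep="\n")
```

Running generated SQL through a check like this before execution keeps model mistakes from reaching the database, and the returned message can be fed back to the model for a retry.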
Results
80% query accuracy across 20 structured test cases
200+ records queryable via plain English
5 query types supported, including filters, aggregations, and joins
AI-Driven Story & Script Generation System
Python Google Gemini API LLM Orchestration Prompt Engineering
Problem
Generic LLM outputs lack narrative structure and character consistency across scenes. Writers need tools that maintain memory, genre context, and story arc — not just one-shot text generation.
Approach
Engineered a genre-aware generation system using Gemini API with custom prompting frameworks for sci-fi, horror, and fantasy. Built scene-to-scene chaining with character memory modules, enabling iterative editing and branching narrative paths via a clean Gradio UI.
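Scene-to-scene chaining with character memory can be sketched as a small state object that rebuilds the prompt for every new scene. Everything named here (`StoryState`, `remember`, `build_prompt`, `fake_llm`) is hypothetical, and `fake_llm` is an offline placeholder rather than a Gemini API call.

```python
from dataclasses import dataclass, field


@dataclass
class StoryState:
    genre: str
    characters: dict[str, list[str]] = field(default_factory=dict)
    scenes: list[str] = field(default_factory=list)

    def remember(self, name: str, fact: str) -> None:
        """Store a fact that later scenes must stay consistent with."""
        self.characters.setdefault(name, []).append(fact)

    def build_prompt(self, instruction: str) -> str:
        """Combine genre, character memory, and the previous scene."""
        memory = "\n".join(
            f"- {name}: {'; '.join(facts)}" for name, facts in self.characters.items()
        )
        previous = self.scenes[-1] if self.scenes else "(none, this is the opening scene)"
        return (
            f"Genre: {self.genre}\n"
            f"Character memory:\n{memory}\n"
            f"Previous scene:\n{previous}\n"
            f"Next scene instruction: {instruction}"
        )


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a fixed-format placeholder."""
    return f"[scene generated from a {len(prompt)}-character prompt]"


state = StoryState(genre="sci-fi")
state.remember("Mira", "ship engineer")
state.remember("Mira", "afraid of open space")

first = fake_llm(state.build_prompt("Mira discovers a hull breach."))
state.scenes.append(first)

# The second prompt carries both the character facts and the first scene.
second_prompt = state.build_prompt("Mira must go outside to fix it.")
print(second_prompt)
```

Branching narrative paths would follow from copying the state before generating each alternative scene, so that every branch keeps its own scene history.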
Key Outcomes
Demonstrated multi-provider LLM integration (OpenAI, Gemini, Claude) behind abstracted API contracts, so the system is not tied to a single vendor. Delivered a working demo pipeline for downstream visual storytelling workflows.
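A provider-agnostic contract of the kind mentioned above might look like the following sketch. The `TextGenerator` protocol and both provider classes are illustrative assumptions; a real adapter would wrap the OpenAI, Gemini, or Claude SDK inside `generate`. The fallback loop is one possible use of the abstraction and is not stated in the project description.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """The only contract calling code depends on."""

    def generate(self, prompt: str) -> str: ...


class EchoProvider:
    """Offline stand-in; a real adapter would call a vendor SDK here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def generate(self, prompt: str) -> str:
        return f"{self.name}: {prompt}"


class FailingProvider:
    """Simulates a vendor outage."""

    def generate(self, prompt: str) -> str:
        raise RuntimeError("provider unavailable")


def generate_with_fallback(providers: list[TextGenerator], prompt: str) -> str:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider.generate(prompt)
        except Exception as exc:  # collect the error and try the next vendor
            errors.append(exc)
    raise RuntimeError(f"all providers failed: {errors}")


result = generate_with_fallback(
    [FailingProvider(), EchoProvider("backup")], "Open on a storm."
)
print(result)
```

Because the calling code only sees `generate`, swapping or reordering vendors is a change to the provider list rather than to the story pipeline.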
Communication
Quick Email Fixer
Rewrites any draft email to be concise, clear, and action-oriented — under 100 words.
Act as an experienced professional email copywriter with a strong background in business communication, clarity, and tone optimization.
Context: I will provide you with a draft email that may be unclear, too long, awkwardly phrased, or lacking a strong call-to-action.
Task: Rewrite the email to improve clarity, tone, and effectiveness while keeping the original intent intact.
Instructions:
- Reduce unnecessary words and remove redundancy
- Keep the total length under 100 words unless absolutely necessary
- Maintain a professional but friendly and human tone
- Ensure logical flow from opening to closing
- Add or improve a clear call-to-action
- Fix grammar, punctuation, and structure issues
Output Format:
1. Subject line (if missing, create one)
2. Rewritten email
Email: [PASTE EMAIL HERE]
Ideation
Idea Expander
Turns one vague or rough idea into three distinct, practical, well-structured concepts.
Act as a creative strategist skilled in turning rough ideas into clear, usable concepts.
Context: I will provide a vague or early-stage idea that lacks structure or direction.
Task: Expand this into three distinct, practical concepts.
Instructions:
- Ensure each concept is meaningfully different
- Keep each concept 1–2 sentences
- Focus on clarity and usefulness
- Avoid abstract or overly complex phrasing
- Stay true to the core idea while improving it
Style: Clear, grounded, and easy to understand.
Output Format:
- Concept 1:
- Concept 2:
- Concept 3:
Idea: [PASTE IDEA HERE]
Engineering
Code Simplifier
Refactors working but messy code into clean, readable, maintainable code — preserving all behavior.
Act as a senior software engineer with expertise in clean, maintainable code.
Context: I will provide code that works but may be overly complex or hard to read.
Task: Simplify the code without changing its functionality.
Instructions:
- Preserve exact behavior and output
- Reduce complexity and redundancy
- Improve naming if needed
- Follow best practices for the language
- Avoid unnecessary abstractions
- Keep it readable and maintainable
Additionally:
- Briefly explain what was improved
Output Format:
1. Simplified code
2. Short explanation (2–4 bullet points)
Code: [PASTE CODE HERE]
Data Analysis
Data Insight Finder
Extracts the top 3 decision-relevant insights from any dataset in plain English — no jargon.
Act as a data analyst who explains insights clearly to non-technical audiences.
Context: I will provide data that needs interpretation.
Task: Identify the most important insights.
Instructions:
- Extract the top 3 insights
- Use plain English (no jargon)
- Focus on meaningful patterns or trends
- Avoid obvious observations
- Prioritize decision-relevant insights
Style: Simple, clear, and direct.
Output Format:
- Insight 1:
- Insight 2:
- Insight 3:
Data: [PASTE DATA HERE]
Writing
Content Rewriter
Rewrites any awkward or robotic paragraph to sound natural, smooth, and human — same meaning, better flow.
Act as a skilled editor focused on clarity, tone, and readability.
Context: I will provide a paragraph that may sound awkward or unnatural.
Task: Rewrite it to sound more human and smooth while keeping the meaning unchanged.
Instructions:
- Preserve the original meaning exactly
- Improve flow and sentence structure
- Replace awkward phrasing
- Keep tone conversational but polished
- Maintain similar length (±10–15%)
- Do not add or remove key ideas
Style: Natural and human, not robotic.
Output Format:
- Rewritten version only
Text: [PASTE TEXT HERE]

Let's build something that ships.

I'm actively looking for Generative AI Engineer roles where I can own real systems — not just demos. If you're working on LLM infrastructure, RAG pipelines, or AI products, I'd love to talk.

vadlakarthik9876@gmail.com