Voice AI Agent Developer

TLDR

Develop voice AI agents that provide intelligent, natural-sounding experiences by integrating with backend systems and optimizing for performance and reliability.

Role Overview

We are looking for a Voice AI Agent Developer to design, build, and optimize conversational voice AI agents. You will develop intelligent, natural-sounding voice experiences that solve real user problems. This role requires hands-on technical expertise and a strong understanding of conversational design principles.

Responsibilities

  • Design, develop, and deploy voice AI agents for production environments
  • Build and fine-tune voice pipelines including speech-to-text, natural language understanding, dialogue management, and text-to-speech components
  • Integrate voice agents with backend systems, APIs, and third-party services
  • Optimize for latency, accuracy, and natural conversation flow
  • Develop and maintain testing frameworks to ensure voice agent quality and reliability
  • Collaborate with product and design teams to define conversational user experiences
  • Monitor agent performance, analyze conversation logs, and implement improvements based on user interactions
  • Stay current with advancements in voice AI, LLMs, and conversational AI technologies

Requirements

  • 2+ years of industry experience in software development
  • Hands-on experience building and deploying voice AI agents or conversational AI systems
  • Proficiency in Python or similar programming languages
  • Experience with voice and speech technologies, such as speech-to-text (Whisper, Deepgram, Google STT, AWS Transcribe), text-to-speech (ElevenLabs, PlayHT, Amazon Polly, Google TTS), and voice AI platforms (Voiceflow, VAPI, Retell, Bland AI)
  • Working knowledge of LLMs and prompt engineering for conversational applications
  • Experience with dialogue management and conversation state handling
  • Familiarity with real-time audio streaming and WebSocket protocols
  • Strong debugging and problem-solving skills

Nice to have

  • Experience with telephony integrations (Twilio, SIP, VoIP)
  • Background in NLU/NLP techniques and intent classification
  • Experience fine-tuning or training speech models
  • Familiarity with cloud platforms (AWS, GCP, Azure)
  • Experience with RAG (Retrieval-Augmented Generation) for knowledge-grounded conversations
  • Understanding of conversation design best practices and VUI principles
  • Contributions to open-source voice or conversational AI projects

Technical Skills

  • Languages: Python, JavaScript/TypeScript
  • Voice Platforms: VAPI, Retell, Voiceflow, Bland AI, or similar
  • Speech Technologies: Whisper, Deepgram, ElevenLabs, PlayHT
  • LLM Frameworks: LangChain, OpenAI API, Anthropic API
  • Infrastructure: Docker, Kubernetes, cloud services
  • Databases: PostgreSQL, Redis, vector databases

Location and Interview Process

This is an on-site role based out of Bangalore.

The interview process consists of four rounds:

  1. HR Screening Round – to understand your background, interests, and role fit.
  2. Backend Interview – focused on backend fundamentals, APIs, and system thinking.
  3. AI / Voice AI Round – a deep dive into voice pipelines, LLMs, and building real-time conversational agents.
  4. Final Round – a conversation with the CTO to assess alignment and expectations.

Abstrabit Technologies builds custom SaaS and AI-enabled systems designed to streamline and automate operations for high-growth businesses. We work with Series A founders, operations leaders, and tech teams to eliminate manual processes and technical debt, providing robust solutions that scale as businesses grow. Our focus on integrating AI and automating complex workflows sets us apart.

This job is no longer available