Voice AI Agent Developer

Role Overview

We are looking for a Voice AI Agent Developer to design, build, and optimize conversational voice AI agents. You will work on developing intelligent, natural-sounding voice experiences that solve real user problems. This role requires hands-on technical expertise combined with a strong understanding of conversational design principles.

Responsibilities

  • Design, develop, and deploy voice AI agents for production environments
  • Build and fine-tune voice pipelines including speech-to-text, natural language understanding, dialogue management, and text-to-speech components
  • Integrate voice agents with backend systems, APIs, and third-party services
  • Optimize for latency, accuracy, and natural conversation flow
  • Develop and maintain testing frameworks to ensure voice agent quality and reliability
  • Collaborate with product and design teams to define conversational user experiences
  • Monitor agent performance, analyze conversation logs, and implement improvements based on user interactions
  • Stay current with advancements in voice AI, LLMs, and conversational AI technologies

Requirements

  • 2+ years of industry experience in software development
  • Hands-on experience building and deploying voice AI agents or conversational AI systems
  • Proficiency in Python or similar programming languages
  • Experience with voice and speech technologies such as Speech-to-Text (Whisper, Deepgram, Google STT, AWS Transcribe); Text-to-Speech (ElevenLabs, PlayHT, Amazon Polly, Google TTS); and Voice AI platforms (Voiceflow, VAPI, Retell, Bland AI)
  • Working knowledge of LLMs and prompt engineering for conversational applications
  • Experience with dialogue management and conversation state handling
  • Familiarity with real-time audio streaming and WebSocket protocols
  • Strong debugging and problem-solving skills

Nice to have

  • Experience with telephony integrations (Twilio, SIP, VoIP)
  • Background in NLU/NLP techniques and intent classification
  • Experience fine-tuning or training speech models
  • Familiarity with cloud platforms (AWS, GCP, Azure)
  • Experience with RAG (Retrieval-Augmented Generation) for knowledge-grounded conversations
  • Understanding of conversation design best practices and VUI principles
  • Contributions to open-source voice or conversational AI projects

Technical Skills

  • Languages: Python, JavaScript/TypeScript
  • Voice Platforms: VAPI, Retell, Voiceflow, Bland AI, or similar
  • Speech Technologies: Whisper, Deepgram, ElevenLabs, PlayHT
  • LLM Frameworks: LangChain, OpenAI API, Anthropic API
  • Infrastructure: Docker, Kubernetes, cloud services
  • Databases: PostgreSQL, Redis, vector databases

Location and Interview Process

  • This is an on-site role based out of Bangalore.
  • The interview process will consist of four rounds:
  1. HR Screening Round: to understand your background, interests, and role fit.
  2. Backend Interview: focused on backend fundamentals, APIs, and system thinking.
  3. AI / Voice AI Round: a deep dive into voice pipelines, LLMs, and building real-time conversational agents.
  4. Final Round: a conversation with the CTO to assess alignment and expectations.

Careers at Abstrabit Technologies Pvt Ltd.
