Plivo is a leading technology company transforming customer engagement for some of the world’s largest B2C brands, including Uber, WhatsApp, and Zomato. Our new product, the AI agents platform, automates the entire customer lifecycle, from acquiring and engaging to supporting customers, through cutting-edge multimodal AI, including LLMs, text-to-speech, and speech detection. With a 100+ member team based in India and the US, we are building high-impact global products that handle over 1 billion API requests per month. If you are excited about solving hard, real-world AI challenges at scale, this is where you belong.

Role Overview

This is a deep systems and multidisciplinary role that bridges real-time communications (RTC), VoIP infrastructure, backend systems, and AI model development. You’ll architect and build a distributed RTC platform across the globe, develop backend services, and integrate AI models into production-grade voice and multimodal experiences at scale. If you love low-latency systems, real-time voice engineering, and AI-driven innovation, this is where you belong.

What You’ll Do

Design and build real-time voice systems using WebRTC, SIP/RTP, and WebSocket streaming.

Engineer backend infrastructure for signaling, routing, call control, and audio/video processing.

Work with open-source RTC stacks -- FreeSWITCH, Kamailio, LiveKit, RTPEngine, and Pipecat.

Develop and integrate AI capabilities, including TTS (Text-to-Speech), STT (Speech-to-Text), VAD (Voice Activity Detection), media servers, and AI voice agents.

Integrate AI/ML models into real-time voice systems (speech recognition, synthesis, embeddings).

Build and scale a global, distributed RTC platform with strong resilience, observability, and low latency.

Work across the stack -- from C/Go/Rust real-time components to Python/Node.js backend services, and our SDKs.

Collaborate cross-functionally with Product and DevOps teams.

Instrument and monitor systems for quality, latency, and performance.

Prototype rapidly: build, test, iterate, and deploy new RTC + AI features.

Contribute to open-source voice and AI ecosystems.

Be hands-on: debug issues, tune queries, optimize performance, and improve resiliency. You own your code from dev to prod.

Don’t be afraid to jump on a call or chat with a customer to ensure they have a smooth experience -- you own the outcome, not just the code.

Use AI-assisted development tools to improve coding speed, testing, and code quality.

What You Bring

Strong foundation in systems programming -- C, Go, and/or Rust.

Experience in backend and real-time systems engineering.

Expertise in WebRTC, SIP, VoIP, and signaling/audio pipelines.

Hands-on experience with open-source RTC stacks: FreeSWITCH, Kamailio, LiveKit, RTPEngine, and Pipecat.

Understanding of media negotiation, codec pipelines, and audio/video streaming.

Knowledge of real-time networking (UDP/TCP, ICE, NAT traversal).

Experience building and scaling distributed RTC platforms.

Experience with AI voice systems -- TTS, STT, VAD, LLM voice agents, or speech embeddings.

Familiarity with AI/ML frameworks (PyTorch, TensorFlow, ONNX) or model integration.

Backend development experience in Python or Node.js.

Strong debugging, profiling, and performance optimization skills.

Builder mindset: proactive, curious, and able to thrive in complex systems.

Bonus Points

Contributions to open-source RTC or AI projects.

Familiarity with LLM integration and multimodal AI (voice + text).

Experience in edge computing or real-time streaming optimization.

Exposure to audio signal processing or DSP algorithms.

Experience deploying real-time systems on cloud (AWS/GCP) with Docker/Kubernetes.

Experience with AI voice agents or voicebots.

Software Development Engineer Voice
