At Toku, we create bespoke cloud communications and customer engagement solutions to reimagine customer experiences for enterprises. We provide an end-to-end approach to help businesses overcome the complexity of digital transformation and deliver mission-critical CX through cloud communication solutions. Toku combines local strategic consulting expertise, bespoke technology, regional in-country infrastructure, connectivity, and global reach to serve the diverse needs of enterprises operating at scale. Headquartered in Singapore, Toku supports customers across APAC and beyond, with a growing footprint across global markets.
As a Founding AI Engineer, you will lead the development of our speech recognition capabilities, including contributing to open-source models optimised for APAC languages and telephony environments. You will own the entire machine learning pipeline, from model architecture through to deployment and publication on Hugging Face and GitHub. This is a unique opportunity to build technology that will serve billions of people across the Asia-Pacific region and beyond.
What you will be doing
Model Development & Training
Design and implement telephony-optimised speech recognition models for APAC languages (English variants, Mandarin, Thai, Vietnamese, Indonesian, and more)
Develop comprehensive AI model training frameworks using PyTorch on local and cloud GPU infrastructure
Create and optimise data augmentation pipelines addressing telephony-specific challenges (8kHz audio, codec artefacts, background noise, SNR optimisation)
Build models that handle code-switching common in APAC contexts (Singlish, Hinglish, Taglish)
APAC-Specific Optimisation
Address tonal language challenges for Mandarin, Thai, Vietnamese, and other tonal languages
Optimise for regional accent variations across target markets
Develop evaluation benchmarks specific to APAC telephony contexts, including SNR and audio quality metrics
Implement techniques for low-resource language support
Infrastructure & Deployment
Build scalable inference systems for real-time and batch processing
Create containerised applications for model demonstration and testing
Develop APIs for integration with telephony systems
Deploy models on local and cloud GPU infrastructure
Integrate with Toku's existing Llama 8B deployment for language model capabilities
Open-Source Contribution (Future)
Contribute to the preparation of open-source releases
Write comprehensive technical documentation and user guides
Conduct performance benchmarking and validation studies
Contribute to the broader speech recognition community through publications and presentations
We’d love to hear from you if you have
Required Qualifications
Bachelor's or Master's degree in Computer Science, Engineering, or related technical field with strong ML foundations
1-3 years of hands-on experience in machine learning projects
Excellent Python programming skills
Experience with PyTorch and deep learning model training
Proficiency in handling large datasets and data preprocessing
Understanding of speech processing concepts and techniques
Experience with cloud platforms and GPU computing
Familiarity with containerisation (Docker) and deployment practices
Preferred Qualifications
Portfolio of AI projects (open-source contributions highly valued)
Familiarity with OpenAI Whisper and transformer-based architectures
Previous experience with speech-to-text or audio processing projects
Experience with open-source project development and collaboration
Strong technical writing and documentation skills
Familiarity with at least one APAC language's phonological characteristics
Understanding of telephony audio characteristics (8kHz sampling, codec artefacts, SNR considerations)
Publication history in speech recognition or related fields
Personal Attributes
Independent and ownership-driven: ability to take projects from conception to completion
Growth-oriented: enthusiasm for learning new technologies
Quality-focused: commitment to robust, well-documented code
Strong communication and presentation skills
Location:
This is a remote/hybrid role based in Singapore, Hong Kong, or the Netherlands (Rotterdam preferred).
Why join Toku?
Mission-Driven Impact: Contribute to democratising speech AI for APAC's diverse linguistic landscape
Open-Source Leadership: Build your reputation through contributions to bespoke model development
Technical Growth: Work with experienced engineers on state-of-the-art speech AI technologies
Regional Expertise: Become a specialist in an underserved but massive market
Autonomy: Take ownership of significant technical challenges with support to succeed
Benefits and Perks: Training and development, annual bonus and salary review, healthcare coverage based on location, 20 days of paid annual leave plus other leave allowances, and more
Toku has been recognised as a LinkedIn Top Startup and by the Financial Times as one of APAC’s Top 500 High Growth Companies. If you’re looking to be part of a company on a strong growth trajectory while working on meaningful, real-world challenges, we’d love to hear from you.