Drive reliability and operational excellence across Toku’s AWS-based cloud infrastructure platform while leading a growing infrastructure team in both a strategic and an operational capacity.
At Toku, we create bespoke cloud communications and customer engagement solutions to reimagine customer experiences for enterprises. We provide an end-to-end approach to help businesses overcome the complexity of digital transformation and deliver mission-critical CX through cloud communication solutions. Toku combines local strategic consulting expertise, bespoke technology, regional in-country infrastructure, connectivity, and global reach to serve the diverse needs of enterprises operating at scale. Headquartered in Singapore, Toku supports customers across APAC and beyond, with a growing footprint across global markets.
This is a senior leadership role responsible for owning and scaling Toku’s cloud and infrastructure platform, with AWS at its core. You will drive reliability, security, scalability, and operational excellence across a globally distributed, mission-critical environment, while building strong processes and leading a growing infrastructure team. This role is both strategic and operational, requiring strong leadership combined with the ability to guide hands-on technical direction. You will be a great fit for this role if you can build structure, lead teams, and elevate infrastructure maturity in a fast-growing environment.
What you will be doing
Cloud architecture & platform strategy: Define and own the long-term cloud infrastructure strategy with AWS as the primary platform, ensuring scalability, resilience, and alignment with business growth.
Highly available systems design: Design and oversee fault-tolerant, multi-region, and multi-environment (production, staging, disaster recovery) architectures supporting mission-critical systems.
AWS platform ownership: Own and standardise AWS architecture, account structure, networking, IAM, and core infrastructure patterns across environments.
Infrastructure reliability & performance: Drive capacity planning, performance optimisation, and reliability engineering to support low-latency, high-throughput workloads.
Operational excellence & incident management: Own uptime, SLAs and SLOs, and lead incident response, root cause analysis, and continuous improvement of system resilience.
Security & compliance leadership: Ensure strong security posture across infrastructure, implementing security-by-design principles and supporting compliance initiatives such as ISO certifications.
Cloud cost optimisation (FinOps): Drive cost governance, budgeting, and forecasting, ensuring efficient resource utilisation without compromising reliability.
Engineering & services collaboration: Partner with Engineering and Services teams to enable reliable deployments, improve CI/CD pipelines, and support customer onboarding and production readiness.
Infrastructure modernisation: Lead adoption of containerisation, Infrastructure as Code, and automation-first practices across the platform.
Process & governance: Establish and improve infrastructure processes, standards, change management practices, and operational playbooks.
Documentation & operational readiness: Ensure runbooks, documentation, and operational procedures are maintained and consistently followed.
Leadership & team development: Lead, mentor, and scale infrastructure, cloud, and platform engineering teams, driving accountability and high performance.
Cross-functional influence: Act as a key stakeholder in architectural reviews and collaborate with senior leadership to shape infrastructure direction.
We’d love to hear from you if you have
Experience level: 10+ years of experience in infrastructure, cloud, or platform engineering roles, with demonstrated progression into leadership positions.
Team leadership: Proven experience managing and scaling infrastructure or cloud engineering teams, including mentoring and performance management.
AWS expertise (core requirement): Extensive hands-on experience with AWS and cloud-native architectures in production environments.
AWS services ownership: Deep ownership of AWS services including (but not limited to) EC2, ECS/EKS, ALB/NLB, VPC, IAM, S3, RDS, DynamoDB, CloudFront, Route 53, KMS, CloudWatch, and security services.
Scalable systems: Strong background in building and operating highly scalable, reliable, and secure production systems.
Infrastructure as Code & automation: Hands-on experience with Terraform or similar tools, and with driving automation-first infrastructure practices.
Containers & deployment models: Strong understanding of containerisation, orchestration, and modern deployment patterns.
Observability: Experience designing and operating monitoring, logging, alerting, and observability frameworks.
Security & compliance: Solid understanding of cloud security, IAM, and encryption, plus experience supporting compliance or audit processes (e.g., ISO 27001).
Engineering collaboration: Experience working closely with engineering and services teams to improve deployment quality and operational reliability.
Process-driven leadership: Proven ability to introduce and enforce engineering processes, standards, and operational discipline.
Cost optimisation: Experience managing cloud costs, budgeting, and FinOps practices.
Domain experience (preferred): Experience supporting large-scale, customer-facing SaaS or enterprise platforms.
Location: This role is based in Malaysia, with Kuala Lumpur (KL) preferred. It is mostly work-from-home for the time being, but may in future move to a hybrid model split between working from home and working from our KL Sentral office.
What you will get
Training and Development
Discretionary Yearly Bonus & Salary Review
Healthcare Coverage based on location
15 days Paid Annual Leave, plus other leave allowances
Toku has been recognised as a LinkedIn Top Startup and by the Financial Times as one of APAC’s Top 500 High Growth Companies. If you’re looking to be part of a company on a strong growth trajectory while working on meaningful, real-world challenges, we’d love to hear from you.