About SmartNews
SmartNews is a leading global information and news discovery company dedicated to delivering quality information to the people who need it. Thanks to our unique machine-learning technology and relationships with more than 3,000 global publisher partners, we provide news that matters to millions of users.
Founded in 2012 in Tokyo, SmartNews also has offices in San Francisco, Palo Alto, New York, and Singapore.
If you share our vision and are passionate about our mission, we encourage you to apply!
The Team
The Core Systems Team is responsible for operating the core Kubernetes orchestration platform that hosts both stateless application services and stateful data platforms for the company's developer teams. The team provides a self-service continuous integration and delivery platform designed to increase deployment frequency and ensure change stability. It also offers a robust observability platform, delivering comprehensive solutions for monitoring, logging, and tracing to support seamless development and operations.
Your Role and Responsibilities
In this role, you will be responsible for building and maintaining mission-critical infrastructure that powers all of SmartNews' engineering operations. Your mission will be to:
- Drive reliability and scalability of our core Kubernetes platform, ensuring seamless operations for all engineering teams
- Champion and implement site reliability engineering practices across our infrastructure
- Design and maintain observability solutions that provide deep insights into our systems' health and performance
- Evolve our CI/CD platform to enable faster, more reliable deployments for development teams
- Collaborate with teams across the organization to understand their infrastructure needs and implement solutions that enhance developer productivity
This role offers a unique opportunity to work on large-scale infrastructure that directly impacts the productivity of all engineering teams and the stability of SmartNews' services globally!
Requirements
Minimum requirements
- Strong administration experience in container orchestration platforms, particularly Kubernetes (CKA certification preferred)
- Extensive experience with observability systems (monitoring, logging, and tracing) for cloud environments using CNCF tools (Grafana, Prometheus, Thanos, Jaeger)
- Proven expertise in maintaining self-service CI/CD platforms and implementing modern deployment practices (trunk-based development, GitOps, automated canary releases)
- Production-level experience with Golang
- Business-level communication skills in English
- Experience working in cross-functional, global engineering teams
Nice-to-have experience and skills
- AWS certification (Associate level or above) with deep knowledge of fundamental AWS services
- Experience with Infrastructure-as-Code tools (Terraform, CloudFormation)
- Active involvement in Site Reliability Engineering (SRE) practices and Cloud Native Computing Foundation (CNCF) technologies
- Familiarity with AWS Well-Architected Framework
- Strong understanding of distributed systems, microservices, asynchronous processing, and event-driven architectures
- Experience with multi-region infrastructure deployment and management
Working conditions
Visit our careers site for more info.
Benefits
- All healthcare and social insurance required by Japanese labor law, plus an annual health check
- Visa sponsorship and overseas relocation support available for eligible candidates
Visit our careers site for more info about our benefits.