- Experienced L3 SRE supporting a business-critical SaaS application.
- Capacity to provide L3 support across the full stack, including infrastructure, backend, and front-end, before escalating to the engineering business unit.
- Capacity to automate SRE tooling to provide proactive L3 support, in line with our technical monitoring strategy.
- Capacity to work under pressure on business-critical applications.
- Capacity to communicate appropriately with L1, L2, Engineering, Product Managers, leadership, and end users during troubleshooting.
Must-have skills: Kubernetes (Expert), GitHub Actions, Terraform (Expert), and AWS.
- Capacity to communicate appropriately.
- Experience with incident and problem management.
- Experience with multi-tenant applications.
- Solid understanding of networking concepts (TCP/IP, DNS, routing, etc.), including VPCs, subnets, firewalls, load balancing, and TLS/SSL.
- Experience with CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
- Python, React/Next.js.
- Monitoring and logging to track and analyze resource utilization and application performance and to identify potential issues, using Grafana, Prometheus, Loki, or ELK.
- Experience with AWS, particularly EKS, serverless, queues, and various databases.
- Solid knowledge of Kubernetes.