We are seeking an experienced Senior Site Reliability Engineer (SRE) / DevOps Engineer to join our dynamic and innovative team. As a Senior SRE, you will play a key role in ensuring the reliability, availability, and performance of our systems and services. You will work closely with cross-functional teams to build and maintain a robust and scalable infrastructure while championing best practices for reliability, automation, performance optimization, monitoring, and alerting.
To Be Considered You'll Need:
- English fluency
- Your CV/resume submitted in English.
Key Responsibilities:
Incident Response: Lead and participate in incident response efforts, managing critical incidents to resolution, conducting post-incident analyses, and implementing preventive measures.
Performance Optimization: Identify and address performance bottlenecks, conduct load testing, and work with the team to optimize system performance to meet service-level objectives (SLOs).
Capacity Planning: Collaborate on capacity planning efforts, ensuring that systems can handle current and future growth, and participate in capacity forecasting and resource allocation.
Automation: Develop and maintain infrastructure as code (IaC) using tools like Terraform or CloudFormation, and automate routine operational tasks to improve efficiency and reduce manual intervention.
Monitoring and Alerting: Implement and enhance monitoring, alerting, and logging systems to proactively detect issues, conduct root cause analysis, and ensure system health.
Collaboration: Collaborate with development, operations, and other teams to bridge the gap between development and production environments, and promote a culture of reliability, automation, and collaboration that improves efficiency, delivery, and software quality.
Documentation: Maintain detailed documentation of systems, processes, and configurations, and contribute to knowledge sharing within the team.
Must-Have Skills:
- Proven experience in a similar DevOps or SRE role, with a strong focus on incident response, performance optimization, and automation.
- Proficiency in at least one programming language (e.g., Python, Go, Java) for scripting and automation tasks.
- Experience with cloud computing platforms (e.g., AWS, Azure, GCP), CI tooling such as Jenkins, and containerization technologies (e.g., Docker, Kubernetes).
- In-depth knowledge of infrastructure as code (IaC) principles and tools.
- Strong expertise in implementing and managing monitoring and alerting solutions (e.g., Prometheus, Grafana, ELK Stack).
- Excellent problem-solving and troubleshooting skills, with a deep understanding of system and network fundamentals.
- Experience with version control systems (e.g., Git) and continuous integration/continuous deployment (CI/CD) pipelines.
Desirable Skills:
- Relevant certifications (e.g., Microsoft Certified: DevOps Engineer Expert, AWS Certified DevOps Engineer).
- Familiarity with microservices architecture and service mesh technologies.
- Experience with configuration management tools (e.g., Ansible, Puppet, Chef).
- Knowledge of database administration and optimization.
- Knowledge of security best practices and experience with security tools and compliance.
- Strong communication skills and the ability to work collaboratively in a cross-functional environment.
- Prior experience mentoring or leading junior DevOps engineers or SRE team members.
Our benefits:
- Health plan and dental plan;
- Meal allowances;
- Childcare assistance;
- Extended parenting leave;
- Gympass;
- Annual profit-sharing distribution;
- Life insurance;
- Partnership with an online mental health platform;
- CI&T University;
- Discount Club;
- Support Program: financial and psychological guidance, nutritionist, and more;
- Pregnancy and responsible parenthood course;
- Partnership with online course platforms;
- Platform for language learning;
- And many others.
#LI-AG1
#MidSenior