Job Summary:
The DevOps Lead at CEQUENS oversees the development and operations team, ensuring seamless integration and continuous delivery of applications with a focus on automation, scalability, and infrastructure reliability. This role is pivotal in managing the deployment pipeline, from code development through all stages of production, optimizing processes, and maintaining system health. The DevOps Lead collaborates with software development, QA, and IT operations teams to enhance the deployment cycle and infrastructure management practices.
We are seeking a highly skilled and experienced DevOps Lead to spearhead our DevOps team. This individual will focus on Application Operations (AppOps), ensuring seamless deployment, availability, and performance of applications across environments. The ideal candidate is proficient in Kubernetes (K8s), CI/CD pipelines, and automation frameworks, and has a strong understanding of SLA management.
Main Areas of Responsibility:
1. Team Leadership:
- Lead, mentor, and manage a team of four DevOps engineers.
- Foster a culture of collaboration, innovation, and continuous improvement within the team.
2. AppOps & Operations Management:
- Own and enhance the AppOps lifecycle, including deployment, monitoring, troubleshooting, and incident response.
- Ensure adherence to the platform’s 99.99% SLA through proactive monitoring and performance optimization.
- Define and enforce SLOs, SLIs, and operational metrics.
3. Automation & CI/CD:
- Design, implement, and optimize CI/CD pipelines to ensure smooth and fast delivery.
- Automate repetitive tasks to improve efficiency and reduce operational overhead.
4. Infrastructure Management:
- Manage and optimize Kubernetes clusters for scalability, reliability, and cost-efficiency.
- Collaborate with architecture and engineering teams to ensure infrastructure aligns with application requirements.
5. Security & Compliance:
- Implement and maintain security best practices in all DevOps processes.
- Conduct regular compliance checks and audits as per company policies.
6. Incident Management & RCA:
- Lead incident response efforts, including root cause analysis (RCA) and post-incident reviews.
- Develop playbooks and disaster recovery plans to ensure swift recovery during outages.
Requirements
Qualifications and Skills:
- Experience: 6+ years in DevOps roles, with at least 2 years in a leadership capacity.
- Technical Expertise:
  - Strong hands-on experience with Kubernetes and container orchestration.
  - Proficiency in building and managing CI/CD pipelines (e.g., Jenkins, GitHub Actions, GitLab CI/CD).
  - Expertise in scripting and automation tools (e.g., Python, Bash, Terraform, Ansible).
  - Solid understanding of SLA management, monitoring tools (e.g., Prometheus, Grafana, New Relic), and incident management practices.
  - Experience with cloud platforms (e.g., AWS, GCP, Azure).
- Leadership Skills: Proven ability to lead teams, resolve conflicts, and drive results.
- Problem-Solving: Analytical mindset with a proactive approach to resolving complex issues.
Preferred Skills and Certifications:
- Certifications in Kubernetes (CKA/CKAD) or cloud platforms.
- Knowledge of AppOps-specific tools and practices (e.g., Helm, canary deployments, Ansible, Terraform …).
- Familiarity with SRE practices, including error budgets and chaos engineering.