Kraków, Poland

Full-Time

The Site Reliability Engineer is responsible for maturing ActiveCampaign’s SaaS platforms to improve reliability, performance, efficiency and scalability. This person will play a critical role in advocating for, designing, and implementing technology solutions to scale our cloud platform in support of a rapidly growing, geographically diverse customer base. The position plays an integral role in defining and assessing the organization's goals for delivering well-architected, well-implemented and well-managed infrastructure services that are highly performant, cost-efficient, scalable and resilient.

What your day could consist of:

Establishing engineering excellence in SRE by driving observability, scalability, high availability, reliability and sustainability of ActiveCampaign platforms.
Improving SRE efficiency by raising coding and deployment standards and by increasing automation and self-service capabilities.
Designing and implementing effective management solutions for globally distributed Kubernetes/cloud-native and monolith-based applications spread across AWS regions to ensure reliability, elasticity, performance and security.
Being available for incident troubleshooting and participating in on-call support rotations.

What is needed:

3+ years of IaC development, DevOps, and Site Reliability Engineering experience.
1+ years of experience working with an array of observability tools including Datadog, New Relic, Grafana, Loki, Prometheus and OpenTelemetry.
1+ years of public cloud experience, including AWS and Kubernetes cloud-native and open-source tools.
1+ years of experience with IaC and configuration management tooling such as Terraform, ArgoCD, Crossplane, Pulumi and Chef.
1+ years of experience architecting, implementing and operationalizing modern cloud-native platforms (Kubernetes, microservice containers and applications), including observability platforms for logging, service metrics, distributed tracing, APM, event correlation, alerting and notification.

Additional experience that is a plus:

Experience migrating monolith-based applications to Kubernetes.
Engineering expertise in managing, maturing and improving the performance, scalability and security of PHP, MySQL, Java and NoSQL application stacks.
Experience managing high-volume, low-latency and high-throughput services and related technologies such as ProxySQL, Memcached and Redis.
Self-motivation, a strong sense of ownership of tasks, and the ability to lead and mentor fellow SRE team members.
Shell scripting, Python, PHP and Java experience.
