OpenX is hiring a

Site Reliability Engineer II

Kraków, Poland
Contractor
We are a pioneer in cloud computing in Poland. With our recent migration to Google Cloud Platform (GCP), we have the largest cloud infrastructure footprint in Poland. Our scale is so large that Google is working to solve our problems.
We are seeking a Cloud SRE (Site Reliability Engineer) who will be primarily responsible for the performance, uptime, and growth of various OpenX systems and services on GCP. Much of your software development will focus on optimising cloud-native systems, orchestrating cloud infrastructure, and eliminating manual work through automation.
Excellent communication skills are crucial in this position, so that you can interact successfully with globally distributed OpenX teams operating 24x7.

Key Responsibilities:

  • Design, write and deliver software (e.g. using Terraform) to implement and support large web-scale, highly performant, highly available infrastructure on GCP/AWS
  • Monitor infrastructure, respond to incidents, correct and improve systems to prevent incidents, and plan capacity
  • Support system deployments and product releases
  • Tune large-scale clusters for optimal performance and efficiency
  • Work closely with engineering, project management, and operational peers to develop innovative technical tools and solutions
  • Participate in the on-call rotation

What you need to have to be successful:

  • At least 2 years of AWS/GCP experience
  • Bachelor’s degree in Computer Science or a related technical field involving systems engineering, or equivalent practical experience
  • Shell scripting experience
  • Experience in at least one programming language, such as Java, Python, or Go
  • Good English skills

Desirable Qualifications:

  • Expertise in designing, analyzing and troubleshooting large-scale distributed systems
  • Good understanding of public cloud services and tasks, such as: VPC; load balancing; relational and non-relational datastores (e.g., Google Cloud SQL, Memorystore, AWS RDS); storage (e.g., GCS, AWS S3); monitoring (e.g., GCP Stackdriver, AWS CloudWatch, Prometheus); serverless computing (e.g., GCF, AWS Lambda); and auto-scaling
  • Kubernetes/Docker/Containers experience
  • Ability to debug and optimize code and automate routine tasks
