Toronto, Canada

The Company

At Nylas, we specialize in making it easier for developers to integrate email, calendar, and contact management features into their applications. We provide APIs that streamline the integration of these features while keeping them secure and effective. This enables better, safer, and more reliable communication within apps.

Supporting over 100,000 developers and collaborating with more than 900 companies globally, Nylas plays a pivotal role in how digital communication tools are built and utilized. Our technology spans various sectors, from healthcare to education, simplifying the complex process of app development related to communications. By reducing the barriers in communication technology, we empower developers to innovate and enhance user interaction across platforms.

The Role

Our SRE team is responsible for ensuring our products run reliably and efficiently. We manage infrastructure that serves billions of API calls every day. We are responsible for our overall SLA uptime and for Cost of Goods Sold (COGS) relative to cloud compute spend.

What You’ll Do

Support our engineering team with best practices and provision new infrastructure as necessary.
Maintain and scale a legacy system in AWS with Ansible, Python, MySQL, and Terraform.
Maintain our new infrastructure in GCP with Kubernetes, Helm, ArgoCD, Terraform, GoLang, OpenSearch, Spanner, and Redis.
Configure and tune alerts and dashboards in New Relic and Coralogix, leveraging Fluent Bit and OpenTelemetry.
Manage and improve our CI/CD pipelines using ArgoCD and Helm.
Take part in an on-call rotation and assist in debugging and resolving incidents.

What You Must Bring

Experience: Minimum of 5 years in production engineering, with hands-on experience in managing and scaling Linux-based production servers.
Communication and Empathy: Exceptional communication skills and a strong empathetic approach, understanding that effective teamwork and problem-solving require more than just technical skills.
Linux Proficiency: Advanced proficiency in navigating the Linux command line.
Logging and Observability: Demonstrated experience with platforms like New Relic, Coralogix, Grafana, and Prometheus. Candidates with expertise in tuning alerts, synthetics, and creating comprehensive health dashboards and reports will be preferred.
Configuration Management: Experience in automating systems using modern tools such as Chef, Ansible, or Puppet.
Containerization and Orchestration: Proven track record of deploying and managing services using Kubernetes and Docker.
Cloud Services: Practical experience with major cloud services like AWS, GCP, or Azure, focusing on deploying and maintaining scalable applications.
Programming Skills: Capability to write reliable code in at least one programming language such as Python, GoLang, or JavaScript. Note: While coding is part of the role, it will not be the central focus of our interview process.
Learning Agility: Ability to rapidly learn and adapt to new technologies and frameworks.
Automation and Infrastructure: Passion for building modern, scalable infrastructure and automating routine tasks to improve efficiency and reliability.

Perks/Benefits

Healthcare: Extended healthcare coverage for you and your family
Unlimited Paid Time Off (PTO): We take this very seriously as we care about the well-being of our employees
RRSP with 3% employer contribution
Education Stipend: $1,250 annual education & development benefit
Cell Phone: $60 per month stipend towards cell phone reimbursement
Fully Paid Parental Leave: 12 weeks of parental leave (maternity & paternity)

Interview Process

Round 1: 30-minute phone call with the Recruiter
Round 2: 60-minute Google Meet discussion with the Hiring Manager
Round 3: Three (3) Google Meet discussions with various Nylas leaders, including a live coding assignment with a team member (max 3 hours)

During the various discussions, candidates selected to meet with us are strongly encouraged to not only discuss their knowledge, skills, experience, and abilities but also to showcase examples of their current or previous work. We expect you to clearly outline the "what," "why," and "how" behind your contributions.

The estimated base salary range for this position is $125,000 to $150,000. Actual compensation will be determined based on individual qualifications, which are objectively assessed during the interview process. Factors influencing salary include knowledge, skills, experience, and abilities.
