Senior Site Reliability Engineer, Observability
Reston, VA or Remote
This position can be remote within the U.S.
Who we are...
ScienceLogic is going through a product transformation, and the Site Reliability team is at the forefront of it. We are responsible for the design, deployment, and maintenance of the cloud infrastructure that runs the company’s revenue-generating, go-forward SaaS product line.
ScienceLogic’s current SaaS product is a single-tenant, highly available, and secure platform that many customers use to achieve their AIOps objectives. Cloud Operations leads the SaaS portfolio from the front by onboarding new customers onto their own dedicated instance of the product and by handling capacity planning, platform maintenance, upgrades, security, and incident response triage for the SaaS platform.
Overall, we’re passionate about automation and solving complex business and technology challenges. Our team combines SRE, DevOps, Software Development, and Information Security knowledge to help make Cloud Operations agile and elastic within the boundaries of the security and governance framework. If you are well versed in cloud technologies, have an automation mindset, and are an ardent follower of the SRE discipline, then our team will benefit from your skill set!
What we're looking for...
We’re seeking an experienced Site Reliability Engineer who is passionate about building and owning modern monitoring and observability solutions at scale. You’ll play a key role in designing proactive monitoring strategies, defining SLIs/SLOs, automating detection and remediation, and improving platform reliability across our SaaS environment.
The ideal candidate is a hands-on engineer with strong cloud, automation, and scripting experience, deep familiarity with tools like Prometheus, AWS CloudWatch, and New Relic, and a collaborative mindset. You enjoy solving complex problems, mentoring others, and continuously improving systems before issues impact customers.
What you'll be doing...
Qualities you possess...
Benefits & Perks
Don’t meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they meet every single qualification. At ScienceLogic, we are dedicated to building a diverse, inclusive and authentic workplace, so if you’re excited about this role but your past experience doesn’t align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this or other roles.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or any other legally protected characteristic applicable in the location in which you are applying.
About ScienceLogic
ScienceLogic is a leader in IT Operations Management, providing modern IT operations with actionable insights to resolve and predict problems faster in a digital, ephemeral world. Its solution sees everything across cloud and distributed architectures, contextualizes data through relationship mapping, and acts on this insight through integration and automation.
All ScienceLogic employees have the responsibility to protect information assets, adhere to access controls, report suspicious activity, and comply with security and privacy policies.