Restaurant365 is hiring a

Site Reliability Engineer

Full-Time
Remote
Restaurant365 is a SaaS company disrupting the restaurant industry! Our cloud-based platform provides a unique, centralized solution for accounting and back-office operations for restaurants. Restaurant365’s culture is focused on empowering team members to produce top-notch results while elevating their skills. We’re constantly evolving and improving to make sure we are and always will be “Best in Class” ... and we want that for you too!

The SRE will help support, enhance, and maintain our infrastructure and cloud services. Qualified candidates will demonstrate immediate technical aptitude, as well as a propensity for learning new tools and techniques quickly in a fast-paced environment. Excellent candidates will collaborate with the DevOps and development teams to help sustain a healthy, responsive system. The SRE team is the front line for supporting our system and developing a best-in-class monitoring platform. The candidate will propose enhancements for system health, performance, and reliability to deliver SaaS-based services for Restaurant365 customers.

How you'll add value:

  • Respond to production incidents and determine how we can prevent them in the future.
  • Triage and troubleshoot production issues to ensure reliability and performance.
  • Identify and automate manual processes.
  • Continuously evolve our monitoring tools and platform.
  • Promote and apply best practices for building scalable and reliable services across engineering.
  • Develop and maintain technical documentation/diagrams, runbooks, and procedures.
  • Provide “Always On” support for a 24x7 online environment by participating in an on-call rotation, responding to production incidents, and contributing to root cause analysis and problem management.
  • Automate public cloud environments using tools such as Terraform, Ansible, and CloudFormation.
  • Work within strict time frames following change management protocols to provide maximum uptime.
  • Implement, review, and adhere to security policies, and work with audit teams.
  • Research and remediate system vulnerabilities.
  • Interact and coordinate with architects, developers, vendors, and internal business partners.
  • Maintain documentation of all cloud infrastructure-related components.
  • Maintain a solid working knowledge of current infrastructure and future trends.
  • Other duties as assigned.

What you'll need to be successful in this role:

  • Extensive experience with SRE methodologies and processes. 
  • Automation expert with coding skills and a mindset to automate manual/repetitive tasks with PowerShell, Bash, Perl, PHP, or containers.
  • Extensive scripting experience with Terraform, YAML, Ansible, and Python.
  • Automation experience in public cloud environments, with a strong understanding of infrastructure as code. 
  • Experience in continuous deployment and lifecycle management using tools such as GitLab, Git, and Stash.
  • Linux engineering skills and working knowledge of Windows. 
  • Working experience with Nginx and Apache Tomcat. 
  • Azure or AWS: 2+ years of hands-on administration and automation of various Azure or AWS services (Azure AKS, Azure Functions, Azure Blob, AWS ECS, AWS EKS, Lambda, S3, ALB/ELB, etc.).
  • Experience with Windows and Linux. 
  • Ability to effectively prioritize and execute tasks in a high velocity environment. 
  • Minimum of 2 years of related experience with a bachelor's degree, or equivalent work experience.
  • Strong written, oral, and interpersonal communication skills.
  • AWS or Azure cloud certification is preferred. 
  • Preferred experience using: Jira, Prometheus, Grafana, ELK, Site24x7. Nagios is a bonus!

R365 Team Member Benefits & Compensation

  • This position has a salary range of $100K-$130K. The above range represents the expected salary range for this position. The actual salary may vary based upon several factors, including, but not limited to, relevant skills/experience, time in the role, business line, and geographic location. Restaurant365 focuses on equitable pay for our team and aims for transparency with our pay practices.
  • Comprehensive medical benefits, 100% paid for employee
  • 401k + matching
  • Equity Option Grant
  • Unlimited PTO + Company holidays
  • Wellness initiatives

R365 is an Equal Opportunity Employer and we encourage all forward-thinkers who embrace change and possess a positive attitude to apply.