Operations-Focused Engineer

REMOTE

About TTC

The Testing Consultancy (TTC) is a global specialist software testing company with a focus on helping organizations transform the way they deliver quality software. We have broad capabilities across a wide range of testing areas that enable our clients to increase the speed and quality of software development while reducing risk and cost.

Perks of working for TTC

  • Competitive Base Salary
  • Medical, Dental, Vision Benefits
  • 401(k) with company match
  • Paid Time Off
  • Paid Holidays
  • Work-Life Balance
  • Relaxed Work Environment
  • Growth and Development Opportunities

Role summary

We’re looking for an operations-focused engineer to join our team. This role owns the day-to-day reliability and operational excellence of a portfolio of business-critical third-party enterprise platforms and integrations, partnering closely with engineering and cross-functional infrastructure teams to keep systems healthy, scalable, and secure. 

Responsibilities

  • Serve in an on-call rotation and lead incident response for production issues: triage, mitigation, escalation, and restoration.
  • Drive operational excellence: improve alert quality, reduce toil, document runbooks, and create repeatable operational processes.
  • Perform root cause analysis for incidents and recurring issues; drive corrective and preventive actions to completion.
  • Execute and coordinate maintenance activities (upgrades, patching, configuration changes) with minimal risk and downtime.
  • Build and maintain monitoring, dashboards, and health checks to detect issues early and reduce mean time to recovery.
  • Automate routine operational workflows using scripts and small tools; improve reliability through safe incremental change.
  • Partner cross-functionally (security, networking, storage, compute, vendor/third-party partners) to resolve complex issues.
  • Maintain accurate system documentation, operational standards, and service ownership practices across supported platforms.

Minimum qualifications

  • 3+ years of experience in production operations, SRE, systems engineering, or production support for enterprise services.
  • Strong Linux/systems troubleshooting skills (processes, logs, performance, networking basics).
  • Experience participating in or leading on-call rotations and handling production incidents with clear communication.
  • Proficiency in scripting/automation (e.g., Python and/or shell) and comfort with change management / peer review workflows.
  • Strong written and verbal communication; able to write clear runbooks and incident summaries.

Preferred qualifications

  • Experience operating third-party enterprise platforms (integration middleware, identity/auth systems, web/app tiers, databases, batch/scheduled jobs).
  • Familiarity with vulnerability remediation and patch management practices in production environments.
  • Demonstrated track record of reducing operational toil and improving reliability metrics (MTTR, alert noise, incident recurrence).
  • Experience coordinating complex incidents across multiple teams and stakeholders.
  • Experience using Capirca for network provisioning, Chef for configuration management, and infrastructure as code and containers for deployment.

Success in the first 60–90 days

  • Ramp to primary on-call ownership for supported systems.
  • Demonstrate ability to independently troubleshoot common failure modes and follow operational playbooks.
  • Deliver at least 1–2 measurable reliability improvements (toil reduction, alert cleanup, monitoring gap closure, recurring issue fix).

Working style

  • Calm under pressure, structured problem-solver, prioritizes reliability and safety.
  • Proactive communicator who keeps stakeholders informed during incidents and planned work.
  • “Automate and document” mindset: reduces repeated manual work and makes operations scalable.

 

If your experience or qualifications are similar to those of our ideal candidate, please consider applying. Experience comes in many forms; skills can be transferred, but passion for your career can't be substituted. At TTC, we understand the importance of diversity and the value it brings to the table. Diversity fosters creativity and new perspectives, which is why we encourage everyone to apply.

Salary

$50,000 – $75,000 per year