Scheduling Reliability Engineer

A Career with Point72’s Technology Team

As Point72 reimagines the future of investing, our Technology team is constantly evolving our firm’s IT infrastructure and engineering capabilities, positioning us at the forefront of a rapidly changing technology landscape. We’re a team of experts who experiment and work to discover new ways to harness open-source solutions, modern cloud architectures, and sophisticated Artificial Intelligence (AI) solutions, while embracing enterprise agile methodologies. Our commitment to building and innovating in the AI space provides a framework to drive smarter decision making and enhance how we build and operate our platforms and applications.

 

What You'll Do

  • Serve as the Subject-Matter Expert (SME) for our enterprise scheduling platforms
  • Maintain, tune, and upgrade the scheduling environment to ensure stability and high availability
  • Develop and enhance automation solutions using PowerShell and other scripting languages to streamline workload orchestration
  • Build, configure, and refine monitoring dashboards, alerts, and reports to track system health, throughput, and performance
  • Lead incident response for high-priority scheduling failures, including troubleshooting, resolution, and root-cause analysis
  • Define, establish, and report on SLIs, SLOs, and SLAs for critical business workflows
  • Collaborate with cross-functional teams to onboard new workflows, optimize job dependencies, and implement best practices
  • Create and maintain comprehensive documentation, runbooks, and training materials for end users and support teams
  • Participate in a rotational on-call schedule to support 24/7 operations and critical incident management

 

What's Required

  • Bachelor’s degree in computer science, engineering, or a related field, or equivalent work experience
  • 5+ years of experience with enterprise scheduling or workload automation tools such as ActiveBatch, CA Workload Automation, Control-M, and/or AutoSys
  • 2+ years of experience in a site reliability, DevOps, or production support role with exposure to SLA management and SLI/SLO frameworks
  • Hands-on expertise with ActiveBatch or similar workload automation tools including job scheduling, calendars, dependencies, security, and versioning
  • Familiarity with Apache Airflow concepts, including DAG design, operators, executors, and deployment patterns
  • Strong scripting skills in PowerShell
  • Proven track record of troubleshooting complex, distributed workflows and performing root-cause analysis
  • Experience building and managing monitoring solutions using tools such as Splunk, Datadog, and/or Prometheus/Grafana
  • Ability to partner with application owners, business analysts, and infrastructure teams to drive continuous improvements
  • Excellent communication skills with the ability to translate technical concepts for non-technical stakeholders
  • Commitment to the highest ethical standards

 

 

We take care of our people

We invest in our people, their careers, their health, and their well-being. When you work here, we provide:

  • Fully paid health care benefits
  • Generous parental and family leave policies
  • Mental and physical wellness programs
  • Volunteer opportunities
  • Non-profit matching gift program
  • Support for employee-led affinity groups representing women, minorities, and the LGBTQ+ community
  • Tuition assistance
  • A 401(k) savings program with an employer match and more

 

About Point72

Point72 is a leading global alternative investment firm led by Steven A. Cohen. Building on more than 30 years of investing experience, Point72 seeks to deliver superior returns for its investors through fundamental and systematic investing strategies across asset classes and geographies. We aim to attract and retain the industry’s brightest talent by cultivating an investor-led culture and committing to our people’s long-term growth. For more information, visit https://point72.com/.

The annual base salary range for this role is $175,000-$250,000 (USD), which does not include discretionary bonus compensation or our comprehensive benefits package. Actual compensation offered to the successful candidate may vary from the posted hiring range based upon geographic location, work experience, education, and/or skill level, among other things.

 

 
