System Reliability Engineer, Infrastructure R&D

TLDR

Deploy and manage physical and virtual infrastructure for R&D teams, ensuring reliability, scalability, and fault tolerance while supporting Azure DevOps Server.

Veeam is the Data and AI Trust Company, specializing in helping organizations ensure their data and AI are fully understood, secured, and resilient to enable the acceleration of safe AI at scale. As the market leader in both data resilience and data security posture management, Veeam is built for the convergence of identity, data, security, and AI risk. Headquartered in Seattle with offices in more than 30 countries, Veeam protects over 550,000 customers worldwide, who trust Veeam to keep their businesses running. Join us as we go fearlessly forward together, growing, learning, and making a real impact for some of the world’s biggest brands.

About the Role

R&D Infrastructure is a dedicated, highly isolated environment designed to support the R&D department’s unique needs. Our team manages a wide range of production and lab equipment, ensuring reliable operation, scalability, and fault tolerance. We provide full-cycle support for Azure DevOps Server, including project creation, workflow tuning, custom permissions, backup, and migration.

With a large and diverse fleet of build servers—including rare hardware—we select, deploy, and balance equipment for specialized tasks, maintaining close collaboration with R&D teams. We also manage shared storage solutions for build artifacts, enabling efficient replication and seamless load balancing across locations.

This role is remote within the US, but we are prioritizing candidates based near the Atlanta area, as occasional onsite presence may be required.

What You’ll Do

  • Deploy and manage physical and virtual infrastructure for R&D teams, from bare-metal server setup to high-density, heterogeneous virtualized clusters
  • Be available for periodic on-site visits to data centers to support physical hardware deployment, maintenance, and issue resolution
  • Administer and support Azure DevOps Server (On-Premises and Cloud) for source code version control
  • Assist R&D teams with troubleshooting and optimizing build processes
  • Diagnose and resolve performance issues in high-utilization virtualization clusters and storage systems
  • Design optimized, purpose-specific server and storage hardware configurations in collaboration with procurement teams
  • Investigate and resolve issues reported by R&D teams and automated monitoring tools through thorough root cause analysis
  • Contribute to the design and implementation of disaster recovery strategies
  • Maintain and enhance internal documentation
  • Identify and implement opportunities for process automation and efficiency improvements

What You’ll Bring

  • Self-sufficient, proactive, and results-oriented
  • Strong verbal and written communication skills, with the ability to explain complex topics to audiences with varying levels of technical expertise
  • 5+ years of experience administering and troubleshooting Active Directory, Hyper-V, SQL Server, and VMware vSphere products
  • 3+ years of experience designing, implementing, and troubleshooting sophisticated, highly utilized virtualization clusters built on shared storage and complex network topology
  • 3+ years of experience administering Azure DevOps Server (Microsoft Team Foundation Server), including data migration between different platform versions
  • Experience administering Microsoft Azure
  • Experience writing advanced PowerShell scripts, including those that use third-party modules
  • Experience configuring monitoring systems from scratch, with a focus on optimizing triggers and alerts
  • Deep knowledge of the OSI model and network traffic virtualization
  • Because this position deals with highly sensitive data and supports federal customers, we are only considering US citizens at this time. Security clearance is not required, but there is a slight chance it may be requested in the future.

Bonus Skills

  • Familiarity with *nix systems such as Linux, macOS, and AIX
  • Familiarity with Git and TeamCity
  • Experience designing and implementing Disaster Recovery Plans
  • Familiarity with off-site and GFS backup strategies using Veeam products such as Backup & Replication and Veeam Agents
  • Familiarity with the technical nuances of software development (from source code to RTM product)
  • Familiarity with hardware capacity planning and procurement processes in large organizations

What You’ll Get

  • Unlimited paid time off, 12 paid holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
  • Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
  • Medical, dental, and vision coverage starting on your first day
  • Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
  • 401(k) retirement plan with company matching contributions
  • Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time
  • AirVet: 24/7 virtual veterinary care at no cost
  • Legal services, identity protection, and supplemental health insurance options
  • Tax-advantaged spending accounts for healthcare, dependent care, and commuting
  • Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning

 

Compensation Transparency

Veeam is committed to pay transparency and equitable compensation. For this role, the compensation range below reflects the expected total target compensation (TTC), inclusive of base pay and a competitive performance-based bonus. For roles with a commission plan, the compensation range represents On Target Earnings (OTE), which includes base salary plus variable commission. When determining compensation, Veeam takes into consideration factors such as experience, education, skills, and geographic zone. Offers are typically made below the midpoint of the range.

In addition to compensation, Veeam provides a comprehensive benefits package, including health coverage, retirement plans, and unlimited time off.

U.S. Geographic Zones & Compensation Ranges (TTC / OTE)
Zone 1: San Francisco Bay Area, New York City Boroughs
$167,400 – $310,900 USD
Zone 2: Washington, California (excluding San Francisco Bay Area)
$153,500 – $285,000 USD
Zone 3: Texas, Illinois, North Carolina, Colorado, Massachusetts, Pennsylvania, Virginia, Oregon, Nevada, Hawaii, New York (excluding NYC boroughs); Sales roles located in Georgia, Ohio, and Arizona
$139,500 – $259,000 USD
Zone 4: All other US locations
$121,400 – $225,300 USD

Veeam Software is an equal opportunity employer and does not tolerate discrimination in any form on the basis of race, color, religion, gender, age, national origin, citizenship, disability, veteran status or any other classification protected by federal, state or local law. All your information will be kept confidential.

Please note that any personal data collected from you during the recruitment process will be processed in accordance with our Recruiting Privacy Notice.  

The Privacy Notice sets out the basis on which the personal data collected from you, or that you provide to us, will be processed by us in connection with our recruitment processes. 

By applying for this position, you consent to the processing of your personal data in accordance with our Recruiting Privacy Notice.

By submitting your application, you acknowledge that the information provided in your job application and any supporting documents is complete and accurate to the best of your knowledge. Any misrepresentation, omission, or falsification of information may result in disqualification from consideration for employment or, if discovered after employment begins, termination of employment.

Veeam Software leads the market in data resilience, offering robust solutions for data backup, recovery, portability, security, and intelligence. Our platform supports a wide range of environments—including cloud, virtual, physical, SaaS, and Kubernetes—empowering organizations to maintain control over their data, ensuring it’s always protected and available. Trusted by over 550,000 customers globally, Veeam is dedicated to helping businesses not only recover from data loss but thrive beyond it.
