Western Digital is hiring a

Site Reliability Engineer - Infra and DevOps

San Jose, United States
Full-Time

Western Digital is a leading provider of storage solutions, and its reliance on software and software development workflows is growing by leaps and bounds. As the Secure Development Factory (SDF) Site Reliability Engineer - DevOps, you will be at the heart of Western Digital’s engineering process, delivering the software development tools and infrastructure that empower engineering teams to develop and deliver high-quality products quickly. You will play a pivotal role in ensuring the reliability, scalability, and performance of our IT infrastructure and DevOps tools. You will lead by example and collaborate closely with Engineering teams to align our efforts with customer requirements. Your technical expertise, adaptability, and commitment to excellence will drive our success and empower our stakeholders to develop and deliver high-quality products faster, reducing time to market without sacrificing security, development velocity, stability, code quality, or code health.

The ideal candidate will have a passion for technology, a relentless focus on the customer experience, and an ability to multitask, assimilate data, make decisions, and prioritize complex work while paying attention to the details. Clear and professional communication with internal customers, vendors, and co-workers is an absolute must. This position is open to candidates located in the Pacific (PST) time zone.

Key Responsibilities

  • Observability and Monitoring: Design, implement, and continuously improve monitoring and observability solutions to ensure effective and real-time visibility into system performance.
  • Best Practices: Advocate for and implement best practices in SRE, DevOps, and automation, with a focus on enhancing platform stability and performance.
  • Automation: Lead automation efforts to streamline processes, reduce manual tasks, and improve operational efficiency.
  • Architecting and Designing: Contribute to the architecture and design of systems and applications, aligning them with reliability and scalability goals.
  • Technical Accountability: Provide technical ownership in the SRE team, fostering a collaborative and growth-oriented environment.
  • Ownership: Take ownership of system reliability, meet Service Level Objectives (SLOs), and ensure customer satisfaction.
  • Collaboration: Work closely with Engineering teams to understand customer requirements and collaborate on solutions.
  • Adaptability: Stay updated with emerging technologies and adapt quickly to evolving requirements and challenges.
  • Upskilling: Continuously upskill in newer technologies and share knowledge within the team.
  • Team Player: Collaborate effectively with team members and contribute to a positive team culture.
  • Professional Behaviour: Demonstrate professionalism, integrity, and a commitment to the highest ethical standards.
  • Documentation: Maintain thorough and well-organized documentation of systems and processes.

Required Skills and Qualifications

  • Candidates MUST POSSESS a B.S. in C.S., I.T., E.E., or M.E., plus 6 to 10 years of hands-on experience in DevOps tools and SRE practices.
  • MUST POSSESS administration experience with DevOps tools such as Artifactory, Jenkins, Git, Black Duck, SAST/DAST tools, etc.
  • MUST POSSESS a very good understanding of infrastructure at the server, VMware, storage, and networking levels.
  • Exceptional analytical, problem solving, and troubleshooting skills to manage complex process and technology issues.
  • Extensive experience in Ansible automation (researching, writing, maintaining, and optimizing roles, playbooks, and modules).
  • Expertise in shell scripting and Python, and in infrastructure-as-code and configuration management tools such as Terraform.
  • Development and customisation of CI/CD pipelines, and onboarding of applications with varying requirements.
  • Experience in monitoring enhancements and metrics dashboarding using tools such as Icinga, Splunk, Prometheus & Grafana
  • Nice to have: experience with containerization technologies such as Docker and Kubernetes.
  • Automation First mindset.
  • Focus on building a strong security posture into systems.
  • Working experience with HAProxy, load balancers, LDAP/SSO integration, and security endpoint configurations.
  • Knowledge of cloud computing platforms (e.g., AWS, Azure, GCP) is a plus
  • Excellent communication and collaboration skills.

Western Digital thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.

Western Digital is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at [email protected] to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.

Compensation & Benefits Details

  • An employee’s pay position within the salary range may be based on several factors including but not limited to (1) relevant education; qualifications; certifications; and experience; (2) skills, ability, knowledge of the job; (3) performance, contribution and results; (4) geographic location; (5) shift; (6) internal and external equity; and (7) business and organizational needs.
  • The salary range is what we believe to be the range of possible compensation for this role at the time of this posting.  We may ultimately pay more or less than the posted range and this range is only applicable for jobs to be performed in California, Colorado, New York or remote jobs that can be performed in California, Colorado and New York.  This range may be modified in the future.
  • You will be eligible to participate in Western Digital’s Short-Term Incentive (STI) Plan, which provides incentive awards based on Company and individual performance.  Depending on your role and your performance, you may be eligible to participate in our annual Long-Term Incentive (LTI) program, which consists of restricted stock units (RSUs) or cash equivalents, pursuant to the terms of the LTI plan. Please note that not all roles are eligible to participate in the LTI program, and not all roles are eligible for equity under the LTI plan. RSU awards are also available to eligible new hires, subject to Western Digital’s Standard Terms and Conditions for Restricted Stock Unit Awards.
  • We offer a comprehensive package of benefits including paid vacation time; paid sick leave; medical/dental/vision insurance; life, accident and disability insurance; tax-advantaged flexible spending and health savings accounts; employee assistance program; other voluntary benefit programs such as supplemental life and AD&D, legal plan, pet insurance, critical illness, accident and hospital indemnity; tuition reimbursement; transit; the Applause Program, employee stock purchase plan, and the Western Digital Savings 401(k) Plan.
  • Note: No amount of pay is considered to be wages or compensation until such amount is earned, vested, and determinable. The amount and availability of any bonus, commission, benefits, or any other form of compensation and benefits that are allocable to a particular employee remains in the Company's sole discretion unless and until paid and may be modified at the Company’s sole discretion, consistent with the law.