Bengaluru, India

Full-Time

As a member of the Infrastructure Reliability Engineering team, you are responsible for leading a team of system engineers focused on Kubernetes, microservices architecture, and cloud technologies. As part of this self-driven team, you will support critical container infrastructure and ensure the stability of services by performing dedicated maintenance activities. You will engage in automation activities, perform root cause analysis (RCA), and carry out remediation. Knowledge of production support processes, including incident/change/problem management, call triaging, and critical issue resolution procedures, is expected.

Essential Functions

Infrastructure life cycle management and Production Support of container, cloud technologies and orchestration platforms
Knowledge of alerting and monitoring tools, system management tools, configuration management tools, and cloud orchestration tools.
Hardening and securing Kubernetes clusters, with monitoring and auditing dashboards.
Strong technical, analytical, and troubleshooting skills, and an ability to explain technical concepts and provide guidance to staff.
Proficient-to-expert scripting and automation skills for converting manual and maintenance functions into fully orchestrated automation.
Strong knowledge of and experience with system monitoring techniques and tools supporting unattended operations.
Ability to operate in complex, highly secure, and highly available operations environments and interact with the technology domain experts required to maintain those environments.
Excellent communication and interpersonal skills. Coaching other members of the support team, sharing technical and customer knowledge in a helpful and timely fashion.
Responsible for partnering with the Platform, Engineering and Delivery Teams to deliver seamless infrastructure support for all Visa business lines.
Work closely with geographically distributed teams on technical challenges and process improvements.
Security Remediation process (vulnerability assessment and patch management)
Contribute to standardizing and documenting operational procedures.
Responsible for adherence to established ITIL practices such as Incident, Change, Problem, and Release Management.
Be scheduled on call to support the infrastructure and our systems.
Work shift days (Sunday to Thursday or Tuesday to Saturday, from 9:00 AM to 6:00 PM).
Provide strong leadership with a focus on attracting, motivating, and developing best-in-class talent. Mentor and coach teams to develop future leaders in alignment with company objectives.
Balance both leading a team and engaging directly with the work needed to accomplish objectives. Assist direct reports with ongoing prioritization and resource allocation to ensure that the crucial business initiatives are delivered.
Utilize leadership skills, problem solving and decision-making skills to facilitate and encourage participation of team members to meet objectives in congruence with approved standards and guidelines.

This is a hybrid position. Hybrid employees can alternate time between both remote and office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Basic Qualifications
• 8+ years of relevant work experience with a Bachelor’s Degree or at least 5 years of experience with an Advanced Degree (e.g. Masters, MBA, JD, MD) or 2 years of work experience with a PhD, OR 11+ years of relevant work experience.

Preferred Qualifications
• 9 or more years of relevant work experience with a Bachelor’s Degree or 7 or more years of relevant experience with an Advanced Degree (e.g. Masters, MBA, JD, MD) or 3 or more years of experience with a PhD

• At least 5 years in container and cloud technologies (AWS & GCP) with a focus on DevOps and service-based systems engineering.
• Minimum 4 years of hands-on experience with Kubernetes (including the internal architecture of K8s).
• Minimum 2 years of experience with AWS/GCP
• Minimum 3 years of scripting experience (Shell, Python, Ansible, Terraform, and YAML packages)
• Minimum 2 years of experience with traffic routing for microservices-based applications (e.g., Istio service mesh)
• Experience with configuration management tools (Chef, Ansible, Terraform, etc.) is a must.
• Working with tools surrounding the Kubernetes ecosystem, such as Helm, kubeadm, CSI, CNI, etc., is a must.
• Experience with CI/CD or GitOps pipeline architecture (e.g., ArgoCD, Codefresh, Jenkins) is a must.
• Working knowledge of monitoring and logging tools: Prometheus, Grafana, Fluent Bit, Netcool, Humio
• Deep understanding of networking concepts

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.

Visa is hiring a

Senior Manager - Infrastructure Reliability Engineering