Job Summary:
We are hiring a Lead Infrastructure & Cloud Engineer with a strong Wintel infrastructure foundation and current, hands-on capability in modern cloud infrastructure across Azure (primary) and AWS. This role exists to close a capability gap: we have deep on-prem expertise, and we need a leader who can define and drive modern cloud standards, guide technical direction, and uplift the team.
You’ll operate as a technical lead with an architecture mindset: creating reference designs, setting guardrails, making pragmatic trade-offs (security, resilience, cost), and leading delivery across infrastructure and hybrid cloud. This is not a DevOps role: you will collaborate with DevOps and engineers, but your focus is infrastructure/platform, governance, reliability, and technical leadership.
Job Responsibilities:
Cloud & Hybrid Architecture (Azure & AWS)
- Own the target-state hybrid cloud architecture and roadmap (12–24 months), aligning security, resilience, and cost requirements.
- Define reference architectures and standards: landing zones, network patterns, identity patterns, logging/monitoring, backup/DR, and environment separation.
- Lead design and implementation of secure cloud networking: VNets/VPCs, routing, VPN, ExpressRoute/Direct Connect, Private Link/Endpoints, load balancers and, where needed, WAF.
- Own cloud governance foundations: subscriptions/accounts, management groups, RBAC, naming/tagging, logging, budgets and policy guardrails.
Modern Cloud Operations (Hands-on Leadership)
- Ensure cloud platforms, services, and workloads remain on supported, secure versions; implement drift detection and lifecycle management.
- Establish platform observability: Azure Monitor/Log Analytics/App Insights, CloudWatch, OpenTelemetry where used; improve alert quality and operational readiness.
- Build and maintain backup/DR posture with tested RTO/RPO, runbooks, and regular restore/DR exercises.
- Drive FinOps discipline: cost allocation, tagging compliance, rightsizing, reservations/savings plans, and cost anomaly detection.
Security, Governance & Incident Readiness
- Ensure security controls are in place and effective (least privilege, secure baselines, encryption, key management, vulnerability/patch posture).
- Own the onboarding of log and telemetry sources and their integration with the SIEM (e.g., Microsoft Sentinel/Splunk), in partnership with Security.
- Lead incident response for infrastructure/cloud events: triage, investigation, reporting, RCA, and implementation of preventative controls and guardrails.
- Manage, document, and audit configuration changes; champion “repeatable by design” changes and reduce configuration drift.
Wintel & Core Infrastructure Leadership
- Provide technical leadership across core infrastructure services: Windows Server, AD DS, DNS/DHCP, certificates/PKI, and integration with Entra ID.
- Guide virtualisation/storage teams (VMware/Hyper-V, SAN/storage) towards cloud-aligned standards for resilience, security, and lifecycle.
Leadership and Uplift
- Act as the technical authority for infrastructure and hybrid cloud: lead technical decisions and drive outcomes.
- Mentor and upskill engineers on modern cloud infrastructure practices; run knowledge sessions and codify standards into reusable patterns.
- Provide input during design and architectural discussions with DevOps and software teams; unblock delivery with clear, pragmatic guidance.
Requirements:
Must-Have Skills & Experience
- Strong enterprise infrastructure background with a Wintel core (Windows Server, AD, DNS/DHCP, certificates) and operational discipline.
- Demonstrable, hands-on Azure production experience, including:
  - Identity/RBAC/Entra integration
  - VNets, VPN/ExpressRoute, Private Link/Endpoints
  - Azure Monitor/Log Analytics, backup/DR patterns, policy/guardrails
- Working knowledge of AWS production environments (accounts/VPC, security groups, IAM basics, CloudWatch).
- Strong troubleshooting and incident leadership across OS/network layers; confident with vendors/escalations.
- Scripting/automation mindset (strong PowerShell; Bash/Python beneficial).
- Ability to create architecture artefacts: reference designs, diagrams, standards, and decision records (ADRs).
Preferred Certifications
AZ-104, AZ-305 or AZ-500
Desirable
- Conditional Access and privileged access controls (PIM), break-glass patterns, Zero Trust principles.
- Azure Policy/AWS Config, Defender for Cloud/Security Hub, GuardDuty; landing zone governance tooling.
- AWS Control Tower, IAM Identity Center, CloudFormation (read/maintain).
- Familiarity with Infrastructure-as-Code (Terraform/Bicep).
- Exposure to containers/AKS/EKS and CI/CD concepts (as an enabling partner).
- Experience supporting web hosting environments (CDN/WAF, TLS/PKI, caching/performance).