Fal

Senior/Staff Virtualization Engineer

$180,000 – $250,000 per year

TLDR

Design and implement high-performance custom compute environments, leveraging AI for automation and providing exceptional GPU performance for customer workloads.

You build the custom compute environments we deliver to customers — bare metal or virtual machines with GPU passthrough, dedicated Kubernetes clusters, and the networking that ties them together. You work across the full stack from Linux image building to overlay network design to cluster bootstrapping.

Key responsibilities

Build and deliver custom environments with excellent GPU performance for customer workloads
Use AI extensively to automate provisioning, alerting, and recovery
Provision and configure dedicated Kubernetes clusters tailored to customer requirements
Design and implement overlay networking (VLAN, VXLAN), routing configurations (ECMP, BGP), and tunnels (strongSwan, IPsec) for tenant isolation and performance
Build and maintain Linux images
Set up network monitoring and diagnostics for customer environments
Automate the end-to-end lifecycle of customer compute environments: creation, configuration, validation, and teardown

Requirements

5+ years of experience with Linux virtualization: KVM/QEMU, libvirt, VFIO device passthrough, hugepages, NUMA, CPU pinning
Strong networking fundamentals: VXLAN, VLAN, ECMP, BGP, ARP, and the ability to debug packet-level issues (tcpdump, Wireshark)
Production experience building and operating Kubernetes clusters on bare metal (MetalLB)
Proficiency with Linux image building and OS provisioning (Kickstart, cloud-init, PXE/iPXE)
Proficiency in Python, Bash, Ansible, and Terraform
Deep experience with NVIDIA GPUs: drivers, MIG, container runtimes (nvidia-container-toolkit), InfiniBand, RDMA/RoCEv2, and GPUDirect for high-performance AI networking
Excellent communication and ability to drive technical decisions across teams
Self-starter who executes quickly, takes ownership, and constantly seeks improvement

Nice to have

Experience with SR-IOV, DPDK, or other high-performance networking technologies
Experience with shared network storage (Ceph, Lustre, Weka)
Experience with network automation tools (NetBox, Nautobot, Nornir)

Compensation

$180,000 – $250,000 per year, plus equity and benefits

Location

San Francisco, CA

What we offer at fal

Interesting and challenging work
Many learning and growth opportunities
We are currently hiring in downtown San Francisco.
We offer visa sponsorship and will help you relocate to San Francisco.
Health, dental, and vision insurance (US)
Regular team events and offsites

Fal

Fal builds a generative media platform that empowers developers to create and scale multimodal AI applications effortlessly, providing ready-to-use APIs and intuitive interfaces. Focused on delivering robust infrastructure for the generative AI era, Fal combines expertise in distributed systems with custom compute environments to ensure high performance and reliability.

Founded: 2021
Employees: 1-10
Industry: Internet Software & Services
