Cato Networks

AI Security - AI Platform Engineer

TLDR

This versatile engineering role focuses on building a real-time AI runtime platform that makes high-throughput, low-latency AI security decisions in production, combining GPU-based models with CPU-based security logic.

Welcome to the future of cloud networking and security!  

Cato Networks is the first company to converge enterprise networking and security into one centralized, global service delivered from the cloud. It is led by Shlomo Kramer, a networking and security pioneer (Check Point, Imperva) and early investor (Palo Alto Networks, Exabeam, Trusteer and more). Cato’s unique technology inspired a brand-new product category, later named “SASE” by Gartner, in a market expected to reach $28.5 billion by 2028.

This is your opportunity to get on the rocket ship and join a company that is building a cutting-edge enterprise network and secure cloud platform, and is on a fast track to becoming the worldwide market leader – don’t miss it!

 

Cato is building a real-time AI runtime platform for security algorithms running inline across our global cloud and physical PoPs.
We are looking for an AI Platform Engineer to help build the infrastructure that powers high-throughput, low-latency AI security decisions in production.
You will work on a runtime engine that combines GPU-based models, from MMBERT-style models to LLMs, with CPU-based heuristics and security logic, optimized for scale, performance, reliability, and real-time execution. This is a versatile engineering role that spans AI runtime infrastructure, high-performance backend development, GPU inference, model lifecycle, and close collaboration with research teams to bring AI security algorithms into production.


Responsibilities
  • Build Cato’s AI security runtime platform for high-throughput, low-latency production serving.
  • Develop infrastructure for model serving, multi-model orchestration, and inline decision flows.
  • Optimize inference performance: batching, caching, streaming, GPU utilization, memory usage, and runtime acceleration.
  • Build backend orchestration and performance-critical services in Go.
  • Support the model lifecycle: registry integration, packaging, versioning, deployment, monitoring, and operational health.
  • Work closely with research and algorithm teams to productionize AI security models and algorithms at scale.


Requirements
  • 3+ years of hands-on experience in AI inference, production ML infrastructure, model serving, or MLOps.
  • Experience with production inference technologies such as Triton, vLLM, CUDA, Kubernetes, Docker, PyTorch, ONNX, TensorRT, or similar.
  • Strong understanding of low-latency, high-throughput production systems.
  • Experience with model lifecycle concepts: model registry, versioning, deployment, rollout, rollback, monitoring, and observability.
  • 3+ years of experience with Go, or strong experience with a similar high-performance backend language such as C++, Rust, or Java.

Cato Networks converges enterprise networking and security into a unified, cloud-based service, creating a new product category known as SASE. This innovative approach caters to businesses seeking a centralized solution for their networking and security needs.

Founded: 2015
Employees: 201-500
Industry: Diversified Telecommunication Services
Total raised: $770M