fal is building the fastest and most scalable infrastructure for AI inference. fal Serverless powers 1,300+ endpoints on the fal Marketplace and handles tens of millions of requests per day across production workloads.
Enterprises use fal Serverless to deploy, operate, and scale custom AI models without managing infrastructure themselves. Autoscaling, observability, and operational complexity are handled end-to-end by fal’s platform and UI.
Serverless began as internal infrastructure built to support fal’s own scale and was released publicly to enterprise customers in early 2025. It is now a core, revenue-driving product with rapidly growing adoption.
fal is one of the fastest-growing AI startups, reaching Series D at a $4.5B valuation with a lean team of ~70 employees. You’ll be joining early, with meaningful ownership and direct impact on a foundational product.
As a Full Stack Engineer on Serverless, you will build the core product, both frontend and backend, that powers the fal Serverless platform. This is a deeply product-focused role. You will work side-by-side with Product and Infrastructure to design and ship reusable, scalable systems that enterprise customers rely on in production every day.
You will be a foundational technical owner of fal Serverless as it scales to thousands of enterprise customers, with real responsibility, autonomy, and impact. This is a chance to help build a new product vertical from the ground up inside a company that is already scaling at rocket-ship speed.
$150,000 - $230,000 + equity + comprehensive benefits package
We are currently hiring in downtown San Francisco.
In the modern era, content is shifting from being human-made and algorithm-distributed to being generated on demand - personalized in real time for every audience, context, and moment. We’re fal, and we’re building the infrastructure powering this transformation. Our platform is the first of its kind: a generative media stack for developers that enables real-time, AI-generated content across image, video, and audio. At the core is our serverless Python runtime, purpose-built to run massive ML models across thousands of GPUs with unmatched speed and efficiency. Applications built on fal already serve millions of users - and we’re just getting started. Founded in 2021, we’re scaling fast and backed by top investors including a16z, Bessemer, and Kindred. If you’re an ambitious builder who wants to define the future of AI and media, we’d love to meet you.