About the Role

Cyberhaven’s lineage technology tracks billions of data events, and this role focuses on building the intelligent service layer that makes that data actionable at scale. You will develop backend systems that wrap massive lineage datasets in an agentic framework, enabling fleets of AI agents to perform work traditionally handled by security consultants. These agents will autonomously audit customer environments, generate intelligence-driven security policies, and diagnose complex data security incidents by tracing file activity across months of history in seconds. The role sits at the intersection of data engineering, distributed systems, and applied AI, with a strong focus on turning rich product data into scalable, automated security outcomes.

What You’ll Do

Build the service layer that lets AI agents carry out complex, multi-step security workflows in software
Develop backend services that allow agents to reason over lineage data, security logs, and customer context
Implement AI agents that leverage product data to generate audits, recommendations, and diagnostics
Write high-performance SQL and data extraction logic in BigQuery to surface structured, agent-ready context from billions of lineage events
Design and consume API-first integrations connecting Cyberhaven’s platform to LLM providers and downstream customer workflows
Build automated diagnostic flows where agents iteratively query data, interpret results, and produce human-readable explanations of security incidents
Partner with internal teams to ensure agent-driven outputs are reliable, scalable, and aligned with real customer use cases

Who You Are

Strong experience building backend services using Python and/or Go
Familiarity with designing and implementing agentic workflows that manage state, tool usage, and multi-step reasoning
Comfortable writing and optimizing complex SQL over large, high-cardinality datasets in BigQuery
Experience modeling data for LLM consumption, including structured context windows, embeddings, or retrieval-based approaches
Hands-on experience designing and consuming RESTful APIs as part of distributed systems
Strong understanding of integrating external systems and services via APIs
Practical experience working with LLMs via APIs, including prompt design, structured outputs, and performance trade-offs
Experience building guardrails, validation layers, or verification mechanisms to ensure AI outputs are accurate and trustworthy

Joining Cyberhaven is a chance to revolutionize data security. Traditional tools fall short, but we’ve reimagined protection with AI-enabled data lineage that analyzes billions of workflows to understand data, detect risk, and stop threats. Backed by $250M from leading investors like Khosla and Redpoint, our team includes leaders who built industry-defining technologies at CrowdStrike, Palo Alto Networks, Meta, Google, and more. This role lets you shape the future of data security, alongside experts driven to help customers protect their most valuable information.

Cyberhaven is committed to creating a diverse environment and is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.

AI Automation & Tooling Engineer
