Software Engineer, Data Foundations
TLDR
Join the Data Foundations team to enhance data ingestion and management that powers Glean's enterprise AI solutions, impacting how knowledge is used across billions of documents.
- Build and scale connectors to a wide variety of SaaS and on-prem systems (Google Workspace, Microsoft 365, Slack, Salesforce, Jira, ServiceNow, GitHub, etc.).
- Handle full syncs, low-latency incremental updates via webhooks/APIs, rate-limiting, and complex authentication flows.
- Build advanced datasource capabilities such as actions, live-fetch, and query language support.
- Transform raw, unstructured enterprise content into rich, structured, permission-aware representations optimized for search and LLM reasoning.
- Design document schemas and enrichment pipelines (entity extraction, access-graph propagation, redactions, etc.).
- Expand the capabilities of AI products through deep integrations that allow us to automate tasks, perform complex queries grounded in enterprise data, and enhance our indexed corpus with live data.
- Own end-to-end correctness, freshness, and performance for petabyte-scale data flows.
- Solve hard problems in ordering, idempotency, exactly-once processing, backpressure, and retries across distributed queues, workers, and storage.
- Preserve fine-grained ACLs, deletions, and sensitivity constraints so AI answers are always grounded in what users are actually allowed to see.
- Partner closely with Search Serving, Product, Platforms, and Security teams to define how enterprise context is exposed to LLMs and agents.
- Continuously improve observability, alerting, and automation to onboard larger customers and more data sources with confidence.
- 3+ years building production backend or data infrastructure systems (Java, Go, C++, Python, etc.).
- Hands-on experience with distributed systems, data pipelines, queues, and large-scale storage (SQL/NoSQL).
- You think in SLOs, error budgets, failure modes, and correctness guarantees — not just features.
- Comfortable with strict consistency and permission-modeling challenges.
- Prior work on enterprise connectors, search/indexing, information retrieval, or security-sensitive systems is a strong plus.
- Passionate about making AI trustworthy by building the rock-solid data foundation underneath it.
- Power user of LLMs and AI tools in your own workflow.
- This role is hybrid (4 days a week in one of our SF Bay Area offices).
AI-First Mindset at Glean:
At Glean, AI fluency is core to how we work, and we're committed to ensuring every new hire feels confident integrating AI into their everyday work. As part of the interview process, you'll complete a brief AI-focused exercise or discussion so we can understand how you think about, design, and use AI to drive impact in your role. Feel free to reference any tools, platforms, or workflows you use today; prior Glean experience isn't required.
Benefits
Education Stipend
You will receive annual education and wellness stipends to support your growth and wellbeing
Free Meals & Snacks
We provide healthy lunches daily to keep you fueled and focused
Home Office Stipend
You'll receive a home office improvement stipend
Glean is a Work AI platform designed to help organizations optimize their operations through intelligent search and AI-driven capabilities. By offering a scalable and secure infrastructure, Glean empowers businesses across various industries to harness the full potential of AI while maintaining control and customization.