About HighLevel:
HighLevel is an AI-powered, all-in-one white-label sales & marketing platform that empowers agencies, entrepreneurs, and businesses to elevate their digital presence and drive growth. We are proud to support a global and growing community of over 2 million businesses, comprising agencies, consultants, and businesses of all sizes and industries. HighLevel gives users all the tools needed to capture and nurture new leads and convert them into repeat customers. As of mid-2025, HighLevel processes over 4 billion API hits and handles more than 2.5 billion message events every day. Our platform manages over 470 terabytes of data distributed across five databases, operates a network of over 250 microservices, and supports over 1 million hostnames.
Our People:
With over 1,500 team members across 15+ countries, we operate in a global, remote-first environment. We are building more than software; we are building a global community rooted in creativity, collaboration, and impact. We take pride in cultivating a culture where innovation thrives, ideas are celebrated, and people come first, no matter where they call home.
Our Impact:
As of mid-2025, our platform powers over 1.5 billion messages, helps generate over 200 million leads, and facilitates over 20 million conversations each month for the more than 2 million businesses we serve. Behind those numbers are real people growing their companies, connecting with customers, and making their mark, and we get to help make that happen.
About the Role:
We’re looking for a Lead Engineer to design and build a new Audit Logging System that captures, stores, and exposes platform-wide events for compliance, security, and transparency.
This system will form the foundation for SOC2 and GDPR compliance and for internal observability, serving millions of users across thousands of agencies. You’ll own the end-to-end architecture, from event ingestion to long-term storage and query APIs, while collaborating with Security, Platform, and Product teams to define what “auditability” means across HighLevel.
This is a backend-heavy role that blends systems design, event-driven architecture, and cloud-scale infrastructure engineering.
Responsibilities:
Architect and implement a high-throughput event ingestion system using Kafka, Pub/Sub, or similar queues
Design a versioned, queryable event schema for all audit and user actions
Build resilient APIs and SDKs for other microservices to publish and consume audit events
Ensure data immutability, ordering guarantees, and traceability across distributed systems
Design a storage architecture optimized for query performance, retention, and cost-efficiency
Work with analytical and time-series stores like ClickHouse, Elasticsearch, or BigQuery
Define data retention, partitioning, and archival strategies to support multi-year compliance
Enable real-time and historical querying with proper access control and indexing
Partner with the Security team to define audit categories, critical events, and data governance
Implement access policies for sensitive logs using RBAC and scoped permissions
Ensure the system meets SOC2, GDPR, and data retention compliance requirements
Build mechanisms for tamper detection and data verification
Design the system to scale to billions of events/day while maintaining low-latency queries
Build observability and alerting pipelines for ingestion failures and event delays
Implement idempotent, retry-safe event processing to ensure durability and reliability
Partner cross-functionally with Users, IAM, Platform, and Infra teams to drive adoption
Define and evangelize audit logging standards and SDKs for consistent event capture
Lead design reviews, share technical context, and mentor engineers contributing to the platform
Work closely with DevOps to optimize deployment, monitoring, and cost efficiency
Required Candidate Profile:
5+ years in backend or distributed systems engineering
Proven track record building data pipelines, logging, or observability systems
Experience designing multi-tenant, event-driven architectures in production-scale SaaS
Strong understanding of system reliability, fault tolerance, and compliance constraints
Required Technical Skills:
Languages: TypeScript, Go, Java, or Python
Frameworks: NestJS, Express.js, or other microservice frameworks
Messaging: Kafka, Google Pub/Sub, or Kinesis
Databases: ClickHouse, Elasticsearch, BigQuery, or similar
Cloud: GCP or AWS, Kubernetes, Terraform
Observability: Prometheus, Grafana, OpenTelemetry
Security: OAuth, JWT, encryption, RBAC, audit compliance
EEO Statement:
The company is an Equal Opportunity Employer. As an employer subject to affirmative action regulations, we invite you to voluntarily provide the following demographic information. This information is used solely for compliance with government recordkeeping, reporting, and other legal requirements. Providing this information is voluntary and refusal to do so will not affect your application status. This data will be kept separate from your application and will not be used in the hiring decision.