Senior Data Engineer
TL;DR
Own data infrastructure and drive fraud detection initiatives while collaborating cross-functionally to enhance real estate transaction security.
Data platform and pipeline engineering
Design, build, and operate the core data infrastructure: data lake, warehouse, orchestration, observability, and governance, using declarative configuration and infrastructure as code (Terraform or equivalent) so the platform is reproducible and auditable
Partner with platform and domain teams to design ingestion pipelines and implement declarative configuration for data sources across the stack
Architect the transformation layer: dimensional models, aggregation strategies, and incremental materialization patterns that balance query performance against pipeline cost at scale
Own streaming and near-real-time data flows for fraud signal propagation, transaction status events, and verification webhooks, with the reliability expectations those require
Build for scale: partition strategies, clustering, late-arriving data handling, and backfill patterns that hold up when data volume doubles
Business outcome ownership
Own the source-of-truth models for the metrics the business runs on: ARR, NRR, churn, transaction volume, fraud detection rates, customer health scores, and operational throughput
Make the numbers defensible: when a business leader challenges a metric, you can walk them through exactly how it is calculated, what is excluded, and why
Partner with Product, Finance, CS, and GTM to translate business questions into data models and help teams measure what actually matters
Engineering craft and standards
Write production-grade Python and SQL: modular, tested, version-controlled, and reviewable by someone who was not in the room when you wrote it
Implement CI/CD pipelines for data systems: automated testing, schema change detection, data contract validation, and deployment gates. Treat cost optimization and performance tuning as ongoing practice, not one-time projects
Experience
6+ years in data engineering with primary, end-to-end ownership of a production data platform, not a supporting role on a large team
Direct experience designing and operating streaming or near-real-time pipelines (Kafka, Kinesis, Pub/Sub, Flink, or equivalent) at production scale, including debugging failures under load
Hands-on production experience with cloud-based data platforms (Snowflake, BigQuery, Redshift, Databricks, or equivalent) and a production-grade orchestrator (Airflow, Dagster, Prefect, or equivalent)
Technical depth
Expert SQL and distributed systems: window functions, recursive CTEs, query plan analysis, query concurrency management, and optimization strategies that go beyond adding an index
Strong Python for data engineering: production-quality pipeline code with error handling, idempotency, retry logic, and test coverage; Go is a meaningful plus
Dimensional modeling mastery: you understand the tradeoffs between normalized and denormalized designs, when SCDs are the right tool, and how incremental strategies affect downstream query semantics
Event-driven architecture fundamentals: exactly-once semantics, consumer group management, backpressure handling, offset management, and the operational realities of keeping a streaming pipeline healthy
Warehouse internals: clustering keys, materialized views, partition pruning, and cost optimization strategies that keep query costs from compounding as data volume grows
You instrument, measure, and verify that your work produced the outcome it was supposed to
You make architectural decisions independently, communicate them openly, and document the reasoning so the decision survives you
You have joined teams where the data was a mess, and you shipped before the situation was fully resolved, because waiting for perfection was not an option
Benefits
Flexible Work Hours: flexible vacation
Health Insurance: health, dental, and vision insurance (including a $0 option)
Award-winning culture
Paid Time Off: 10 paid sick days
CertifID builds a digital identity verification solution that protects against wire fraud by validating credentials and securely sharing bank details. It caters to businesses that require secure financial transactions, providing peace of mind with insurance coverage on protected wires. What sets us apart is our commitment to enhancing security in financial operations through reliable, verifiable methods.
- Founded: 2017
- Employees: 51-200
- Industry: Diversified Financial Services
- Total raised: $36M