Senior Data Engineer
TL;DR
Join the data engineering team that builds the platforms enabling machine learning and analytics at scale, using technologies such as Spark and Redshift.
The Data Engineering team builds and operates the analytical data platform that powers machine learning, data science, analytics, and reporting across Veriff. We are responsible for large-scale data ingestion, platform reliability, and enterprise data governance — ensuring Veriffians have access to accurate and timely data.
In this role, you will own and evolve our data lake and data warehouse infrastructure, driving platform-level data management, governance, and reliability at scale.
You'll help us protect honest people online by:
- Owning and evolving our data lake and data warehouse infrastructure using technologies such as Spark, Apache Iceberg, S3, Trino/Athena, and Redshift.
- Designing and maintaining platform-level data transformation pipelines in Python and SQL — focused on schema evolution, partitioning, compaction, and deduplication.
- Implementing optimized storage formats (Parquet, Avro, ORC), partitioning strategies, and indexing to improve query performance and reduce platform costs.
- Driving data governance initiatives — PII detection and classification, access control policies, data cataloging, lineage tracking, and data quality frameworks.
- Ensuring the availability, reliability, and cost efficiency of the data platform, including observability, monitoring, and alerting for pipeline and query engine health.
- Collaborating with ML, analytics, product, and engineering teams to define data contracts, maintain schema consistency, and provide clean, well-governed datasets.
- Contributing to disaster recovery strategy and multi-region reliability of the data platform.
You are the right future Veriffian for the job if you have:
- Strong experience with Python, SQL, and Apache Spark / PySpark for large-scale data processing.
- Deep knowledge of modern analytics platform architecture — object stores, columnar and row-based data formats (Parquet, Avro, ORC), orchestration tools, analytical query engines, schema registries, and data catalogs.
- Experience with data governance and data management at scale — PII handling, data cataloging, schema management, access control, and data quality frameworks.
- Experience designing and operating data lake and data warehouse infrastructure.
- Solid understanding of storage optimization — partitioning, compaction, and compression trade-offs.
- Experience building observability, monitoring, and alerting for data platforms.
- Strong problem-solving skills and comfort working with ambiguity — defining problems before solving them.
- A collaborative mindset — this role serves ML, analytics, and product teams as internal customers.
You're an especially awesome match if you have:
- Experience with Infrastructure as Code (IaC) and Terraform.
- Familiarity with containerization — Docker and Kubernetes.
- Experience with CI/CD pipelines for data platform deployments.
- Knowledge of data lake table formats beyond Iceberg (Delta Lake, Hudi).
- Familiarity with data catalog and metadata management tools (e.g., DataHub, Amundsen, AWS Glue Catalog).
- Understanding of data privacy regulations (GDPR) in the context of data engineering.
- Experience building streaming data pipelines.
- Experience with the AWS data stack.
Benefits
Equity Compensation
Stock options that ensure your share in our success
Health Insurance
Extensive medical, dental, and vision insurance to ensure you’re feeling great physically and mentally
Learning Budget
Learning and Development & Health and Sports budget that you are free to tailor to your own needs
Relocation support
Comprehensive relocation support to Estonia or Spain
Paid Sabbatical
Four weeks of fully paid sabbatical leave after reaching your 5th work anniversary
Paid Time Off
Extra recharge days on top of your annual vacation
Remote-Friendly
Flexibility to work from home
Veriff is an identity verification platform that empowers innovative organizations to connect with honest individuals by validating over 10,000 government-issued documents from more than 190 countries. With a diverse team and a focus on building trust online, we enable businesses to securely verify identities at scale, ensuring a safer digital environment.