Project – the aim you’ll have
Our customer provides innovative solutions and insights that enable their clients to manage risk and hire the best talent. Their advanced global technology platform supports fully scalable, configurable screening programs that meet the unique needs of over 33,000 clients worldwide. Headquartered in Atlanta, GA, they have an internationally distributed workforce of about 5,500 employees spanning 19 countries. Our partner performs over 93 million screens annually in over 200 countries and territories.
We are seeking a Senior Data Engineer with solid Python/PySpark programming skills to join the Data Engineering Team and help us build the Data Analytics Platform in the Azure cloud.
Position – how you’ll contribute
- Develop reusable, metadata-driven data pipelines
- Automate and optimize data platform-related processes
- Build integrations with data sources and data consumers
- Add data transformation methods to shared ETL libraries
- Write unit tests
- Develop solutions for monitoring the Databricks data platform
- Proactively resolve any performance or quality issues in ETL processes
- Cooperate with the infrastructure engineering team to set up cloud resources
- Contribute to data platform wiki / documentation
- Perform code reviews and ensure code quality
- Initiate and implement improvements to the data platform architecture
Expectations – the experience you need
- Programming: Python/PySpark, SQL
- Proficient in building robust data pipelines using Databricks Spark
- Experienced in dealing with large and complex datasets
- Knowledgeable about building data transformation modules organized as libraries (Python packages)
- Familiar with Databricks Delta optimization techniques (partitioning, z-ordering, compaction, etc.)
- Experienced in developing CI/CD pipelines
- Experienced in leveraging event brokers (Kafka / Event Hubs / Kinesis) to integrate with data sources and data consumers
- Understanding of basic networking concepts
- Familiar with Agile Software Development methodologies (Scrum)
Additional skills – the edge you have
- Understanding of stream processing challenges and familiarity with Spark Structured Streaming
- Experience with IaC (Terraform, Bicep or other)
- Experience running containerized applications (Azure Container Apps, Kubernetes)
- Experience building event sourcing solutions
- Familiarity with platforms for change data capture (e.g. Debezium)
- Knowledge of Azure cloud native solutions (e.g. Azure Data Factory, Azure Function App, Azure Container Instances)
Our offer – professional development, personal growth
- Flexible employment and remote work
- International projects with leading global clients
- International business trips
- Non-corporate atmosphere
- Language classes
- Internal & external training
- Private healthcare and insurance
- Multisport card
- Well-being initiatives
Position at: Software Mind Poland
This role requires candidates to be based in Poland.