We are looking for a certified Data Engineer who will turn data into information, information into insight, and insight into business decisions. This is a unique opportunity to be one of the key drivers of change in our expanding company.
Work at Exadel - Who We Are
Since 1998, Exadel has been engineering its own products and custom software for clients of all sizes. Headquartered in Walnut Creek, California, Exadel has 2,000+ employees in development centers across America, Europe, and Asia. Our people drive Exadel’s success and are at the core of our values.
About the Customer
The customer is a leading provider of software solutions and healthcare services. They serve hospitals and health systems, helping them generate better data and insights that enable better healthcare.
The customer offers a robust portfolio of solutions that can be fully integrated and custom-configured to help healthcare organizations efficiently capture information for assessing and improving the quality and safety of patient care.
Requirements
- Background in Data Engineering with hands-on work using Databricks
- Strong expertise in ETL processes and data pipeline development
- Proficiency with Spark and Python for large-scale data processing in Databricks (see the illustrative sketch after this list)
- Experience with data extraction from web-based sources (APIs, web scraping, or similar approaches)
- Familiarity with handling structured and unstructured data and ensuring efficient data storage and access
- Competency in Azure or other cloud platforms
- Knowledge of SQL and database management for data validation and storage
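To give candidates a concrete sense of the day-to-day work, here is a minimal PySpark ETL sketch of the kind this role involves. It is illustrative only: the storage path, table, and column names are hypothetical, not the customer's actual schema.

```python
# Illustrative PySpark ETL sketch: the storage path, table, and column
# names below are hypothetical placeholders, not the customer's schema.
from pyspark.sql import SparkSession, functions as F

# In a Databricks notebook, `spark` is provided; building it here keeps
# the sketch self-contained.
spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Extract: read raw JSON landed in cloud storage (e.g., an ADLS container).
raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/records/")

# Transform: deduplicate, drop incomplete rows, and stamp the load time.
clean = (
    raw.dropDuplicates(["record_id"])
       .filter(F.col("record_id").isNotNull())
       .withColumn("ingested_at", F.current_timestamp())
)

# Load: write a Delta table for downstream analysis and SQL validation.
clean.write.format("delta").mode("overwrite").saveAsTable("curated.records")
```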
English level
Intermediate+
Responsibilities
- Design, develop, and implement data ingestion and transformation pipelines using Azure Databricks
- Manage and orchestrate data extraction from a public information website, handling bulk data downloads efficiently (see the sketch after this list)
- Develop solutions for periodic data updates (monthly), optimizing data refresh cycles to ensure accuracy and efficiency
- Clean, aggregate, and transform the data for analysis, ensuring the quality and completeness of data
- Collaborate with stakeholders to understand data requirements and propose solutions for efficient data management
- Implement best practices for ETL processes in a Databricks environment, ensuring scalability and performance
- Monitor and troubleshoot data pipelines, ensuring smooth operation and timely updates
- Document the data engineering processes, workflows, and architecture
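As an illustration of the extraction and monthly-refresh responsibilities above, here is a minimal Python sketch of a bulk-download step that lands a public file for Databricks to ingest. The source URL, filename pattern, and DBFS mount path are assumptions made for the example, not details of the customer's actual site.

```python
# Hypothetical monthly bulk-download step: the URL, filename pattern, and
# DBFS mount path are assumptions for illustration, not the real source.
import datetime
import pathlib

import requests

SOURCE_URL = "https://example.org/data/bulk-extract.zip"     # placeholder URL
LANDING_DIR = pathlib.Path("/dbfs/mnt/landing/public-site")  # assumed DBFS mount

def download_monthly_extract() -> pathlib.Path:
    """Stream the current month's bulk file into the landing zone."""
    stamp = datetime.date.today().strftime("%Y-%m")
    target = LANDING_DIR / f"extract-{stamp}.zip"
    target.parent.mkdir(parents=True, exist_ok=True)
    with requests.get(SOURCE_URL, stream=True, timeout=300) as resp:
        resp.raise_for_status()  # fail fast on HTTP errors
        with open(target, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=1 << 20):  # 1 MiB chunks
                fh.write(chunk)
    return target

if __name__ == "__main__":
    print(f"Landed {download_monthly_extract()}")
```

Streaming the download in chunks keeps memory use flat for large files, and a date-stamped filename makes each monthly refresh cycle easy to audit and re-run.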