Position Overview

We seek a results-oriented Data Engineer with at least 2 years of experience developing data pipelines in cloud environments. The successful candidate will design, build, and optimize Azure-based data ingestion and transformation pipelines using PySpark and Spark SQL. This role requires collaboration with cross-functional teams to deliver high-quality, reliable, and scalable data solutions.

Duties and Responsibilities

Design, develop, and maintain high-performance ETL/ELT pipelines using PySpark and Spark SQL.
Build and orchestrate data workflows in Azure.
Implement hybrid data integration between on-premises databases and Azure Databricks using tools such as ADF, HVR/Fivetran, and secure network configurations.
Optimize Spark jobs for performance, scalability, and cost efficiency.
Implement and enforce best practices for data quality, governance, and documentation.
Collaborate with data analysts, data scientists, and business users to define and refine data requirements.
Support CI/CD processes, automation tools, and version control systems such as Git.
Perform root cause analysis, troubleshoot issues, and ensure the reliability of data pipelines.

Required Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field.
2+ years of hands-on experience in data engineering.
Proficiency in PySpark, Spark SQL, and distributed processing.
Strong knowledge of Azure cloud services including ADF, Databricks, and ADLS.
Experience with SQL, data modeling, and performance tuning.
Familiarity with Git, CI/CD pipelines, and agile practices.

Preferred Qualifications

Experience with orchestration tools such as Airflow or ADF pipelines.
Knowledge of real-time streaming tools (Kafka, Azure Event Hubs, HVR).
Exposure to APIs, data integrations, and cloud-native architectures.
Familiarity with enterprise data ecosystems.

Data Engineer for Azure Cloud Platform
