We are seeking an experienced Azure Data Engineer to join our enterprise data engineering team. This role is focused on building and maintaining modern, scalable data pipelines across our data ecosystem — including lakehouses, data warehouses, data marts, and operational data stores — while supporting the migration of legacy ETL solutions to Microsoft Fabric and Azure.
Key Responsibilities:
Data Pipeline Development
- Design and build ETL/ELT pipelines using Azure Data Factory, Microsoft Fabric Data Pipelines, Databricks, and Fabric Notebooks
- Implement medallion architecture (Bronze/Silver/Gold) in Fabric Lakehouse environments
- Develop transformation logic using T-SQL, Spark SQL, PySpark, and Dataflows Gen2
- Build and maintain dimensional models (star/snowflake schema) and Data Vault models
- Implement incremental loading patterns using CDC, watermarking, and delta detection
- Create reusable pipeline components, templates, and parameterized frameworks
- Optimize pipeline performance through partitioning, parallelization, and query tuning
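To make the incremental-loading expectation concrete, here is a minimal Python sketch of the watermarking pattern mentioned above: keep the timestamp of the last successful load, extract only rows modified after it, and persist the new high-water mark. The row shape, column names, and in-memory list are hypothetical stand-ins for a real source table; in Fabric this logic would typically live in a PySpark Notebook or a pipeline lookup/copy activity, but the idea is the same.

```python
from datetime import datetime

# Hypothetical source rows; in a real pipeline these come from the source system.
source_rows = [
    {"id": 1, "modified": datetime(2024, 1, 1)},
    {"id": 2, "modified": datetime(2024, 2, 1)},
    {"id": 3, "modified": datetime(2024, 3, 1)},
]

def incremental_extract(rows, watermark):
    """Return rows changed since the last successful load, plus the
    new watermark to persist for the next run."""
    changed = [r for r in rows if r["modified"] > watermark]
    new_watermark = max((r["modified"] for r in changed), default=watermark)
    return changed, new_watermark

# Only rows 2 and 3 were modified after the stored watermark.
changed, new_wm = incremental_extract(source_rows, datetime(2024, 1, 15))
```

The same shape generalizes to CDC feeds or delta detection: the "watermark" simply becomes a log sequence number or change version instead of a timestamp.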
Legacy-to-Fabric Migration
- Convert legacy ETL mappings, workflows, and scheduling logic to Microsoft Fabric/ADF equivalents
- Recreate parameter files, session configurations, and orchestration patterns in Fabric
- Execute unit testing and data reconciliation to validate that migrated pipelines produce results identical to the legacy ETL
- Document conversion patterns, technical decisions, and issue resolutions
- Support parallel runs and cutover validation
Data Quality & Testing
- Build data quality checks and validation frameworks embedded within pipelines
- Develop automated testing strategies (unit, integration, regression) for data pipelines
- Create monitoring dashboards and alerting for pipeline failures and data anomalies
- Perform source-to-target reconciliation for both BAU and migration workloads
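As an illustration of the reconciliation work described above, the sketch below compares a source and target dataset by row count and by row-level fingerprints, so a mismatch can be localized rather than just counted. The column names and sample rows are hypothetical; a production check would run this comparison over staged extracts or query results rather than Python lists.

```python
import hashlib

def row_fingerprint(row):
    """Order-insensitive fingerprint of a row's key/value pairs."""
    canonical = "|".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(canonical.encode()).hexdigest()

def reconcile(source_rows, target_rows):
    """Compare row counts and row-level fingerprints between source and target."""
    src = {row_fingerprint(r) for r in source_rows}
    tgt = {row_fingerprint(r) for r in target_rows}
    return {
        "count_match": len(source_rows) == len(target_rows),
        "missing_in_target": len(src - tgt),
        "unexpected_in_target": len(tgt - src),
    }

source = [{"id": 1, "amt": 10}, {"id": 2, "amt": 20}]
target = [{"id": 1, "amt": 10}, {"id": 2, "amt": 99}]
result = reconcile(source, target)
# Counts match, but the amt mismatch on id 2 appears as one fingerprint
# missing in target and one unexpected fingerprint in target.
```

For migration cutover validation, the same check can be run against the legacy and Fabric outputs of a parallel run to confirm they are identical.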
Platform Operations & Collaboration
- Monitor, troubleshoot, and optimize production pipelines
- Implement logging, error handling, and retry mechanisms
- Support CI/CD pipelines for data solutions using Azure DevOps and Git
- Manage environment promotions (DEV → QA → PROD) and participate in on-call rotation
- Implement security best practices: RBAC, encryption, data masking, workspace security
- Collaborate with Data Architects, Business Analysts, DevOps, and BI teams
- Maintain technical documentation: pipeline specs, data dictionaries, and runbooks
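The logging, error handling, and retry expectations above can be sketched as a small retry wrapper with exponential backoff. The `flaky_copy` activity is a hypothetical stand-in for a real pipeline step (a copy activity, a notebook run); in ADF/Fabric much of this is configured declaratively on the activity, but the same pattern applies to custom code.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def run_with_retries(activity, max_attempts=3, base_delay=1.0):
    """Run a pipeline activity, retrying transient failures with
    exponential backoff; re-raise after the final attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return activity()
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))

# Hypothetical activity that fails twice before succeeding.
calls = {"n": 0}
def flaky_copy():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient connection error")
    return "copied"

result = run_with_retries(flaky_copy, max_attempts=4, base_delay=0)
```

The warnings emitted on each failed attempt are exactly the kind of structured log events the monitoring dashboards and alerting described above would consume.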
Technical Skills:
Microsoft Fabric & Azure
- Microsoft Fabric — Lakehouse, Data Warehouse, Data Pipelines, Dataflows Gen2, Notebooks
- Azure Data Factory v2 — pipelines, linked services, integration runtimes, triggers
- Azure Synapse Analytics — Dedicated SQL Pools, Serverless SQL, Spark Pools
- Azure Data Lake Storage Gen2, OneLake, Shortcuts, and Direct Lake mode
SQL & Programming
- Expert-level T-SQL — stored procedures, complex queries, performance tuning
- Python for data processing and automation
- PySpark for large-scale data transformations
- Familiarity with JSON, XML, and REST APIs
Informatica Platform
- Development experience with Informatica PowerCenter (Designer, Workflow Manager, Workflow Monitor)
Data Platforms & Formats
- Delta Lake format and Delta table operations
- Apache Spark architecture and optimization
- Data partitioning strategies and performance tuning
- Parquet and Avro file formats
- Dimensional modeling and Data Vault concepts
DevOps & Governance
- Git version control and Azure DevOps (Repos, Pipelines)
- CI/CD implementation for data solutions
- Fabric workspace deployment pipelines
- Data lineage, metadata management, and data cataloging
- Security best practices — RBAC, encryption, masking
- Awareness of compliance standards (GDPR, HIPAA, SOC 2)
Education & Experience
- Bachelor's degree in Computer Science, Information Technology, Engineering, or related field
- 6-8 years of hands-on experience in data engineering, ETL development, and data warehousing
- Minimum 6 months of hands-on experience with Microsoft Fabric in a live/production environment (1+ year preferred), including practical delivery experience across Dataflows Gen2, Fabric Data Pipelines (ADF-based), PySpark Notebooks orchestrated by those pipelines, Delta table joins and PySpark aggregations in Notebooks, and Notebook-based data quality checks with logging to the Bronze layer
- 1+ years of experience developing solutions on Microsoft Azure data platform
- 1+ years of hands-on experience with Informatica PowerCenter and/or IICS development
- Experience participating in or leading ETL migration projects
- Strong understanding of data warehouse concepts, dimensional modeling, and data integration patterns
- Microsoft Certified: Azure Data Engineer Associate (DP-203)
- Microsoft Certified: Fabric Analytics Engineer Associate (DP-600)