Architect enterprise-scale data solutions using Azure and Databricks for clinical trial data, enhancing analytics and compliance through advanced data governance and ML workflows.
We are seeking a Data Solutions Architect with 15+ years of experience in designing enterprise-scale data platforms. This role focuses on building Azure + Databricks Lakehouse solutions for clinical trial and life sciences data, enabling advanced analytics, machine learning workflows, and regulatory compliance.
Responsibilities
Architect Lakehouse solutions leveraging Azure Data Lake (ADLS Gen2), Databricks, and Delta Lake.
Design data models (star, snowflake, data vault), ingestion pipelines, and CDC strategies with schema evolution and performance tuning.
Implement data governance, security, and compliance aligned with GxP, HIPAA, and 21 CFR Part 11.
Enable data science and ML workflows using MLflow, Feature Store, and curated datasets.
Collaborate with clinical operations and biometrics teams to deliver business-aligned solutions.
Experience
15+ years in data architecture/engineering; 5+ years with Azure; 3+ years with Databricks.
Azure Expertise: ADLS Gen2, Data Factory/Fabric pipelines, Synapse/SQL, Event Hubs, Functions, Key Vault, Private Endpoints, VNets.
Databricks Expertise: Spark (PySpark/SQL), Unity Catalog, Delta Live Tables (DLT), Workflows, MLflow, Feature Store.
Data Modeling: Star/snowflake, data vault, CDC, schema evolution, performance tuning.
Programming: PySpark, SQL; bonus: Python (pandas), Scala, dbt.
Governance & Security: IAM/RBAC/ABAC, row/column-level security, encryption, masking/tokenization, secrets, audit.
Observability & Reliability: Monitoring, lineage, alerting, CI/CD (GitHub Actions/Azure DevOps), automated testing/validation.
Skills
Clinical trial data standards (CDISC: CDASH, SDTM, ADaM) and systems (EDC, CTMS, IRT).
Familiarity with decentralized trials, real-world data and evidence (RWD/RWE), and regulatory compliance frameworks.
Strong stakeholder management and communication skills.
Ability to translate complex technical concepts into business value.
Leadership in cross-functional teams and mentoring engineers.
Education
Bachelor’s/Master’s in Computer Science, Data Engineering, or a related field.
Certifications: Azure Data Engineer (DP-203), Databricks Architect, Azure Solutions Architect (AZ-305), GxP/CSV.