Senior Data Engineer

About Us

Abacus Insights is transforming how data works for health plans. Our mission is simple: make healthcare data usable, so the people responsible for care and cost decisions can act faster, with confidence.  
We help health plans break down data silos to create a single, trusted data foundation. That foundation powers better decisions, so plans can improve outcomes, reduce waste, and deliver better experiences for members and providers alike.

Backed by $100M from top investors, we’re tackling big challenges in an industry that’s ready for change. Our platform enables GenAI use cases by delivering clean, connected, and reliable healthcare data that can support automation, prioritization, and decision workflows, and it’s why we are leading the way.

Our innovation begins with people. We are bold, curious, and collaborative—because the best ideas come from working together. Ready to make an impact? Join us and let's build the future together.

About the role 

We are seeking an accomplished Senior Data Engineer to join our dynamic and rapidly expanding Tech Ops division. With significant projected growth, this is an opportunity to drive meaningful technical impact. In this role, you will work with internal engineering teams to design, implement, and optimize complex data integration solutions within a modern, large-scale cloud environment.

You will leverage advanced skills in distributed computing, data architecture, and cloud-native engineering to enable scalable, resilient, and high-performance data ingestion and transformation pipelines. As a trusted technical advisor, you will ensure high-quality, compliant data operations across the lifecycle.

Your day to day 

  • Architect, design, and implement high-volume batch and real-time data pipelines using PySpark, SparkSQL, Databricks Workflows, and distributed processing frameworks. 
  • Build end-to-end ingestion frameworks integrating with Databricks, Snowflake, AWS services (S3, SQS, Lambda), and vendor data APIs, ensuring data quality, lineage, and schema evolution.
  • Develop data modeling frameworks, including star/snowflake schemas and optimization techniques for analytical workloads on cloud data warehouses. 
  • Translate complex business requirements into detailed technical specifications, engineering artifacts, and reusable components. 
  • Establish and enforce data engineering best practices, such as CI/CD for data pipelines, code versioning, automated testing, orchestration, logging, and observability patterns. 
  • Conduct performance profiling and optimize compute costs, cluster configurations, partitions, indexing, and caching strategies across Databricks and Snowflake environments. 
  • Produce high-quality technical documentation including runbooks, architecture diagrams, and operational standards. 
  • Participate in planning, design discussions, technical reviews, and continuous improvement activities as part of an iterative development lifecycle. 
  • Proactively monitor, troubleshoot, and support production data pipelines to ensure reliability, performance, and timely data availability. 
  • Identify root causes of data pipeline issues and implement corrective and preventive actions to improve operational stability and data quality. 
  • Mentor junior engineers through technical reviews, coaching, and training sessions for both internal teams and clients. 

What you bring to the team 

  • Bachelor’s degree in Computer Science, Computer Engineering, or a closely related technical field. 
  • 5+ years of hands-on experience as a Data Engineer working with large-scale, distributed data processing systems in modern cloud environments.
  • Working knowledge of U.S. healthcare data domains—including claims, eligibility, and provider datasets—and experience applying this knowledge to complex ingestion and transformation workflows. 
  • Strong ability to communicate complex technical concepts clearly across both technical and non-technical stakeholders.
  • Expert-level proficiency in Python, SQL, and PySpark, including developing distributed data transformations and performance-optimized queries.
  • Demonstrated experience designing, building, and operating production-grade ETL/ELT pipelines using Databricks, Airflow, or similar orchestration and workflow automation tools.
  • Proven experience architecting or operating large-scale data platforms using dbt, Kafka, Delta Lake, and event-driven/streaming architectures, within a cloud-native data services or platform engineering environment, requiring specialized knowledge of distributed systems, scalable data pipelines, and cloud-scale data processing.
  • Experience working with structured and semi-structured data formats such as Parquet, ORC, JSON, and Avro, including schema evolution and optimization techniques.
  • Strong working knowledge of AWS data ecosystem components (including S3, SQS, Lambda, Glue, and IAM) or equivalent cloud technologies supporting high-volume data engineering workloads.
  • Proficiency with Terraform, infrastructure-as-code methodologies, and modern CI/CD pipelines (e.g., GitLab) supporting automated deployment and versioning of data systems.
  • Deep expertise in SQL and compute optimization strategies, including Z-Ordering, clustering, partitioning, pruning, and caching for large-scale analytical and operational workloads.
  • Hands-on experience with major cloud data warehouse platforms such as Snowflake (preferred), BigQuery, or Redshift, including performance tuning and data modeling for analytical environments.

What we would like to see (not required):

  • Experience in large-scale healthcare or payer data environments. 

What you’ll get in return 

  • Competitive Leave & Benefits
  • Comprehensive health coverage
  • Equity for every employee – share in our success
  • Growth-focused environment – your development matters here

Working Arrangements

  • Standard hours: 9 hours/day, 5 working days
  • Location: Onsite
  • Shift: 10 AM – 7 PM local time

Our Commitment as an Equal Opportunity Employer

As a mission-led technology company helping to drive better healthcare outcomes, Abacus Insights believes that the best innovation and value we can bring to our customers comes from diverse ideas, thoughts, experiences, and perspectives. Therefore, we dedicate resources to building diverse teams and providing equal employment opportunities to all applicants. Abacus prohibits discrimination and harassment regarding race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state, or local laws.

At the heart of who we are is a commitment to continuously and intentionally building an inclusive culture—one that empowers every team member across the globe to do their best work and bring their authentic selves. We carry that same commitment into our hiring process, aiming to create an interview experience where you feel comfortable and confident showcasing your strengths. If there’s anything we can do to support that—big or small—please let us know.
