Data Engineer

Job Title: Data Engineer
Position Type: Full-Time, Remote
Working Hours: U.S. client business hours (with flexibility for pipeline monitoring and data refresh cycles)

About the Role:
Our client is seeking a Data Engineer to design, build, and maintain reliable data pipelines and infrastructure that deliver clean, accessible, and actionable data. This role requires strong software engineering fundamentals, experience with modern data stacks, and an eye for quality and scalability. The Data Engineer ensures data flows seamlessly from source systems to warehouses and BI tools, powering decision-making across the business.

Responsibilities:
Pipeline Development:

  • Build and maintain ETL/ELT pipelines using Python, SQL, or Scala.
  • Orchestrate workflows with Airflow, Prefect, Dagster, or Luigi (see the sketch after this list).
  • Ingest structured and unstructured data from APIs, SaaS platforms, relational databases, and streaming sources.
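
To give a flavor of the orchestration work above, here is a minimal sketch of a daily ETL DAG using Airflow's TaskFlow API (one of several orchestrators named in this posting, assuming Airflow 2.4+). The API endpoint, table name, and field names are hypothetical placeholders, not details of the client's stack.

```python
# Minimal sketch of a daily extract-transform-load DAG.
# Hypothetical source API and table names; assumes Apache Airflow 2.4+.
from datetime import datetime

import requests
from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["example"])
def orders_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a hypothetical orders API.
        resp = requests.get("https://api.example.com/v1/orders", timeout=30)
        resp.raise_for_status()
        return resp.json()

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Keep only the fields downstream models need; drop obviously bad rows.
        return [
            {"order_id": r["id"], "amount": float(r["amount"]), "created_at": r["created_at"]}
            for r in records
            if r.get("amount") is not None
        ]

    @task
    def load(rows: list[dict]) -> None:
        # In a real pipeline this would write to Snowflake/BigQuery/Redshift
        # via the appropriate provider hook; here we only log the row count.
        print(f"would load {len(rows)} rows into analytics.orders")

    load(transform(extract()))


orders_etl()
```

A Prefect or Dagster version would follow the same extract/transform/load shape; only the decorators and scheduling configuration change.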

Data Warehousing:

  • Manage data warehouses (Snowflake, BigQuery, Redshift).
  • Design schemas (star/snowflake) optimized for analytics.
  • Implement partitioning, clustering, and query performance tuning (see the sketch after this list).
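
As one illustration of partitioning and clustering, the sketch below issues BigQuery-style DDL from Python so that date-bounded analytics queries scan only the relevant partitions. The dataset, table, and column names are invented, and the google-cloud-bigquery client is an assumption; Snowflake and Redshift have their own equivalents (clustering keys, sort/dist keys).

```python
# Sketch: create a fact table partitioned by day and clustered by customer,
# so typical date-bounded queries scan less data. Names are hypothetical;
# assumes the google-cloud-bigquery client and default credentials.
from google.cloud import bigquery

DDL = """
CREATE TABLE IF NOT EXISTS analytics.fct_orders (
  order_id    STRING,
  customer_id STRING,
  order_ts    TIMESTAMP,
  amount      NUMERIC
)
PARTITION BY DATE(order_ts)
CLUSTER BY customer_id
"""


def main() -> None:
    client = bigquery.Client()   # picks up project and credentials from the environment
    client.query(DDL).result()   # .result() blocks until the DDL job finishes
    print("analytics.fct_orders is ready")


if __name__ == "__main__":
    main()
```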

Data Quality & Governance:

  • Implement validation checks, anomaly detection, and logging for data integrity (illustrated in the sketch after this list).
  • Enforce naming conventions, lineage tracking, and documentation (dbt, Great Expectations).
  • Maintain compliance with GDPR, HIPAA, or industry-specific regulations.
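
Tools such as dbt tests or Great Expectations formalize this kind of validation; purely as a hand-rolled illustration, the sketch below runs a few row-level checks with pandas before a batch is allowed to load. The column names and thresholds are hypothetical.

```python
# Minimal hand-rolled data quality gate (frameworks like Great Expectations
# or dbt tests provide richer versions of this). Column names are hypothetical.
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("dq")


def validate_orders(df: pd.DataFrame) -> bool:
    """Return True if the batch passes basic integrity checks."""
    checks = {
        "no null order_id": df["order_id"].notna().all(),
        "order_id unique": df["order_id"].is_unique,
        "amount non-negative": (df["amount"] >= 0).all(),
    }
    for name, passed in checks.items():
        log.info("%s: %s", name, "PASS" if passed else "FAIL")
    return all(checks.values())


if __name__ == "__main__":
    batch = pd.DataFrame({"order_id": ["a1", "a2", "a3"], "amount": [10.0, 0.0, 25.5]})
    if not validate_orders(batch):
        raise SystemExit("data quality checks failed; blocking downstream load")
```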

Streaming & Real-Time Data:

  • Develop and monitor streaming pipelines with Kafka, Kinesis, or Pub/Sub (see the sketch after this list).
  • Ensure low-latency ingestion for time-sensitive use cases.
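
As a minimal sketch of low-latency ingestion, the snippet below consumes JSON events from a Kafka topic using the kafka-python library. The topic name, broker address, and message shape are hypothetical; Kinesis or Pub/Sub consumers would follow the same read-validate-write loop.

```python
# Sketch of a streaming consumer using the kafka-python library.
# Topic, broker address, and message shape are hypothetical.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders.events",                      # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="orders-ingest",
    auto_offset_reset="latest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # In a real pipeline: validate, enrich, and write to the warehouse or a
    # staging bucket in small batches to keep end-to-end latency low.
    print(f"partition={message.partition} offset={message.offset} id={event.get('id')}")
```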

Collaboration:

  • Partner with analysts and data scientists to provide curated, reliable datasets.
  • Support BI teams in building dashboards (Tableau, Looker, Power BI).
  • Document data models and pipelines for knowledge transfer.

Infrastructure & DevOps:

  • Containerize data services with Docker and orchestrate in Kubernetes.
  • Automate deployments via CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI).
  • Manage cloud infrastructure using Terraform or CloudFormation.

What Makes You a Perfect Fit:

  • Passion for clean, reliable, and scalable data.
  • Strong problem-solving skills and a debugging mindset.
  • Balance of software engineering rigor and data intuition.
  • Collaborative communicator who thrives in cross-functional environments.

Required Experience & Skills (Minimum):

  • 3+ years in data engineering or back-end development.
  • Strong Python and SQL skills.
  • Experience with at least one major data warehouse (Snowflake, Redshift, BigQuery).
  • Familiarity with pipeline orchestration tools (Airflow, Prefect).

Ideal Experience & Skills:

  • Experience with dbt for transformations and data modeling.
  • Streaming data experience (Kafka, Kinesis, Pub/Sub).
  • Cloud-native data platforms (AWS Glue, GCP Dataflow, Azure Data Factory).
  • Background in regulated industries (healthcare, finance) with strict compliance.

What Does a Typical Day Look Like?
A Data Engineer’s day revolves around keeping pipelines running, improving reliability, and enabling teams with high-quality data. You will:

  • Check pipeline health in Airflow/Prefect and resolve any failed jobs.
  • Ingest new data sources, writing connectors for APIs or SaaS platforms.
  • Optimize SQL queries and warehouse performance to reduce costs and latency.
  • Collaborate with analysts/data scientists to deliver clean datasets for dashboards and models.
  • Implement validation checks to prevent downstream reporting issues.
  • Document and monitor pipelines so they’re reproducible, scalable, and audit-ready.

In essence, you ensure the business has accurate, timely, and trustworthy data powering every decision.

Key Metrics for Success (KPIs):

  • Pipeline uptime ≥ 99%.
  • Data freshness within agreed SLAs (hourly, daily, weekly); one way to automate this check is sketched after this list.
  • Zero critical data quality errors reaching BI/analytics.
  • Cost-optimized queries and warehouse performance.
  • Positive feedback from data consumers (analysts, scientists, leadership).
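
As an example of how a freshness SLA might be monitored in practice, the sketch below compares a table's newest timestamp against a one-hour SLA and raises if the data is stale. The table name, SLA, and BigQuery client are assumptions for illustration only.

```python
# Sketch: alert when a table's latest record is older than its freshness SLA.
# Table name, SLA, and the query interface are hypothetical placeholders.
from datetime import datetime, timedelta, timezone

from google.cloud import bigquery

SLA = timedelta(hours=1)  # e.g. "hourly" freshness for this table
QUERY = "SELECT MAX(order_ts) AS latest FROM analytics.fct_orders"


def check_freshness() -> None:
    client = bigquery.Client()
    row = list(client.query(QUERY).result())[0]
    lag = datetime.now(timezone.utc) - row.latest
    if lag > SLA:
        # In practice this would page on-call or post to a team channel.
        raise RuntimeError(f"freshness SLA breached: data is {lag} old (SLA {SLA})")
    print(f"freshness OK: {lag} behind (SLA {SLA})")


if __name__ == "__main__":
    check_freshness()
```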

Interview Process:

  • Initial Phone Screen
  • Video Interview with Pavago Recruiter
  • Technical Task (e.g., build a small ETL pipeline or optimize a SQL query)
  • Client Interview with Engineering/Data Team
  • Offer & Background Verification

Pavago - Connecting You to Global Remote Opportunities 🌍
At Pavago, we redefine the boundaries of talent recruitment. Dive into a world where your geographical location doesn't restrict your career aspirations. As a distinguished international recruitment agency, we specialize in connecting remote talent with companies eager to tap into global expertise.

🌟 Why Consider Opportunities Through Pavago?

  • Competitive Pay: Command the salary you deserve, regardless of where you reside.
  • Broad Horizons: Unlock a wide array of remote positions spanning diverse industries and regions.
  • Skill Enrichment: Work alongside international teams, contribute your unique insights, and amplify your career trajectory.

Whether you're a seasoned professional hunting for a novel global venture or a budding talent keen on leaving an international imprint, Pavago is your conduit to businesses that appreciate and seek out worldwide perspectives.

Embrace a realm where opportunities transcend borders. Together, let's pioneer the next era of remote work. 🚀 Explore global opportunities with us today!
