Data Integration Engineer

Ensure data quality and reliability across the ETL pipeline, collaborate with machine learning teams, and become a domain expert in connected vehicle data.

At Viaduct, we use patented AI to discover hidden patterns in complex time-series data so that manufacturers and operators of connected equipment can deliver transformative business results from their data, fast. Our platform delivers solutions across the equipment lifecycle, from manufacturing productivity and quality to service operations and fleet management.

Who You Are

You are a thoughtful engineer. You understand the complexities of distributed systems and how to triage and solve the issues that arise with them. Scalability is top of mind whenever you design a system or write code. You believe building a better ETL system requires close collaboration with the machine learning and data science teams.

About the Role

As a data engineer at Viaduct, an analytics and ML platform company, your work is critical to our success. You are responsible for ensuring data quality and reliability in every part of our ETL pipeline, from ingestion to client integrations.

Responsibilities

  • Creating and supporting batch, incremental, and real-time ETL pipelines
  • Standardizing ingestion, validation, and cleaning processes across clients
  • Automating data validation to increase data quality
  • Managing and evolving schemas in all parts of our pipeline
  • Monitoring, tuning, and optimizing Spark jobs
  • Becoming a domain expert in connected vehicle data

About You

  • 4+ years as a data engineer
  • Experience as a tech lead or mentor
  • Proficiency in Python/Scala/Java/C++ and SQL
  • 3+ years of experience with Spark or equivalent technologies
  • 2+ years of experience with a workflow scheduler (Airflow, Prefect, Argo, etc.)
  • 2+ years of experience with distributed file systems (HDFS, S3, etc.)
  • Familiarity with tools in the open-source data ecosystem (Apache, CNCF, etc.)
  • Experience with incremental or real-time processing (Delta Lake, Apache Hudi, Kafka Streams, Spark Streaming, etc.)

Security and Privacy Responsibilities

  • Follow our policy and procedure documents related to security and privacy
  • Follow the security and privacy guidelines in the Employee Handbook
  • Participate in new hire and annual training for security and privacy
  • Treat data security and privacy as one of your primary job responsibilities
  • Report any security incidents you discover as you would bugs
  • Get approval from the Security Team before adding new third-party software to our codebase
  • Explicitly consider security implications when doing PR reviews

Bonus

  • Experience with Kubernetes
  • Experience working with ML teams
  • Contributions to open-source projects
  • Experience in the Automotive industry or a love of cars
  • Prior work in small, agile teams

 
