Senior Data Engineer: Data Lake (Remote)

AI overview

Focus on building and operating data platform components while improving API services, data quality frameworks, and LLM-powered tools to enhance platform reliability.

About us

Constructor is the next-generation platform for search and discovery in ecommerce, built to explicitly optimize for metrics like revenue, conversion rate, and profit. Our search engine was built entirely in-house using transformers and generative LLMs, and we use its core and personalization capabilities to power everything from search itself to recommendations to shopping agents. Engineering is by far our largest department, and we’ve built our proprietary engine to be the best on the market, having never lost an A/B test to a competing technology. We’re passionate about maintaining this and work on the bleeding edge of AI to do so.

Out of necessity, our engine is built for extreme scale and powers over 1 billion queries every day across X languages and with customers based out of Y countries. It is used by some of the biggest ecommerce companies in the world like Sephora, Under Armour, and Petco.

We’re a passionate team who love solving problems and want to make our customers’ and coworkers’ lives better. We value empathy, openness, curiosity, continuous improvement, and are excited by metrics that matter. We believe that empowering everyone in a company to do what they do best can lead to great things.

Constructor is a U.S.-based company that has been in the market since 2019. It was founded by Eli Finkelshteyn and Dan McCormick, who still lead the company today.

Job Description

The Constructor Data Platform is a foundational component for all internal data and ML teams. It handles the ingestion of over 2 TB of compressed events daily and manages over 6 PB of data in our data lake. 

The Data Platform:

  • Is a comprehensive set of tools and infrastructure used daily by every data scientist and ML engineer in our company.
  • Implements public-facing APIs for event ingestion (FastAPI) and real-time analytics (ClickHouse, Cube).
  • Manages data storage in appropriate formats (S3, ClickHouse, Delta).
  • Facilitates data processing using technologies such as Python, Spark/Databricks, ClickHouse, AWS Lambda, and Kinesis.
  • Includes robust monitoring solutions (Prometheus, OpenTelemetry, PagerDuty, Sentry).
  • Ensures automated testing of pipelines and data quality.
  • Provides cost observability and optimization capabilities.
  • Offers comprehensive tools for developers to develop, run, test, and schedule data pipelines, along with all necessary support and documentation.

Our platform is developed by the Data Lake Team and the Data Infrastructure Team.

About the Data Lake Team

We're hiring a Senior Data Engineer to work on our Data Lake Team. Here is what we do day to day:

  • Maintain the data pipeline job framework.
  • Develop the Data Quality framework (an internal set of tools for validating internal and external data sources).
  • Maintain and develop the public-facing data ingestion service, which handles 17,000+ RPS.
  • Maintain and develop core batch and streaming data pipelines.
  • Act as the last line of support for our internal platform users.
  • Take part in an on-call rotation for data platform incidents (shared across the team).

Requirements

  • Fluent English
  • 4+ years building production services and data pipelines (batch and/or streaming)
  • Strong experience with Python, or readiness to ramp up quickly
  • Hands-on experience with at least one MPP system (Spark, Trino, Redshift, etc.)
  • Hands-on experience operating services in a cloud environment (AWS preferred)

Nice to have

  • Terraform/CloudFormation or other IaC tools
  • ClickHouse or similar analytical databases
  • Experience with data quality/observability tools

Your primary focus will be on building and operating data platform components (data quality, data pipelines, infrastructure, monitoring), with opportunities to contribute to API services and LLM-powered analytics tools. You’ll work closely with data scientists, ML engineers, and analytics teams to understand their needs, gather feedback, and improve platform reliability and usability. Here are some of the projects you may be involved with:

  • Move Data Platform configuration to IaC using Terraform.
  • Take part in the development of the Data Quality framework and drive its adoption in the company.
  • Improve BI self-service through LLM-powered tools.
  • Migrate batch workloads to streaming solutions to ensure data is delivered in a timely manner.

Benefits

  • 🏝️ Unlimited vacation time - we strongly encourage all employees to take at least 3 weeks per year
  • 🌎 Fully remote team - choose where you live
  • 🛋️ Work from home stipend - we want you to have the resources you need to set up your home office
  • 💻 Apple laptops provided for new employees
  • 🧑‍🎓 Training and development budget - refreshed each year for every employee
  • 👪 Maternity & Paternity leave for qualified employees
  • 🧠 Work with smart people who will help you grow and make a meaningful impact
  • 💵 Base salary: $80k–$120k USD, depending on knowledge, skills, experience, and interview results
  • 📈 Stock options - offered in addition to the base salary
  • 🎉 Regular team offsites to connect and collaborate

Diversity, Equity, and Inclusion at Constructor

At Constructor.io we are committed to cultivating a work environment that is diverse, equitable, and inclusive. As an equal opportunity employer, we welcome individuals of all backgrounds and provide equal opportunities to all applicants regardless of their education, diversity of opinion, race, color, religion, gender, gender expression, sexual orientation, national origin, genetics, disability, age, veteran status or affiliation in any other protected group.

Studies have shown that women and people of color may be less likely to apply for jobs unless they meet every one of the qualifications listed. Our primary interest is in finding the best candidate for the job. We encourage you to apply even if you don’t meet all of our listed qualifications.

