Culver City, United States

Overview:

Spotter is a platform for Creators, providing services and software designed to accelerate growth for the world’s best Creators and brands. Creators working with Spotter can access the capital, knowledge, community, and personalized AI software products they need to succeed. With unique knowledge of how Creators work, the resources they need to grow, and the challenges they face, Spotter is empowering top YouTube Creators to succeed.

Spotter has already deployed over $940 million to YouTube Creators to reinvest in themselves and accelerate their growth, with plans to reach $1 billion in investment by 2024. With a premium catalog that spans over 725,000 videos, Spotter generates more than 88 billion monthly watch-time minutes, delivering a unique scaled media solution to Advertisers and Ad Agencies that is transparent, efficient, and 100% brand safe. For more information about Spotter, please visit https://spotter.com.

Role Overview:

The successful candidate will be responsible for processing very large data sets (billions of records) using distributed data processing frameworks such as Apache Spark.

Must have:

Extensive experience working with very large data sets and creating performant, scalable ETL pipelines using Spark
In-depth understanding of performance bottlenecks in large-scale data processing

What You’ll Do:

Are you ready to help lead the charge in shaping the data-driven future of Spotter? We're in search of an exceptional Principal Data Engineer who will play a pivotal role in designing, building, and optimizing scalable data infrastructure. You will build data pipelines for acquiring and transforming large datasets, and optimize the storage and querying of varied data to support a wide range of use cases, from Analytics to Creator Products to Operations, using both traditional and ML-focused access patterns. You will be a key player in empowering us to make data-informed decisions that will fuel our innovation and growth.

Develop and maintain scalable data pipelines, including:

ETL pipelines, both single-node and multi-node solutions
Build data quality assurance steps for new and existing pipelines
Create derived datasets with augmented properties
Work on analytics-ready datasets to power internal and creator-facing tools
Troubleshoot issues when they arise, working directly with internal data consumers
Automate pipeline runs with scheduling and orchestration tools

Work with large-scale datasets
Use various external APIs to enrich data
Set up database tables for analytics users to consume the data collected by the Data Engineering team
Work with big data technologies to improve data availability and data quality in the cloud (AWS)
Lead development of projects involving other team members and act as a mentor
Actively participate in team discussions about technology, architecture, and solutions for new projects and for improving existing code and pipelines

Who You Are:

10+ years of Data Engineering experience, ideally with 2-4 years of software engineering experience as well
5+ years of experience with Apache Spark or Apache Flink
4+ years of experience running software and services in the cloud
Proficiency in working with DataFrame APIs (Pandas and Spark) for parallel and single-node processing
Proficiency using advanced language features and techniques in Python, Scala, etc., along with modern data-optimized file formats such as Parquet and Avro
Proficiency with SQL on RDBMS and data warehouse solutions like Redshift
Hands-on experience with Data Lake technologies like Delta Lake and Iceberg
Experience with data acquisition from external APIs at large scale, using parallel processing
Experience supporting ML/AI projects: deploying pipelines that compute features and use models for inference on large datasets
Bachelor’s degree (or equivalent work experience), preferably in a Computer Science-related field

Additional Valued Skills:

Experience with YouTube APIs
Experience with AWS Glue metastore
Experience with Data-Mesh approaches
Experience with data cataloging, data lineage, and data governance tools and approaches
Experience with vector databases

Why Spotter:

Medical and vision insurance covered up to 100%
Dental insurance
401(k) matching
Stock options
Complimentary gym access
Autonomy and upward mobility
Diverse, equitable, and inclusive culture, where your voice matters

In compliance with local law, we are disclosing the compensation, or a range thereof, for roles that will be performed in Culver City. Actual salaries will vary and may be above or below the range based on various factors including but not limited to skill sets; experience and training; licensure and certifications; and other business and organizational needs. The overall market range for roles in this area of Spotter is typically $200K-$215K salary per year. The range listed is just one component of Spotter’s total compensation package for employees. Other rewards may include an annual discretionary bonus and equity.

Spotter is an equal opportunity employer. Spotter does not discriminate in employment on the basis of race, religion, creed, color, national origin, ancestry, citizenship, physical or mental disability, medical condition, genetic characteristics or information, marital status, sex (including pregnancy, childbirth, breastfeeding, and related medical conditions), gender, gender identity, gender expression, age, sexual orientation, military status, veteran status, use of or request for family or medical leave, political affiliation, or any other status protected under applicable federal, state or local laws.

Equal access to programs, services and employment is available to all persons. Those applicants requiring reasonable accommodations as part of the application and/or interview process should notify a representative of the Human Resources Department.
