Senior Infrastructure Engineer (Backend/Data Performance)

Design and optimize scalable data pipelines to enhance the Earth Species Project's mission in understanding animal communication using advanced AI.
About Earth Species Project

Earth Species Project (ESP) is a non-profit using AI to decode animal communication to find new ways of listening to and learning from the rest of nature. Our ultimate goal is to drive a renewed relationship with the rest of nature that allows the diversity of life to thrive.

ESP partners with biologists and machine learning researchers at leading universities and institutions around the world, and we are honored to be supported by many forward-looking philanthropists and groups, including Reid Hoffman, Waverley Street Foundation, the Paul G Allen Family Foundation and the National Geographic Society.

About the role

We're looking for a Senior Infrastructure Engineer (Backend/Data Performance) to help us build the foundational pipelines that power Earth Species Project's mission to understand animal communication with advanced AI. You’ll design and optimize scalable systems that let our researchers experiment faster, with production-quality reliability. Your work will focus on data infrastructure and backend performance, creating pipelines and storage layers that can handle diverse species data at scale. You’ll collaborate closely with researchers, engineers, and external partners to make complex AI workflows simple, efficient, and reliable.

In this role you will
  • Design and optimize high-performance data pipelines for distributed training and storage (using tools like Arrow, DuckDB, LanceDB, BigQuery, vector databases).
  • Focus on low-level optimizations (latency, throughput, reliability, GPU usage).
  • Build monitoring and visualization tools for tracking data quality, pipeline performance, and experiments.
  • Optimize distributed AI workloads for reliability, latency, and efficiency.
  • Scope and supervise projects so that interns, PhD students, and post-docs can contribute and collaborate effectively.
  • Support recruiting efforts and help shape the growth of the infrastructure team.

Your background looks something like
  • 5+ years of backend or infrastructure engineering experience
  • Strong Python programming skills (bonus points for lower-level languages)
  • Experience with distributed systems and cloud platforms (AWS, GCP, Azure)
  • Hands-on experience with containerization (Docker, Kubernetes) and infrastructure as code (Terraform)
  • Experience building or supporting ML/AI infrastructure in production
  • Experience with high-performance data tools (DuckDB, Apache Spark, Delta Lake)
  • GPU orchestration and large-scale model training experience
  • Familiarity with ML platforms (SageMaker, Vertex AI) and frameworks (PyTorch, JAX)
  • Experience mentoring junior engineers, interns, or researchers and breaking down complex projects into manageable tasks
  • Experience participating in technical hiring processes and evaluating candidates

It would be even better if you
  • Have deep knowledge of training architectures, CUDA programming, or TPU optimization
  • Have full-stack development experience with frameworks like React for building web applications
  • Have experience managing HPC infrastructure with tools like Slurm or Kubernetes clusters
  • Have a background in monitoring stacks (Prometheus, Grafana) for ML pipeline observability

About the hiring process
  • Initial interview: 30-minute discussion to align on experience and expectations
  • Technical screening: Two interviews and a take-home exercise covering coding and system design
  • Panel interview: Assess team alignment
  • Final interview: Conversation with our Chief Scientist

ESP is committed to equal employment opportunities regardless of race, color, religion, gender, gender identity or expression, pregnancy, sexual orientation, marital status, ancestry, national origin, genetics, disability, age, veteran status, and criminal history, consistent with legal requirements. We encourage folks of all backgrounds and perspectives to apply.

    If you require any accommodations, please email us at [email protected], and we’ll work with you to meet your accessibility needs.

    Perks & Benefits

    • Flexible Work Hours: Flexible working hours
    • Home Office Stipend: 2,000 USD home office stipend
    • Team retreats: Regular team retreats around the world
    • Paid Time Off: Unlimited paid time off, with a recommended minimum of three weeks per year

    An open-source collaborative and nonprofit dedicated to decoding animal communication. We are motivated by the recent monumental milestone in machine learning: the invention of techniques that can translate languages without dictionaries. We believe th...

    Salary
    $225,500 – $235,500 per year