Neural Magic is hiring a

DevOps and IT Admin Engineer

Somerville, United States
Full-Time

About Neural Magic

Based in Somerville, Massachusetts, Neural Magic is a Series A startup backed by leading investors including Andreessen Horowitz, NEA, Pillar, VMware, Verizon Ventures, Comcast Ventures, and Amdocs. At Neural Magic, we believe the future of AI is open, and we are on a mission to bring the power of open-source LLMs and vLLM to every enterprise on the planet. Neural Magic accelerates AI for the enterprise and brings operational simplicity to GenAI deployments. As a leading developer and maintainer of the vLLM project and inventor of state-of-the-art techniques for model quantization and sparsification, Neural Magic provides a stable platform for enterprises to build, optimize, and scale LLM deployments.

Our Mission

Neural Magic is on a mission to bring the power of open-source LLMs and vLLM to every enterprise on the planet.

Your Role

As a DevOps and IT Admin Engineer, you will manage and scale our Kubernetes infrastructure, cloud offerings, and network storage. This role involves hands-on data center tasks, including troubleshooting hardware failures, racking new servers, and managing internal networking and VPN. You will collaborate with ML Ops engineers, ML researchers, and the engineering team to support research training runs, performance benchmarking, and CI/CD. Additionally, you will contribute to the product roadmap by providing insights on scaling inference serving loads using vLLM, Kubernetes, Helm charts, and other technologies.

Join us in shaping the future of AI!

Responsibilities

  • Kubernetes Management: Oversee and improve our Kubernetes infrastructure, ensuring optimal performance and scalability.
  • Cloud Infrastructure: Manage cloud offerings across multiple regions, ensuring fast access and reliability.
  • Network Storage: Maintain and enhance our network storage solutions, ensuring data integrity and availability.
  • Data Center Operations: Troubleshoot hardware failures, rack new servers, and manage internal networking infrastructure and VPN.
  • Collaboration: Work closely with ML Ops engineers, ML researchers, and other engineering team members to support scalable research training runs, performance benchmarking, and CI/CD.
  • Product Roadmap Contribution: Provide insights and opinions on the product roadmap, focusing on scaling inference serving loads through vLLM, Kubernetes, and Helm charts.
  • Performance Monitoring: Implement monitoring solutions to ensure the health and performance of all infrastructure components.

Requirements

  • Experience: 5 or more years in DevOps, IT administration, or a related field.
  • Kubernetes Proficiency: Strong understanding and hands-on experience with Kubernetes.
  • Cloud Services: Experience with cloud platforms (e.g., AWS, GCP, Azure) and multi-region deployments.
  • Networking Skills: Solid knowledge of network infrastructure, VPN setup, and network troubleshooting.
  • Hardware Management: Experience in managing physical servers, including racking, maintenance, and troubleshooting.
  • Collaboration Skills: Ability to work effectively within a team and collaborate with cross-functional teams, including ML Ops, researchers, and engineering.
  • Problem-Solving: Strong analytical and problem-solving skills with a proactive approach to addressing challenges.
  • Technological Insight: Ability to inform and influence the product roadmap with technical insights and recommendations.

Benefits

  • Competitive compensation and stock option plan
  • Comprehensive health care (medical, dental, vision)
  • Retirement plan (401(k), IRA)
  • Generous paid time off (vacation, sick leave, holidays)
  • Family leave (maternity, paternity)
  • Disability coverage
  • Professional development opportunities
  • Flexible work arrangements
  • Wellness resources
  • Free food and snacks (in the office)

Neural Magic is an equal-opportunity employer committed to fostering a diverse and inclusive workplace. All applicants will be considered for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, or veteran or disability status.
