How to Apply
Applicants are required to submit both a resume and a cover letter in PDF format for consideration. Within the cover letter, please provide responses to the following questions:
Question 1: How many HPC clusters have you deployed, how did you deploy them, and what hardware and software were included?
Question 2: How did you develop your knowledge and skills in HPC?
Question 3: What is your preferred method/software for cluster provisioning?
Only candidates who provide complete application materials, including responses to the above questions, will be considered.
Job Overview:
The Cambridge HPC AI Technologist is a field-based consultant who builds end-to-end research computing system solutions. You will leverage your expertise in scientific computing and knowledge of the technology landscape to drive outcomes that exceed client expectations.
Responsibilities and Duties:
Gather client requirements and design optimized solutions, sometimes using a single vendor's portfolio and more often using a broad variety of vendors and technologies.
Deploy new HPC/AI solutions from the ground up or augment existing ones.
Consult on and assist with day-to-day management of clients’ research compute infrastructure environments.
Maintain HPC/AI infrastructure in Linux-based environments for new and existing clients.
Lead technical discussions and be the face to the client in preparation for and during engagements.
Validate that solution designs meet client requirements and are technically feasible and deployable.
Ensure solutions are simple and easy to understand while taking into account the client’s overall capabilities and skills.
Scope out and detail professional services deliverables, setting clear client expectations.
Build documentation and provide knowledge transfer required for clients to support their environments.
Display expertise in storage, networking, data protection, digital archiving, and other infrastructure technologies.
Gain advanced expertise in, and certifications from, the vendors Cambridge uses in our solution stack.
Qualifications:
Candidates must have at least 5 years of experience providing deployment services or cluster administration.
University undergraduate degree in Computer Science, Computer Engineering, or a science-related field required.
Candidates must also display solid knowledge of GPU-focused hardware/software and Linux system administration (package management, IP networking, troubleshooting, etc.). They must also have solid fundamentals in cluster design/management technologies (Bright, Warewulf, xCAT, etc.), a background with storage technologies and parallel filesystems (Lustre, GPFS, BeeGFS, etc.), experience with networking and configuring network switches (Ethernet and InfiniBand), acquaintance with HPC schedulers (SLURM, UGE, LSF, etc.) and programming/libraries (MPI, CUDA, etc.), and proficiency with scripting (Bash, Python, etc.).
Candidates must have hands-on working knowledge of technology from industry leaders including AMD, DDN, Dell, HPE, IBM, Intel, Juniper, Lenovo, Microsoft, NVIDIA, Oracle, VAST, VMware, WEKA, and others.
As this is a field-based role, the employee must be able to work remotely, independently, and unsupervised. Travel is expected approximately 50% of the time, including short day trips.
Candidates must have impeccable communication skills, an ability to multitask, and high attention to detail. They must be effective problem solvers who are organized, creative, intellectually curious, comfortable with ambiguity, and able to work with different types of personalities.
Authorization to work in the United States on a full-time basis is required.