Join a well-funded, cutting-edge hardware startup in Silicon Valley as an Accelerator Microarchitecture Performance Modeling Engineer.
Responsibilities and opportunities in this role include functional and cycle-accurate simulator development, architectural and microarchitectural design-space exploration for programmable accelerators, and analysis and optimization of modern, highly parallel applications.
Our mission is to reimagine silicon and create accelerated computing platforms that will transform the industry. You will have the opportunity to work with some of the most talented and passionate engineers in the world to create designs that push the envelope on performance, energy efficiency, programmability and scalability.
You will also have the opportunity to explore many adjacent areas of research and engineering, cutting across the many levels of abstraction involved in building computing machinery: ISA design, application software, compiler optimization, RTL design, RTL correlation, design verification, test writing, and power/area analysis.
We offer a fun, creative, collaborative and flexible work environment, where you can contribute to our vision of building server-class compute machines that fulfill the promise and potential of hardware-software co-design, while also learning every day.
Requirements
- In-depth knowledge of CPU/GPU Computer Architecture and Microarchitecture.
- Excellent coding skills in C/C++
- Strong understanding of workloads and benchmarks in the Machine Learning space
- Solid appreciation for the basics of SIMT processing, cache and memory hierarchies
- Knowledge of performance modeling concepts - analytical, functional and cycle-accurate modeling
- Knowledge of performance improvement concepts - bottleneck analysis, latency hiding, speculative execution, shared resource arbitration, scheduling, buffer sizing, replacement policies
- Ability to work well in a team, take ownership of tasks, embrace aggressive schedules, be self-motivated to learn, seek help, think clearly and communicate effectively
Responsibilities
- Performance modeling - develop functional and timing simulators in C++ that model the programmable processing cores of a data-parallel accelerator
- Performance analysis - configure and use the simulator to explore the architectural and microarchitectural design space
- Design space exploration - influence design choices through experiments and studies
- Performance testing - develop tests to evaluate the quality of the model and the RTL design
- Performance debug - identify and fix performance bottlenecks in tests/workloads/simulator
- Performance correlation - identify correct performance targets for tests/workloads and ensure that the RTL design meets those targets
- Workload analysis - develop a deep understanding of the characteristics of workloads in the target market - machine learning, data analytics, graph analytics
Education and Experience
- Bachelor’s degree with 2-4 years of experience in a relevant field
- Master’s degree with 1-2 years of experience in a relevant field
- PhD with internship experience in a relevant field