All engineers and researchers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.
Tech Stack
Python / Rust / C++
JAX and XLA
NCCL
CUDA (C++ and Triton)
Location
The role is based in the Bay Area (San Francisco and Palo Alto). Candidates are expected to be located in or near the Bay Area or to be open to relocation.
Focus
Design, build, and implement large-scale distributed training systems.
Profile, debug, and optimize multi-host GPU utilization.
Hardware / Software / Algorithm co-design.
Maintain and innovate on the codebase.
Build tools to boost the productivity of the team.