Minimum Qualifications
- Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field, or equivalent practical experience.
- 10 years of experience in computer architecture, chip architecture, or hardware-software co-design.
- Experience developing systems for performance modeling, simulation, or system analysis.
Preferred Qualifications
- Master's degree or PhD in Electrical Engineering, Computer Engineering, or Computer Science, with an emphasis on computer architecture.
- Experience architecting hardware solutions or performance optimizations for large-scale ML training and inference.
- Experience with deep learning frameworks such as TensorFlow or PyTorch.
- Deep understanding of ML trends, business drivers, and the software ecosystem.
- Ability to engage and collaborate with hardware designers, software architects, and ML researchers.
As a Staff Co-Design Engineer on the TPU Architecture team, you will act as a key technical anchor bridging the gap between model architecture innovation and next-generation hardware design. Operating cross-functionally across AI research and engineering, you will help shape the architectural roadmap for our future machine learning serving and training capabilities. You will drive the integration of Machine Learning (ML) research, such as the training and serving of massive foundation models, with advanced silicon architectures to deliver industry-leading, high-performance, and power-efficient accelerators.
The AI and Infrastructure team is redefining what's possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability, and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide.
We're the driving force behind Google's groundbreaking innovations, empowering the development of our cutting-edge AI models, delivering unparalleled computing power to global services, and providing the essential platforms that enable developers to build the future. From software to hardware, our teams are shaping the future of world-leading hyperscale computing, with key teams working on the development of our TPUs, Vertex AI for Google Cloud, Google Global Networking, Data Center operations, systems research, and much more.
Individual pay is determined by factors including job-related skills, experience, and relevant education or training.
$192,000 - $279,000 (USD) + 20% bonus target + bonus + equity + benefits
Learn more about benefits at Google.
Responsibilities
- Drive the definition and optimization of the hardware/software stack to enable performant training and serving of large ML models.
- Collaborate with research and modeling teams to innovate on model architectures, focusing on scaling, quality, and their direct impact on hardware performance.
- Lead the development of configurable architectural simulators and cycle-accurate performance models to quantify microarchitectural optimizations and evaluate architectural decisions.
- Conduct system-level performance analysis across highly distributed ML systems, innovating new methodologies to balance compute, memory bandwidth, and inter-chip network requirements.
- Engage with partners across hardware design, compiler development, and ML research to transition architectural innovations from concept to production.