DCGPU Customer Solutions Firmware - Vancouver, Canada - Advanced Micro Devices, Inc

Posted by: Sophia Lee, beBee Recruiter


Description

Overview:

WHAT YOU DO AT AMD CHANGES EVERYTHING
We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world.

Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded.

Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

AMD together we advance_


Responsibilities:


DCGPU Customer Solutions Firmware (FW) Enablement Leader

The Role:


As the AI-centric data center GPU landscape evolves, our focus sharpens on module, chassis, rack, and data center level management firmware.

Because each customer that we and our partners engage with has distinct requirements, we aim to scale our solutions for universal support.

To achieve this, we are introducing a key role centered around holistic DCGPU customer solutions firmware enablement.


Key Responsibilities:


  • Co-development of Firmware: Collaborate with internal teams and partners to tailor our FW based on unique customer needs, ensuring scalability and adaptability.
  • Partner Ecosystem Engagement: Drive and synchronize initiatives with our ecosystem partners to ensure coherence in firmware development and implementation.
  • End-to-End Firmware Stack Validation: Ensure that the complete firmware stack is reliable, efficient, and meets customer expectations.
  • Standards Bodies Engagements: Actively participate in and drive standards body discussions and engagements to promote our vision and ensure our solutions remain at the forefront of industry standards.
  • Partner Enablement: Empower our partners by guaranteeing smooth integration and assured interoperability with their host system management capabilities.
  • Expert Triage & Debug Support: Offer expert-level support during development, execution, and customer qualification cycles, ensuring the timely resolution of any challenges.
  • Team Leadership: Lead and mentor a dedicated team, ensuring close collaboration with our system development, validation, and central engineering firmware teams.

Preferred Experience:


  • Knowledge of HPC and AI systems, especially server/cluster management requirements such as power management, thermal solutions, node orchestration, security features, hardware accelerator integration, and system health monitoring.
  • Proven experience in firmware development, validation, and debugging.
  • Strong leadership skills with a track record of leading cross-functional teams.
  • Excellent interpersonal and communication skills, with an ability to foster collaboration and drive consensus among various stakeholders.

Preferred Qualifications:

  • Prior experience in server systems and datacenter GPU architecture
  • Prior experience leading complex FW development and QA projects
  • Familiarity with international standards bodies and industry consortiums.

Academic Credentials:


  • Bachelor's Degree (BS) in Electrical Engineering, Computer Engineering, Computer Science or other relevant field
  • Master's Degree (MS) preferred

Qualifications:

  • Benefits offered are described: AMD benefits at a glance.
