Description:
Must-have skills: PowerEdge Rack/Tower experience, NVIDIA certifications
Nice-to-have skills:
- PowerEdge XE server experience, NVIDIA QR Switches
• Deep hands-on experience with GPU deployment, configuration, and multi-node testing using NVIDIA Base Command Manager
• Proficiency with benchmarking tools: HPL, STREAM, NCCL, RCCL, MxP, OSU Microbenchmarks
• Red Hat certification (RHCSA/RHCE) or 7+ years of relevant experience with Red Hat distributions
• Experience with GenAI/HPC networking (InfiniBand and/or RoCE)
• Experience working in Linux-based parallel computing environments at scale
• Strong customer-facing and communication skills
Desirable Requirements
• Bachelor’s degree
• NVIDIA certifications (NCA, NCE, DGX)
• Experience with NVIDIA UFM, InfiniBand, and Spectrum-X fabrics
• Exposure to hybrid cloud or GPU cloud environments
• Experience with GPU observability/performance profiling tools
• Code Upgrade
o Perform cluster-level code upgrades per approved versions and compatibility guidelines.
• iDRAC Management
o Configuration, access validation, and health checks of iDRAC.
o Troubleshooting and lifecycle management support.
• Firmware Updates
o Update server, BIOS, NIC, storage, and related firmware.
o Ensure version alignment and post-update validation.
• Redfish
o Overview and usage of Redfish APIs.
o Customization and automation using Redfish for system management and monitoring.
• BlueField
o Configuration and management of BlueField DPUs.