NVIDIA

Deep Learning Solutions Architect – Large Scale Inference Optimization

Sorry, this job was removed at 06:09 p.m. (GMT) on Thursday, Oct 09, 2025
Remote
6 Locations

NVIDIA’s Worldwide Field Operations (WWFO) team is seeking a Solutions Architect with a strong focus on Deep Learning and a deep understanding of neural network inference. With the introduction of NVIDIA Grace CPUs and Grace-Hopper / Grace-Blackwell systems, the CPU has become more tightly integrated into the AI platform than ever before. Innovations such as Chip-to-Chip NVLink and the significant expansion of the NVLink domain have enabled a wide range of new neural network architectures and approaches to training and inference.

The ideal candidate will be proficient with tools such as TRT-LLM, vLLM, SGLang, or similar, and will have strong systems knowledge, enabling customers to make full use of the new GB200 NVL72 systems (for example, by helping customers embrace disaggregated inference, working on efficient KV cache offloading, or helping with inference for new architectures such as hybrid or diffusion models). Solutions Architects work with the most exciting computing hardware and software, driving the latest breakthroughs in artificial intelligence! We need individuals who can enable customer productivity and develop lasting relationships with our technology partners, making NVIDIA an integral part of end-user solutions. We are looking for someone who is passionate about artificial intelligence, who can keep up with a fast-paced field, and who can coordinate efforts between corporate marketing, industry business development, and engineering.

What you will be doing:

  • Work directly with key customers to understand their technology and provide the best AI solutions.

  • Perform in-depth analysis and optimization to ensure the best performance on GPU architecture systems (in particular Grace/Arm-based systems). This includes supporting the optimization of large-scale inference pipelines.

  • Partner with Engineering, Product, and Sales teams to plan and develop the most suitable solutions for customers. Enable development and growth of product features through customer feedback and proof-of-concept evaluations.

What we need to see:

  • Excellent verbal and written communication skills and technical presentation skills in English.

  • MS/PhD or equivalent experience in Computer Science, Data Science, Electrical/Computer Engineering, Physics, Mathematics, or other Engineering fields.

  • 5+ years of work or research experience with Python, C++, or other software development.

  • Work experience with and knowledge of modern NLP, including a good understanding of transformer, state space, diffusion, and MoE model architectures. This can include expertise in either training or optimization/compression/operation of DNNs.

  • Understanding of key libraries used for NLP/LLM training (such as Megatron-LM, NeMo, DeepSpeed, etc.) and/or deployment (e.g. TensorRT-LLM, vLLM, Triton Inference Server).

  • Excitement about working with multiple levels and teams across organizations (Engineering, Product, Sales, and Marketing teams). Capable of working in a constantly evolving environment without losing focus.

  • Self-starter with a growth mindset and a passion for continuous learning and for sharing findings across the team.

Ways to Stand Out from The Crowd:

  • Experience running/debugging large-scale distributed DL training or inference.

  • Proven track record in optimizing neural network training performance and robustness, including implementing asynchronous checkpointing and optimizing CUDA kernels.

  • Understanding of HPC systems: data center design, high-speed interconnects such as InfiniBand, and design and/or management experience related to cluster storage and scheduling.

Widely considered to be one of the technology world’s most desirable employers, NVIDIA offers highly competitive salaries and a comprehensive benefits package. As you plan your future, see what we can offer to you and your family at www.nvidiabenefits.com/

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA London, England Office

13th Floor One Angel Court, London, United Kingdom, EC2R 7HJ

What you need to know about the London Tech Scene

London isn't just a hub for established businesses; it's also a nursery for innovation. The city boasts one of the most recognized fintech ecosystems in Europe, attracting billions in investments each year, and that success has made it a go-to destination for startups looking to make their mark. Top U.K. companies like Hopin, Moneybox and Marshmallow have already made the city their base, yet fintech is just the beginning. From healthtech to renewable energy to cybersecurity and beyond, the city's startups are breaking new ground across a range of industries.
