
DeepMind

Software Engineer, Gemini Deployment, London

Posted Yesterday
In-Office
London, Greater London, England
Mid level
As a Software Engineer, you will optimize and deploy multimodal large language models, collaborate with researchers, and improve model serving infrastructure at Google DeepMind.
The summary above was generated by AI
Snapshot

We are searching for a talented engineer passionate about bridging the gap between research and production for cutting-edge AI models. In this role, you'll play a key part in accelerating the hyperscale, low-latency deployment of Google DeepMind's Gemini models, across text, audio, image and video, and onto various product surfaces within Google. Your work will involve productionizing, optimizing and serving models, collaborating with research teams to ensure models are production-ready, and identifying ways to streamline the entire research-to-production process. This is a unique opportunity to directly impact the speed and efficiency with which Google delivers innovative AI-powered products and features to users.

About Us

Artificial Intelligence could be one of humanity’s most useful inventions. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.

The Role

In this role, you will be at the forefront of bringing cutting-edge AI research to life. You'll work directly with researchers and engineers to optimize and deploy multimodal large language models (LLMs) like Gemini onto Google's serving infrastructure, impacting users across a diverse range of applications and accelerating research projects. This involves a blend of technical expertise and collaborative problem-solving to ensure both efficiency and quality throughout the entire LLM development and deployment lifecycle.

Key responsibilities:
  • Bridge the infrastructure gap between research and production: Collaborate closely with research teams to understand next generation modeling approaches, ensuring they are designed and implemented with production considerations in mind.
  • Optimize the serving environment: Work with other infrastructure teams to deliver serving infrastructure designed for maximum efficiency and performance, addressing bottlenecks in speed and scale.
  • Design and implement novel, high-performance serving techniques (e.g., continuous batching, speculative decoding, request-level scheduling) for maximum throughput and efficiency.
  • Streamline the deployment process: Identify opportunities to automate tasks, eliminate redundancies, and improve the overall velocity of model releases.
  • Develop expertise in model serving technologies: Gain a deep understanding of serving frameworks, preprocessing pipelines, caching mechanisms, and other relevant technologies.
  • Identify the best hardware setup for deploying a diverse set of models: Conduct deep performance profiling and improve efficiency of ML model serving on hardware accelerators.
  • Stay informed on industry trends: Continuously learn about new technologies and best practices in the field of AI research and deployment.

This is a chance to make a real difference in the way Google develops and deploys AI, directly impacting the speed and effectiveness with which we deliver innovative solutions to users.

About You

In order to set you up for success as a Software Engineer at Google DeepMind, we look for the following skills and experience:

  • Interpersonal skills, such as discussing technical ideas effectively with colleagues and collaborating with other roles
  • Excellent knowledge of either C++ or Python
  • Experience with deployment in production environments
  • Experience with developing serving infrastructure
  • Familiarity or experience with optimisation of distributed ML systems
  • Familiarity with modern hardware accelerators (GPU / TPU)

Deadline - 12th December 2025

At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.


  

Top Skills

C++
GPU
Python
TPU
