ReflectionAI

Research Program Manager - Research Infrastructure

In-Office
London, Greater London, England, GBR
Senior level
The Research Program Manager will oversee cross-functional programs focused on enhancing training infrastructure, ensuring reliability, and facilitating effective communication among multiple teams in a fast-paced environment.
The summary above was generated by AI
Our Mission

Reflection’s mission is to build open superintelligence and make it accessible to all.

We’re developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond.

About the Role

Research Program Managers at Reflection are high-leverage leaders and operators who embed directly with research and infrastructure teams to accelerate the pace of frontier model development. They are not project trackers. They are force multipliers who bring clarity to ambiguity, drive decisions when the path forward is unclear, and ensure that the work happening across multiple teams connects into a coherent whole.

This role focuses on scaling our research infrastructure to support massive, frontier-scale training runs across pre-training, mid-training, and post-training. You will work closely with teams building on training libraries like Megatron, driving the programs that turn raw clusters into reliable, high-performance training environments. Your job is to make sure the infrastructure we build works end-to-end, that teams are unblocked, and that we can scale with confidence as our ambitions grow.

You bring a first-responder mentality. When things go sideways, you don't wait to be asked. You jump in, assess the situation, cut through noise, align the people who need to be aligned, and drive resolution.

What You'll Do
  • Own cross-functional programs spanning training infrastructure and cluster reliability across pre-training, mid-training, and post-training workstreams.

  • Drive end-to-end coordination of efforts to scale our training stack, working alongside engineering leads and external partners.

  • Jump into active incidents and escalations to triage, coordinate response, and drive resolution across teams. Champion a culture of blameless post-mortems and continuous learning, turning every incident into a concrete improvement to our systems and processes.

  • Partner with infrastructure and research engineering leads to identify bottlenecks, define priorities, and ensure that infrastructure investments are directly tied to research velocity.

  • Build and maintain visibility into training run health, cluster reliability, and infrastructure performance so that leadership and teams have the context they need to make fast, informed decisions.

  • Create lightweight, durable processes for cross-team handoffs, config management, checkpoint workflows, and other coordination-heavy touchpoints that currently rely on ad hoc communication.

  • Translate technical complexity into clear status updates and decision frameworks for engineering leadership and executives.

About You
  • 7+ years of experience in technical program management, research operations, or infrastructure coordination, ideally in ML/AI or large-scale distributed systems environments.

  • Technical knowledge deep enough to engage with engineers on topics like distributed training frameworks, GPU cluster architecture, scheduler behavior, networking, and storage systems. You don't need to write the code, but you need to understand the systems well enough to “speak the language”, i.e., to ask the right questions and identify risks early.

  • Proven ability to operate effectively in high-ambiguity, fast-moving environments. You create structure where there is none and drive clarity without waiting for permission.

  • Track record of managing complex, multi-team programs with competing priorities and hard deadlines. You know how to make tradeoffs and you communicate them clearly.

  • Strong stakeholder management skills across both deeply technical ICs and senior leadership. You build trust by being reliable, direct, and well-informed.

  • Comfortable operating in crisis mode. You stay calm under pressure, you know how to prioritize when everything is on fire, and you follow through on the other side.

  • Excited to build from zero to one. We are a small, fast-moving team, and this role will help define how research program management works at Reflection.

  • Motivated by enabling researchers and engineers to build the world's most capable open-weight AI systems.

What We Offer

We believe that to build superintelligence that is truly open, you need to start at the foundation. Joining Reflection means building from the ground up as part of a small, talent-dense team. You will help define our future as a company and the frontier of open foundational models.

We want you to do the most impactful work of your career with the confidence that you and the people you care about most are supported.

  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally.

  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance.

  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning.

  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time.

  • Opportunities to connect with teammates: Lunch and dinner are provided daily. We have regular off-sites and team celebrations.


