Cloudflare

Senior Systems Engineer, Workers AI

Reposted 9 Days Ago
Hybrid
London, Greater London, England, GBR
Senior level
Design and build the AI inference infrastructure for Cloudflare, optimizing systems for high availability and performance while mentoring junior engineers.
The summary above was generated by AI
Available Locations: Austin, TX or London, UK (Hybrid)
About the role
You'll design and build the core infrastructure that powers AI inference across Cloudflare's global network - real-time voice, frontier open LLMs, and customer-deployed models running on a heterogeneous fleet of GPUs and next-generation accelerators in hundreds of cities worldwide. Working alongside AI/ML engineers, hardware partners, and Cloudflare product teams, you'll solve hard problems in distributed systems and high-performance computing: sub-second model cold starts, multi-accelerator workload scheduling, efficient KV cache management, and a model deployment platform serving both Cloudflare and customers bringing their own models. We're building an AI inference platform embedded in the fabric of the internet - something that doesn't exist yet - and this role puts you at the center of it. We're looking for high-agency systems engineers who are energized by foundational infrastructure problems and want to define how AI runs at the edge of the network.
Role Responsibilities
  • Develop and maintain core components of the serverless inference platform to ensure high availability and scalability for Cloudflare users.
  • Optimize the model scheduling system to significantly increase efficiency and resource utilization across our inference infrastructure.
  • Implement improvements to the inference request routing logic to enhance overall performance and reduce latency for end-users.
  • Drive significant, measurable improvements in the platform's reliability and resilience by identifying and mitigating systemic risks.
  • Expand and refine the observability stack, including metrics, logging, and tracing, and fine-tune alerts to proactively identify and resolve production issues.
  • Lead complex, cross-functional technical projects from initial concept and design through final deployment and operationalization.
  • Act as a mentor to junior engineers and actively contribute to cultivating a strong, collaborative engineering culture within the team.
Role Requirements
Must-Have Skills
  • Experience in systems engineering, with a focus on distributed, high-performance systems.
  • Expert proficiency in Rust programming, particularly in an asynchronous environment.
  • Deep understanding and hands-on experience with relevant networking and application protocols (e.g., TCP, HTTP, WebSocket).
  • Experience with scaling and performance optimization techniques, including load balancing and caching in a distributed environment.

Nice-to-Have Skills
  • Demonstrable experience with container orchestration platforms, specifically Kubernetes and/or Nomad.
  • Familiarity with the challenges and architectures involved in large-scale inference serving (e.g., LLM and diffusion models).

Top Skills

HTTP
Kubernetes
Nomad
Rust
TCP
WebSocket

Cloudflare London, England Office

6th Floor, The Riverside Building, County Hall, Belvedere Rd, London, United Kingdom, SE1 7PB
