Roku

Senior Software Engineer - Cloud Infrastructure & Observability

In-Office
Cambridge, Cambridgeshire, England
Senior level
Lead the architecture and evolution of Roku's observability and cloud infrastructure stack, ensuring high performance and reliability for multi-cloud services.
The summary above was generated by AI
Teamwork makes the stream work.
Roku is changing how the world watches TV

Roku is the #1 TV streaming platform in the U.S., Canada, and Mexico, and we've set our sights on powering every television in the world. Roku pioneered streaming to the TV. Our mission is to be the TV streaming platform that connects the entire TV ecosystem. We connect consumers to the content they love, enable content publishers to build and monetize large audiences, and provide advertisers unique capabilities to engage consumers.

From your first day at Roku, you'll make a valuable - and valued - contribution. We're a fast-growing public company where no one is a bystander. We offer you the opportunity to delight millions of TV streamers around the world while gaining meaningful experience across a variety of disciplines.


About the Role 

We are building a next-generation observability and cloud platform that is high-performance, cost-efficient, secure, and scalable across multi-region, multi-cloud clusters. You will lead the architecture and evolution of Roku’s observability and cloud infrastructure stack. This includes metrics, logs, traces, telemetry pipelines, service mesh, developer experience, and reliability of systems that power thousands of services and millions of devices. 

You will drive a vision where developers gain deep visibility with minimal overhead, onboarding is seamless, and insights are available in real time. Your work will directly help Roku scale efficiently while maintaining reliability, cost control, and performance. 


What You’ll Be Doing 
  • Architect and lead Roku’s observability platform across metrics, logs, and traces; evolve data pipelines and storage layers optimized for high throughput, performance, and cost at Roku scale (TSDBs, Parquet, distributed processing). 
  • Extend and harden open‑source observability systems; overhaul core components (e.g., storage layers, query paths) to improve performance, reliability, and usability at scale. 
  • Implement features such as pre‑aggregation, down-sampling, and sampling to reduce load and accelerate queries across the platform. 
  • Collaborate across platform, SRE, and product teams to migrate hundreds of workloads to our common platform; augment and automate CI/CD flows and onboarding. 
  • Integrate security into infrastructure and platform services; ensure robust multi‑tenant, multi‑cluster, and multi‑cloud designs. 
  • Contribute improvements back to open source and CNCF‑aligned projects; shape standards adoption (OpenTelemetry, OpenMetrics) across the company. 
  • Mentor engineers; establish best practices for reliability, efficiency, and cost management across service mesh and observability domains. 

What You'll Have
  • Extensive software engineering experience, with a track record of architecting distributed systems or platforms at scale.
  • Strong hands‑on experience in Golang and one scripting language (e.g., Python or Shell). 
  • Experience operating observability platforms at PB-scale ingestion and hundreds of millions of series.
  • Expertise in observability platforms and tooling (Prometheus, Grafana, Loki, Tempo, ELK/OpenSearch, ClickHouse) and standards (OpenTelemetry, OpenMetrics).  
  • Deep experience building systems at scale and operating cloud infrastructure with Kubernetes; strong proficiency with service mesh technologies (Istio/Envoy) and infrastructure‑as‑code (Terraform); experience in multi‑cloud environments (AWS, GCP).
  • Demonstrated ability to evolve storage and query architectures for cost, scale, and latency (e.g., TSDB, Parquet, distributed processing). 
  • Proven experience integrating security as part of infrastructure and platform development.   
  • Exceptional cross‑functional communication; effective collaboration with both technical and non‑technical stakeholders. 
  • Culture fit: independent thinker, pragmatic problem‑solver, low‑ego collaborator who moves fast and focuses on company success. 
  • Experience integrating AI tools to improve processes and reduce toil. 
  • Open‑source contributions to CNCF projects are preferred but not mandatory.

Our Hybrid Work Approach

Roku fosters an inclusive and collaborative environment where teams work in the office Monday through Thursday. Fridays are flexible for remote work, except for employees whose roles are required to be in the office five days a week or employees who are in offices with a five-day in-office policy.


Benefits

Roku is committed to offering a diverse range of benefits as part of our compensation package to support our employees and their families. Our comprehensive benefits include global access to mental health and financial wellness support and resources. Local benefits include statutory and voluntary benefits which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension). Our employees can take time off work for vacation and other personal reasons to balance their evolving work and life needs. It's important to note that not every benefit is available in all locations or for every role. For details specific to your location, please consult with your recruiter.


Accommodations

Roku welcomes applicants of all backgrounds and provides reasonable accommodations and adjustments in accordance with applicable law. If you require reasonable accommodation at any point in the hiring process, please direct your inquiries to [email protected].


The Roku Culture

Roku is a great place for people who want to work in a fast-paced environment where everyone is focused on the company's success rather than their own. We try to surround ourselves with people who are great at their jobs, who are easy to work with, and who keep their egos in check. We appreciate a sense of humor. We believe a smaller number of very talented people can do more, at lower cost, than a larger number of less talented ones. We're independent thinkers with big ideas who act boldly, move fast, and accomplish extraordinary things through collaboration and trust. In short, at Roku you'll be part of a company that's changing how the world watches TV.

We have a unique culture that we are proud of. We think of ourselves primarily as problem-solvers, which itself is a two-part idea. We come up with the solution, but the solution isn't real until it is built and delivered to the customer. That penchant for action gives us a pragmatic approach to innovation, one that has served us well since 2002. 

To learn more about Roku, our global footprint, and how we've grown, visit https://www.weareroku.com/factsheet.

By providing your information, you acknowledge that you want Roku to contact you about job roles, that you have read Roku's Applicant Privacy Notice, and understand that Roku will use your information as described in that notice. If you do not wish to receive any communications from Roku regarding this role or similar roles in the future, you may unsubscribe at any time by emailing [email protected].

Top Skills

AWS
ClickHouse
ELK
GCP
Go
Grafana
Kubernetes
Loki
OpenMetrics
OpenSearch
OpenTelemetry
Prometheus
Python
Shell
Tempo
Terraform
