Thales

Senior Site Reliability Engineer / Cloud Operations Engineer (m/f/d)

Posted 23 Days Ago
In-Office
Berlin
Senior level
Operate and maintain highly available sovereign cloud services (99.99%+). Monitor SLIs/SLOs, troubleshoot complex incidents, participate in 24/7 on-call rotation, drive automation, document runbooks, perform post-incident reviews, and ensure compliance for secure cloud environments leveraging Google Cloud technologies.
The summary above was generated by AI
Location: Berlin, Germany

We Say HI* 

Site Reliability Engineer / Cloud Operations Engineer (f/m/d)  
 

Companies and public administrations in Germany are ready to accelerate their digital transformation and their use of AI, but they will never compromise on the security of their most sensitive data. This is where Thales in Germany comes into play, in partnership with Google Cloud and through a new company that is currently being established. With a new, 100% German business unit, we are providing a concrete response to the strict requirements of the BSI. We are creating a “Trusted Cloud” that is operated locally and fully autonomously. It provides access to the broadest service portfolio on the market, while everything remains strictly under European jurisdiction. By combining German and French standards such as SecNumCloud, C5 and C3-A, we offer our customers unequaled resilience and business continuity. This is a turning point for our industry and a decisive step towards a strong, sovereign digital Europe.

Your mission as Site Reliability Engineer: 

  • Operate and maintain mission-critical sovereign cloud services with availability targets of 99.99% and above.

  • Monitor service health, reliability, scalability, latency, and performance using Service Level Indicators (SLIs) and Service Level Objectives (SLOs).

  • Investigate, troubleshoot, and resolve complex production incidents across large-scale distributed cloud environments.

  • Participate in a structured 24/7 on-call rotation (approximately one week every six weeks) to ensure continuous service availability.

  • Collaborate with Site Reliability Engineers, Cloud Infrastructure Specialists, and Product Experts across international teams to mitigate incidents and drive long-term solutions.

  • Build a deep understanding of Google's cloud technologies and distributed systems through an intensive training program covering technologies such as Borg, Colossus, Spanner, and other core GCP components.

  • Drive operational excellence by creating and maintaining technical documentation, standardizing incident response procedures, and continuously improving operational playbooks.

  • Lead and contribute to post-incident reviews, root cause analyses, and the implementation of preventive measures to improve platform reliability.

  • Identify opportunities for automation and contribute to improving operational efficiency, scalability, compliance, and service reliability.

  • Support the operation of highly secure cloud environments designed to meet stringent regulatory and sovereignty requirements.

What we are looking for:

  • Several years of experience in Site Reliability Engineering, Cloud Operations, DevOps, Platform Engineering, Infrastructure Engineering, Production Support, Network Operations (NOC), Technical Operations, or a comparable role.

  • Experience operating and supporting business-critical production systems with demanding uptime and availability requirements.

  • Strong troubleshooting and incident management skills in complex technical environments.

  • Experience monitoring, operating, and maintaining distributed systems, cloud platforms, infrastructure services, or large-scale applications.

  • Familiarity with reliability engineering concepts, observability, monitoring, alerting, incident response, and root cause analysis.

  • Experience working with automation, scripting, operational tooling, or Infrastructure-as-Code approaches.

  • Strong analytical and problem-solving skills with a structured and methodical approach.

  • Professional proficiency in both German and English.

  • Willingness to participate in a regular on-call rotation.

  • Curiosity, adaptability, and a strong desire to learn and work with hyperscale cloud technologies.

 
The Group invests more than €4.5 billion per year in Research & Development in key areas, particularly for critical environments, such as Artificial Intelligence, cybersecurity, quantum and cloud technologies.

In 2025, the Group generated sales of €22.1 billion. 
 
For our more than 85,000 employees in 65 countries, we open up visionary perspectives, realise individual career paths and enable creative freedom. We do this with courage, versatility and the firm intention to meet the demanding challenges of our time in a safer and more inclusive way. With our sustainable, value-focused management, we actively support diversity.

Say HI* – Your journey to us 

In times of change, our international teams are ready to meet the complexity of today with the industry-leading technologies of tomorrow. Will you be part of it? Your Talent Acquisition contact, Andre Fuhrmann, is looking forward to your online application.

Andre Fuhrmann – Talent Acquisition Partner  

+49 7156 / 302-22002 

*Human Intelligence 

