NatWest Group

Site Reliability Engineer

Posted 2 Days Ago
In-Office
Edinburgh, City of Edinburgh, Scotland
Mid level
The Site Reliability Engineer will enhance cloud-native platform reliability, implement SRE practices, and support operational excellence through automation and collaboration with engineers.
The summary above was generated by AI

Join us as a Site Reliability Engineer

  • In this role, you’ll support improvements to availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning for our products and services
  • You’ll enjoy significant stakeholder interaction, working in collaboration with engineers to ensure a principled approach to delivering change in a safe and secure way
  • This is a chance to join an inclusive team with a collaborative ethos and a commitment to innovation and professional development
  • You’ll need to have the flexibility to support the team by working shifts and weekends on rotation

What you'll do

As our Site Reliability Engineer, you’ll contribute to the reliability, monitoring, and operational excellence of cloud-native platforms. You’ll work closely with senior engineers to support production systems, implement Site Reliability Engineering (SRE) practices, and ensure services are observable, scalable, and resilient. You’ll also participate in the 24/7 support and on-call rotation, gaining experience in incident response and platform operations.

In this role, we’ll also expect you to help operate AWS-based Kubernetes platforms (EKS) and to contribute to monitoring, alerting, and observability implementations using tools like Grafana and Prometheus. You’ll also assist in incident management, troubleshooting, and root cause analysis.

In addition, you’ll be:

  • Participating in on-call rotations and production support activities
  • Implementing infrastructure changes using Terraform and GitOps workflows
  • Supporting continuous integration and continuous delivery (CI/CD) pipelines and deployment processes using GitLab and Argo CD
  • Helping improve system reliability through automation and operational improvements
  • Following SRE practices such as runbooks, documentation, and post-incident reviews
  • Working with DevOps and engineering teams to improve system performance and stability
  • Ensuring solutions align with security, compliance, and operational standards

The skills you'll need

We’re looking for an engineer with solid foundational experience in cloud platforms and a keen interest in reliability engineering and production operations. You must have experience working with AWS and Kubernetes (EKS) in a production or pre-production environment, along with familiarity with monitoring and observability tools such as Grafana and Prometheus. To succeed in this role, you should also have a good understanding of CI/CD pipelines and Git-based workflows, with GitLab preferred.

You'll also need:

  • Exposure to Terraform or infrastructure-as-code concepts
  • A basic understanding of SRE practices and production support models
  • Experience troubleshooting applications or infrastructure issues
  • An awareness of networking and security fundamentals in cloud environments
  • A willingness to participate in on-call rotations and incident response
  • A strong problem-solving mindset and an eagerness to learn
  • Good communication and collaboration skills

Hours

35

Job Posting Closing Date:

03/06/2026

Ways of Working: Remote First

NatWest Group London, England Office

250 Bishopsgate, London, United Kingdom, EC2M 4AA
