Lead Site Reliability Engineer

GetGround

Reposted 9 Hours Ago

Hybrid

London, Greater London, England, GBR

Senior level

Lead the design and maintenance of cloud infrastructure, focusing on scalability and reliability while mentoring engineering teams and implementing best practices.

The summary above was generated by AI

London, Waterloo (Hybrid, 4 days in-office - Wednesday is our set work from home day, though you can come in on Wednesday too if you wish)

We are disrupting property, one of the world's largest asset classes. With £2Bn+ in assets on our platform and 30,000+ users across 70 countries, we're building the future of asset ownership and, in doing so, are able to address wealth inequality.

Our product simplifies property investing from start to finish, making real estate investment accessible to everyone.

What you'll love doing:

Working in cross-functional product teams, taking infrastructure and reliability initiatives from concept to production.
Navigating ambiguity in a fast-moving environment where ownership and freedom are core to how we operate.
Building and maintaining robust, scalable infrastructure across our GCP cloud environment. Working with Kubernetes, Terraform, Cloudflare, and modern observability tooling to ensure our platform runs smoothly.
Collaborating closely with engineering teams to design CI/CD pipelines, improve deployment practices, and champion reliability as a core engineering principle.
Helping to define SRE practices for a high-growth fintech platform. Mentoring other engineers as we scale our teams and impact.

What you'll be doing:

Designing, implementing, and maintaining our cloud infrastructure on Google Cloud Platform (GCP), ensuring scalability, reliability, and security.
Owning our Kubernetes clusters and containerization strategy - from Docker image optimization to cluster management and deployment orchestration.
Building and evolving our Infrastructure as Code using Terraform, creating modular, testable, well-documented configurations that scale with our rapid growth.
Managing and optimizing our Cloudflare infrastructure, including Workers for edge computing, DNS, CDN, security policies, and performance optimization.
Deploying AI-powered product features in isolated, secure serverless environments.
Implementing comprehensive monitoring and observability using Prometheus and Grafana, defining SLIs/SLOs, and proactively identifying issues before they impact users.
Designing and maintaining CI/CD pipelines with appropriate quality gates, testing strategies, and deployment techniques (blue-green, canary) to enable fast, safe releases.
Ensuring security best practices across our infrastructure - from network design and access controls to secrets management and vulnerability scanning.
Working with engineering teams to improve application reliability, performance, and observability through instrumentation and architectural guidance.
Enabling developer productivity through self-service tooling, clear documentation, and automation of operational tasks.

What we're looking for:

Essential:

5+ years in SRE, DevOps, or platform engineering roles with production-grade infrastructure experience
Strong hands-on experience with Google Cloud Platform (GCP)
Expert-level knowledge of Kubernetes and Docker - you've deployed, managed, and troubleshot production clusters
Proficiency in Terraform for infrastructure as code
Experience with Cloudflare services, including Workers, DNS, CDN, and security features
Experience implementing and managing observability stacks with Prometheus and Grafana
Strong understanding of CI/CD principles, pipeline design, and deployment strategies
Experience with cloud networking, security groups, VPCs, and network peering
Solid scripting skills (Shell, Python, or similar)

Desirable:

Experience with blue-green or canary deployment techniques
Familiarity with programming languages like Go or TypeScript
Background in implementing security automation and quality gates
Experience with configuration management tools
Understanding of SRE principles: SLIs, SLOs, error budgets, and blameless postmortems
Experience with edge computing and serverless architectures
Track record of mentoring engineers and fostering a culture of reliability

What we are building: The first end-to-end real estate investment offering - making the dream of owning real estate more accessible to everyone globally.

Diversity & inclusion at GetGround: We encourage applications from all sections of society and we believe in the criticality of an inclusive culture. We are committed to equal employment opportunity regardless of race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity or any other basis as protected by law.

42% of our employees identify as female or non-specified, 58% as male
22 nationalities represented across offices in 5 countries
Our work on Design Accessibility
Inclusion is at the heart of our culture - we celebrate and reflect on key D&I and cultural events such as Black History Month, International Women's Day, and Pride

For more information on how we store your candidate data, please see our recruitment privacy policy.

1 Lyric Square, Hammersmith, London, United Kingdom, W6 0NB
