fanvue

Site Reliability Engineer

Reposted 24 Days Ago

Remote

Hiring Remotely in UK

Expert/Leader

The Site Reliability Engineer will enhance platform reliability, scalability, and performance, focusing on AWS infrastructure and Aurora PostgreSQL management.

Join us in redefining the creator economy with AI

Fanvue is one of the fastest-growing creator monetisation platforms globally. We’re an AI-powered, creator-first platform helping creators connect, engage, and earn directly from their audiences at scale. Following our recent Series A, Fanvue has surpassed $100M in annual recurring revenue with triple-digit year-on-year growth, supporting hundreds of thousands of creators and millions of fans worldwide.

Reliability at Fanvue is a growth enabler. This role exists to ensure our systems are predictable, scalable, and resilient, so product teams can ship fast without compromising uptime, performance, or creator trust.

🎯 The Role

We’re hiring a Site Reliability Engineer to strengthen Fanvue’s platform reliability and infrastructure foundations. You’ll work closely with Platform and Product Engineering teams to design, operate, and evolve the systems that keep Fanvue fast, available, and safe as we scale.

This is a hands-on role focused on infrastructure, observability, automation, and operational excellence, with real ownership of production systems.

🚀 What You’ll Do

Design, build, and operate reliable infrastructure across Fanvue’s cloud environment
Own and improve observability, monitoring, and alerting for critical services
Reduce operational toil through automation, tooling, and infrastructure as code
Partner with engineering teams to improve reliability, scalability, and deployment safety
Lead incident response for infrastructure issues and drive high-quality post-incident reviews
Define and track SLOs, SLIs, and error budgets to balance reliability with delivery speed
Improve CI/CD reliability and rollout practices to reduce risk
Contribute to disaster recovery, backup, and resilience planning
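
To make the SLO and error-budget item above concrete, here is a minimal Python sketch of the underlying arithmetic. The 99.9% availability target, the 30-day window, and the function names are assumptions chosen for illustration only; they are not Fanvue's actual targets or tooling.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime in minutes for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for a hypothetical 99.9% target.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 21.6 minutes of downtime, about half the budget is left.
    print(round(budget_remaining(0.999, 21.6), 2))
```

A team might treat a low or negative remaining budget as a signal to slow risky rollouts and prioritise reliability work, and a healthy budget as room to ship faster.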

👀 Who You Are

Strong experience as an SRE, infrastructure engineer, or platform engineer
Comfortable operating production systems at scale
Experience with cloud platforms and distributed systems
Strong background in observability, monitoring, and incident management
Comfortable writing automation and infrastructure code
Calm, clear communicator during incidents and escalations
High ownership mindset with a bias toward long-term reliability improvements

✨ You’ll Thrive Here If

You care deeply about system reliability and predictability
You enjoy preventing problems more than reacting to them
You like partnering with product engineers rather than acting as a gatekeeper
You’re comfortable owning on-call responsibilities
You value learning through incidents and continuous improvement

⚠️ You’ll Struggle Here If

You prefer reactive firefighting over proactive reliability work
You’re uncomfortable with operational responsibility
You avoid incident ownership or post-incident accountability
You need heavy process to operate effectively

🌍 Why Join Fanvue?

Own reliability for a $100M+ ARR platform
Enable teams to ship faster with confidence
Work on complex, real-world scaling challenges
Competitive salary and benefits package
Unlimited holiday
Remote working
Flexible hours, according to when you perform best
Budget for growth and wellbeing

⭐ Fanvue is for Everyone

We believe diverse teams build better products. Even if you do not meet every single requirement listed, we encourage you to apply. Many great people grow into parts of a role, and we value potential, mindset, and ambition just as much as experience.

Top Skills

Aurora PostgreSQL

AWS

AWS CDK

AWS CloudWatch

DynamoDB

ElastiCache Redis

RDS

TypeScript

London, United Kingdom
