Keyrock

SRE - Site Reliability Engineer

Reposted 20 Days Ago

In-Office or Remote

Hiring Remotely in London, Greater London, England, GBR

Senior level

The SRE will design, implement, and manage cloud infrastructure focused on reliability, security, and automation, particularly in AWS and Kubernetes environments.

The summary above was generated by AI

About Keyrock

Since our beginnings in 2017, we've grown to be a leading change-maker in the digital asset space, renowned for our partnerships and innovation.

Today, we rock with over 240 team members around the world. Our diverse team hails from 45 nationalities, with backgrounds ranging from DeFi natives to PhDs. Predominantly remote, we have hubs in London, Brussels, Singapore and Paris, and host regular online and offline hangouts to keep the crew tight.

We are trading on more than 80 exchanges, and working with a wide array of asset issuers. As a well-established market maker, our distinctive expertise led us to expand rapidly. Today, our services span market making, options trading, high-frequency trading, OTC, and DeFi trading desks.

But we’re more than a service provider. We’re an initiator. We're pioneers in adopting the Rust programming language for our algorithmic trading, and champions of its use in the industry. We support the growth of Web3 startups through our Accelerator Program. We upgrade ecosystems by injecting liquidity into promising DeFi, RWA, and NFT protocols. And we push the industry's progress with our research and governance initiatives.

At Keyrock, we're not just envisioning the future of digital assets. We're actively building it.

Mission statement

Central Infrastructure Team

CIT at Keyrock is responsible for keeping all production systems running smoothly by leveraging cloud technologies and platform automation to design, build and maintain highly reliable and scalable systems.

We focus on continuously improving the reliability of scalable software systems in accordance with the Keyrock internal architecture.

As SREs on the Central Infrastructure Team, we help the Engineering team balance speed and stability at scale.

Job description

We are looking for an experienced Site Reliability Engineer to join our Central Infrastructure Team, focusing on AWS, Kubernetes, and modern DevSecOps best practices.
This role involves designing, implementing, and maintaining highly scalable and resilient cloud infrastructure to support our trading operations.

The ideal candidate will have a strong background in cloud computing, automation, and CI/CD pipelines, ensuring high availability and performance for mission-critical systems.

Key Responsibilities

Cloud Infrastructure Management: Design, deploy, and maintain scalable and resilient infrastructure on AWS using Infrastructure-as-Code (IaC).
Kubernetes Administration: Manage and optimize Kubernetes clusters for containerized applications, ensuring high availability and security.
Automation & CI/CD: Implement and manage CI/CD pipelines for efficient deployment, testing, and monitoring of applications.
Observability & Monitoring: Develop comprehensive monitoring solutions using Prometheus, Grafana, LGTM stack, or similar tools to improve system reliability.
Security & Compliance: Apply best practices for cloud security, IAM policies, and compliance frameworks (SOC2, ISO 27001, etc.).
Incident Response & Performance Optimization: Troubleshoot issues, perform root cause analysis, and implement fixes to optimize performance.
Infrastructure as Code (IaC): Utilize Terraform, Ansible, or similar tools to automate infrastructure provisioning and configuration management.
Collaboration & Knowledge Sharing: Work closely with software engineering, architecture and security teams to promote DevOps culture and best practices.
Disaster Recovery & Reliability Engineering: Design failover and backup strategies to ensure business continuity in the event of failures.

Background and experience

Bachelor’s degree in Computer Science, Engineering, or a related field.
5+ years of experience in cloud infrastructure, SRE, or DevOps roles.
Interest in, or any exposure to, trading or similar themes is desirable (not essential).
AWS Certified SysOps Administrator - Associate: desired.

Competences and personality

Strong expertise in AWS (EC2, S3, Lambda, RDS, VPC, IAM, etc.).
Hands-on experience with Kubernetes (EKS, K3s, or self-managed clusters).
Proficiency in scripting and automation using Python, Bash, or similar.
Experience with Infrastructure as Code (Terraform, CloudFormation, or Ansible).
Familiarity with monitoring, logging, and observability tools (Prometheus, Grafana, Datadog, etc.).
Strong understanding of networking concepts (VPC, Load Balancers, DNS, Firewalls).
Experience with DevOps methodologies, CI/CD pipelines, and GitOps practices.
Experience with high-performance and low-latency (sub-millisecond) systems.
Familiarity with serverless architectures and event-driven computing.
Familiarity with Rust compilation processes and techniques.
Willing to collaborate and communicate asynchronously.
Proactive in finding opportunities to improve system availability and performance by applying lessons learned from monitoring and observation.
Team spirit, ownership, critical thinking
Exposure to cloud cost optimization and FinOps strategies.
Previous experience working with crypto, traditional finance (TradFi) or trading is highly desirable but not essential.

Our recruitment philosophy

We value self-awareness and powerful communication skills in our recruitment process. We seek fiercely passionate people who understand themselves and their career goals. We're after those with the right skills and a conscious choice to join our field. The perfect fit? A crypto enthusiast who’s driven, collaborative, acts with ownership and delivers solid, scalable outcomes.

Our offer

A competitive salary package, with various benefits depending on method of engagement (Employee vs Contractor)

Autonomy in your time management thanks to flexible working hours and the opportunity to work remotely

The freedom to create your own entrepreneurial experience by being part of a team of people in search of excellence
Continuing Professional Development plan with learning and certification path in accordance with both the team objectives and areas of interests

As an employer we are committed to building a positive and collaborative work environment. We welcome employees of all backgrounds, and hire, reward and promote entirely based on merit and performance.

Due to the nature of our business and external requirements, we perform background checks on all potential employees; passing these checks is a prerequisite for joining Keyrock.

https://keyrock.com/careers/

Top Skills

Ansible

AWS

Bash

Grafana

Kubernetes

Prometheus

Python

Terraform

70 Gracechurch Street, London, United Kingdom, EC3V 0XL
