Writer

Site reliability engineer (UK)

Posted 8 Days Ago
London, Greater London, England
Senior level
The Site Reliability Engineer will lead the design, implementation, and maintenance of Writer’s cloud infrastructure, ensuring high availability and performance. Responsibilities include cloud automation, infrastructure provisioning using Terraform and Python, optimizing resources, maintaining monitoring systems, and providing mentorship to junior engineers.

✍🏽 About Writer

Writer is the full-stack generative AI platform delivering transformative ROI for the world’s leading enterprises. Named one of the top 50 companies in AI by Forbes and one of the best places to work by Inc. Magazine, Writer empowers hundreds of customers like Accenture, Intuit, L’Oreal, Mars, Salesforce, and Vanguard to transform the way they work.

Writer’s fully integrated solution makes it easy to deploy secure and reliable AI applications and agents that solve mission-critical business challenges. Our suite of development tools is powered by Palmyra, Writer’s state-of-the-art family of LLMs, alongside our industry-leading graph-based RAG and customizable AI guardrails.

Founded in 2020 with office hubs in San Francisco, New York City, Austin, Chicago, and London, our team of over 250 employees thinks big and moves fast, and we’re looking for smart, hardworking builders and scalers to join us on our journey to create a better future of work.

📐 About this role 

We are looking for a foundational member of the Cloud Infrastructure team at Writer. This role will involve contributing to the development and implementation of our Site Reliability Engineering (SRE) program. The ideal candidate will ensure the reliability, scalability, performance, and security of Writer’s critical systems, taking a proactive approach to guarantee that our high-ROI products reach our customers seamlessly.
🦸🏻‍♀️ Your responsibilities:

  • Lead the design, implementation, and maintenance of Writer, Inc.’s cloud infrastructure to ensure high availability and performance

  • Design and implement scalable cloud automation to support seamless deployment for our largest enterprise customers

  • Automate infrastructure provisioning and management using Terraform & Python

  • Collaborate with development teams to optimize cloud resources and enhance system reliability

  • Develop and maintain monitoring and alerting systems to proactively identify and resolve issues affecting the reliability of our writing solutions

  • Conduct post-mortem analyses of system failures to identify root causes and implement preventive measures

  • Optimize and scale our cloud infrastructure to support growing user demand and ensure cost efficiency

  • Ensure the security and compliance of our systems, adhering to industry standards and regulations

  • Provide mentorship and technical guidance to junior engineers, fostering a culture of reliability and continuous improvement

  • Stay current with emerging technologies and industry trends to continuously improve our site reliability practices

⭐ Is this you? 

  • Proven expertise in Site Reliability Engineering with a minimum of 7 years of hands-on experience

  • Deep understanding of system architecture and infrastructure design to ensure high availability and performance

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field

  • Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring

  • Experience with cloud platforms like AWS, Azure, or GCP, and their respective services for scalable and resilient systems

  • Expertise in containerization technologies (e.g., Docker, Kubernetes) and orchestration tools

  • Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) to maintain system health and performance

  • Ability to lead and mentor junior engineers in best practices for reliability and system optimization

  • Excellent communication skills to collaborate effectively with cross-functional teams and stakeholders

  • Proactive approach to identifying and mitigating potential system failures and performance bottlenecks

  • Preferred Skills & Experience:

    • Software engineering expertise

    • Terraform

    • Python

    • Kubernetes

    • Scala

    • AWS/GCP

Curious to learn more about who we are and how we operate? Visit us here

🍩 Benefits & perks

  • Generous PTO, plus company holidays

  • Medical, dental, and vision coverage for you and your family

  • Paid parental leave for all parents (12 weeks)

  • Fertility and family planning support

  • Early-detection cancer testing through Galleri

  • Flexible spending account and dependent FSA options

  • Health savings account for eligible plans with company contribution

  • Annual work-life stipends for:

    • Home office setup, cell phone, internet

    • Wellness stipend for gym, massage/chiropractor, personal training, etc.

    • Learning and development stipend

  • Company-wide off-sites and team off-sites

  • Competitive compensation, company stock options, and 401(k)

Writer is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to Writer's Global Candidate Privacy Notice.

Top Skills

Go
Java
Python
