Elastic

Senior Site Reliability Engineer (FinOps) - Platform

Posted 13 Hours Ago
Remote
Hiring Remotely in Spain
Senior level
The Senior Site Reliability Engineer will automate system engineering, grow platform infrastructure, improve reliability, and respond to major incidents using a collaborative approach.
The summary above was generated by AI

Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale — unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.

What is The Role:

As part of the Platform Engineering department, the SRE team designs, builds, scales and matures the multi-cloud platform that hosts internal and external services such as Elastic Cloud Hosted and Serverless. We develop and extend software and tools that support the rest of the infrastructure, so that we can rapidly deploy products from all corners of Elastic. We want your experience and recommendations to help us offer a truly exceptional customer experience!

What you will be doing:
  • Taking an engineering approach in leading technical initiatives for automating system engineering efforts to guarantee the reliability of the global Elastic infrastructure.
  • Growing our global Platform infrastructure to meet the increasing scaling demands by developing and maintaining software, tooling and automations.
  • Taking an inclusive approach to championing an environment focused on collaboration, operational excellence, and uplifting others.
  • Responding to major incidents and preventing repeated customer impact through prioritised problem management. Our on-call rotation uses a follow-the-sun model in which everyone participates, (mostly) during their working hours.
What you bring:
  • Successes and lessons learned from striving for 'progress not perfection' in the name of Platform reliability. We want to hear about your customer-first approach to solving operational problems from an SRE perspective.
  • A background in software engineering to collaborate with engineers to expertly identify, implement and deliver solutions. Experience with public cloud and managed Kubernetes services is advantageous.
  • Passion for developing solutions that involve inclusive communication methods to grow and strengthen partner and team relationships. Examples of working in distributed teams or working remotely are desirable.
Bonus Points:

You don't need to have all of these items, but these represent the types of work you will do as a Site Reliability Engineer at Elastic.

  • You have operated a SaaS product in a public cloud, ideally built using Infrastructure-as-Code tooling such as Crossplane or Terraform.
  • You have built or operated a Kubernetes-at-scale infrastructure, ideally across multiple cloud providers, and the vital automation to support it.
  • You have written non-trivial programs in Golang or other programming languages.
  • You have worked with containerized services (such as Docker).
  • You have proven experience in leading and improving alerting and major incident management standard processes, and in using metrics systems (e.g. Elastic Stack, Graphite, Prometheus, Influx) to diagnose issues and quantify impacts to present to others at varying levels of the organization.
  • You have experience in system administration with professional skills in Linux on distributed systems at scale.
  • You have diagnosed or designed, implemented and created solutions with the Elastic Stack.
  • You thrive in a self-organizing, globally distributed team environment built on sharing.
  • You help team members bring out the best in each other through coaching and mentoring.
Additional Information - We Take Care of Our People

As a distributed company, diversity drives our identity. Whether you’re looking to launch a new career or grow an existing one, Elastic is the type of company where you can balance great work with great life. Your age is only a number. It doesn’t matter if you’re just out of college or your children are; we need you for what you can do.

We strive to have parity of benefits across regions and while regulations differ from place to place, we believe taking care of our people is the right thing to do.

  • Competitive pay based on the work you do here and not your previous salary
  • Health coverage for you and your family in many locations
  • Ability to craft your calendar with flexible locations and schedules for many roles
  • Generous number of vacation days each year
  • Increase your impact - We match up to $2000 (or local currency equivalent) for financial donations and service
  • Up to 40 hours each year to use toward volunteer projects you love
  • Embracing parenthood with a minimum of 16 weeks of parental leave

Different people approach problems differently. We need that. Elastic is an equal opportunity employer and is committed to creating an inclusive culture that celebrates different perspectives, experiences, and backgrounds. Qualified applicants will receive consideration for employment without regard to race, ethnicity, color, religion, sex, pregnancy, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, disability status, or any other basis protected by federal, state or local law, ordinance or regulation.

We welcome individuals with disabilities and strive to create an accessible and inclusive experience for all individuals. To request an accommodation during the application or the recruiting process, please email [email protected]. We will reply to your request within 24 business hours of submission.

Applicants have rights under Federal Employment Laws, view posters linked below: Family and Medical Leave Act (FMLA) Poster; Pay Transparency Nondiscrimination Provision Poster; Employee Polygraph Protection Act (EPPA) Poster and Know Your Rights (Poster)

Elasticsearch develops and distributes technology and information that is subject to U.S. and other countries’ export controls and licensing requirements for individuals who are located in or are nationals of the following sanctioned countries and regions: Belarus, Cuba, Iran, North Korea, Syria, or Russia, including the Ukrainian territories annexed by Russia (The Crimea region of Ukraine, The Donetsk People's Republic (DNR), The Luhansk People's Republic (LNR), Kherson or Zaporizhzhia). If you are located in or are a national of one of the listed countries or regions, an export license may be required as a condition of your employment in this role. Please note that national origin and/or nationality do not affect eligibility for employment with Elastic.

Please see here for our Privacy Statement.

Top Skills

Docker
Elastic Stack
Go
Influx
Kubernetes
Linux
Prometheus
Terraform

Elastic London, England Office

5 Southampton Street, London, England, United Kingdom
