First American Title

Site Reliability Engineer

Posted 8 Days Ago
Santa Ana, CA
Mid level
As a Site Reliability Engineer, you'll enhance application reliability, design resilient infrastructure, automate processes, and improve observability across platforms.
The summary above was generated by AI

Who We Are

Join a team that puts its People First! Since 1889, First American (NYSE: FAF) has held an unwavering belief in its people. They are passionate about what they do, and we are equally passionate about fostering an environment where all feel welcome, supported, and empowered to be innovative and reach their full potential. Our inclusive, people-first culture has earned our company numerous accolades, including being named to the Fortune 100 Best Companies to Work For® list for nine consecutive years. We have also earned awards as a best place to work for women, diversity and LGBTQ+ employees, and have been included on more than 50 regional best places to work lists. First American will always strive to be a great place to work, for all. For more information, please visit www.careers.firstam.com.

What We Do

** Remote Work Welcome **
Be part of a transformative team that is shaping the way First American builds and delivers world-class technology products that fuel the real estate industry. We are looking for the best-of-the-best technology experts who will envision, design, build, and deliver innovative solutions that provide exceptional experiences and lasting value to our customers.
We’re looking for a Site Reliability Engineer who is passionate about building reliable, scalable, and observable systems. Our mission is to eliminate operational toil, automate repetitive tasks, and embed reliability into everything we do. You'll play a critical role in driving availability, performance, and operational excellence across our platforms.
This role is ideal for problem-solvers who thrive in collaborative environments and are eager to grow.

What You'll Do 

  • Support and enhance the reliability of in-house applications and systems across production and non-production environments. 

  • Design and implement resilient infrastructure and automate operational processes using IaC tools and pipelines. 

  • Improve observability by implementing monitoring, logging, alerting, and SLOs to detect and respond to issues proactively. 

  • Lead and participate in incident response, root cause analysis, and the creation of actionable postmortems. 

  • Collaborate with development, infrastructure, and support teams to drive reliability-focused architecture and tooling decisions. 

  • Identify and automate manual tasks using scripts, runbooks, and self-healing solutions to reduce operational overhead. 

  • Champion an engineering enablement model—empowering developers to own and operate what they build. 

What You'll Bring 

  • Proven experience supporting and improving distributed software systems in production. 

  • Proficiency in scripting and automation (PowerShell, Python, Bash, Node.js, or similar). 

  • Strong understanding of observability practices and tools (ELK Stack, OTel, AWS CloudWatch, Azure Application Insights). 

  • Hands-on experience with one or more cloud platforms (AWS preferred, Azure/GCP acceptable). 

  • Solid knowledge of CI/CD pipeline automation (GitHub Actions, Azure DevOps). 

  • Experience with infrastructure as code (Terraform and Ansible). 

  • Familiarity with incident response practices, SLAs/SLIs/SLOs, and service ownership models. 

  • Excellent communication and collaboration skills with a mindset for continuous learning and improvement. 

Pay Range: $82,925 - $110,525 Annually 

This hiring range is a reasonable estimate of the base pay range for this position at the time of posting. Pay is based on a number of factors which may include job-related knowledge, skills, experience, business requirements, and geographic location. 

What We Offer

By choice, we don’t simply accept individuality – we embrace it, we support it, and we thrive on it! Our People First Culture celebrates diversity, equity and inclusion not simply because it’s the right thing to do, but also because it’s the key to our success. We are proud to foster an authentic and inclusive workplace For All. You are free and encouraged to bring your entire, unique self to work. First American is an equal opportunity employer in every sense of the term.

** Note that the following statements only apply to candidates who will be working from an unincorporated area within Los Angeles County. **

First American will consider for employment all qualified applicants, including those with arrest or conviction records, in a manner consistent with the requirements of applicable state and local laws (e.g., the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act).

First American intends to conduct a review of an applicant’s criminal history in connection with a conditional offer. First American reasonably believes that a criminal history may have a direct, adverse and negative relationship with the following material job duties for this position potentially resulting in the withdrawal of the conditional offer of employment: handling of confidential, proprietary or trade secret information belonging to First American or its customers, administering or facilitating financial transactions, and the ability to meet customer-imposed criminal history requirements.

Based on eligibility, First American offers a comprehensive benefits package including medical, dental, vision, 401k, PTO/paid sick leave and other great benefits like an employee stock purchase plan.

Top Skills

Ansible
AWS
Azure
Azure DevOps
Bash
ELK Stack
GitHub Actions
Node.js
OTel
PowerShell
Python
Terraform
