Hebbia

Software Engineer, Site Reliability

Reposted 7 Days Ago

In-Office

New York City, NY

Senior level

Own and improve critical production services end-to-end by writing production-quality code: instrumenting services, eliminating performance bottlenecks, building deployment and observability platforms, defining SLOs, running incident response and post-mortems, capacity planning and cost optimization, maintaining CI/CD, and embedding with product teams to design reliable systems.

The summary above was generated by AI

About Hebbia

The AI platform for investors and bankers that generates alpha and drives upside.

Founded in 2020 by George Sivulka and backed by Peter Thiel and Andreessen Horowitz, Hebbia powers investment decisions for BlackRock, KKR, Carlyle, Centerview, and 40% of the world’s largest asset managers. Our flagship product, Matrix, delivers industry-leading accuracy, speed, and transparency in AI-driven analysis. It is trusted to help manage over $30 trillion in assets globally.

We deliver the intelligence that gives finance professionals a definitive edge. Our AI uncovers signals no human could see, surfaces hidden opportunities, and accelerates decisions with unmatched speed and conviction. We do not just streamline workflows. We transform how capital is deployed, how risk is managed, and how value is created across markets.

Hebbia is not a tool. Hebbia is the competitive advantage that drives performance, alpha, and market leadership.

The Role

We are looking for a Site Reliability Engineer who thinks like a software engineer first. You will own critical production systems end-to-end, designing, building, and improving them rather than simply operating them. You will write production-quality code that keeps the platform reliable at scale, embed with product engineering teams to influence architecture from the start, and build the internal tooling that every engineer at Hebbia depends on. This is not a ticket-driven ops role. You will spend most of your time writing code: instrumenting services, eliminating performance bottlenecks, building deployment platforms, and translating incident post-mortems into lasting architectural improvements.

Responsibilities

Own critical production services end-to-end, from design and code review through deployment, operation, and incident response
Profile, benchmark, and rewrite hot paths to eliminate bottlenecks as Hebbia scales
Lead incident response and drive post-mortem culture, translating findings into code changes and architectural improvements rather than runbooks
Design and build observability frameworks from scratch, writing custom instrumentation, alerting logic, and debugging tooling that surfaces production issues before customers feel them
Define and enforce SLOs across platform services and build the feedback loops that keep engineering teams accountable to them
Own capacity planning and cost efficiency: model growth, right-size infrastructure, and write automation that prevents over-provisioning and resource exhaustion
Build robust, well-tested internal platforms and deployment tooling held to the same engineering standards as customer-facing code
Own and continuously improve CI/CD systems so engineering teams can ship safely and quickly
Embed with product engineering teams as a peer software engineer, contributing directly to production codebases and co-designing systems for reliability from the start
Partner on infrastructure security through threat modeling, hardening, and automated compliance tooling

Who You Are

5+ years of software development experience with a track record of writing, shipping, and maintaining production services, not just operating infrastructure
Production-grade proficiency in at least one systems or backend language: Go, Python, C++, or Rust
Proven experience as a Production Engineer, SRE, or software engineer with a deep infrastructure focus, comfortable owning services end-to-end across the full stack
Deep understanding of distributed systems
Container orchestration expertise and hands-on experience debugging complex distributed failures in production
Working knowledge of OS-level concepts
Cloud platform fluency (AWS preferred)
Experience in building and maintaining observability stacks
Strong CI/CD pipeline expertise and a track record of improving developer velocity without sacrificing safety
Background at a company with a Production Engineering or software-focused SRE culture is a strong plus
Experience building platforms for AI/ML workloads or high-throughput document processing pipelines is a plus

Compensation

The salary range for this role is $160,000 to $300,000. This range may be inclusive of several career levels at Hebbia and will be narrowed during the interview process based on the candidate’s experience and qualifications. Adjustments outside of this range may be considered for candidates whose qualifications significantly differ from those outlined in the job description.

Life @ Hebbia

PTO: Unlimited

Insurance: Medical + Dental + Vision + 401K

Eats: Catered lunch daily + DoorDash dinner credit if you ever need to stay late

Parental leave policy: 3 months for non-birthing parents, 4 months for birthing parents

Fertility benefits: $15k lifetime benefit

New hire equity grant: competitive equity package with unmatched upside potential

Top Skills

AWS

C++

CI/CD

Container Orchestration

Observability Stacks

Python

Rust
