Ripjar

Data Engineer

Posted 2 Days Ago
In-Office or Remote
4 Locations
Junior
Build and operate distributed ingestion and processing pipelines, ensure reliability and performance, define data contracts, add observability and testing, improve platform reliability and CI/CD, and participate in design/code reviews and incident retrospectives.

About Ripjar

Ripjar specialises in the development of software and data products that help governments and organisations combat serious financial crime. Our technology is used to identify criminal activity such as money laundering and terrorist financing, enabling organisations to enforce sanctions at scale to help combat rogue entities and state actors.

Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an ever-growing archive of billions of news articles covering most languages going back over 30 years, sanctions and watchlist data provided by governments, and vast organisation and ownership datasets.

About the Role

We see a Data Engineer as a software engineer who specialises in distributed data systems. You’ll join the Data Engineering team, whose prime responsibility is the development and operation of the Data Collection Hub, a platform that ingests data from many sources, processes/enriches it, and distributes it to multiple downstream systems.

We’re looking for someone with 2+ years of industry experience building and operating production software who enjoys working across data pipelines, distributed systems, and operational reliability.

What you’ll do

  • Engineer distributed ingestion services that reliably pull data from diverse sources, handle messy real-world edge cases, and deliver clean, well-structured outputs to multiple downstream products.
  • Build high-throughput processing components (batch and/or near-real-time) with a focus on performance, scalability, and predictable cost, using strong profiling and measurement practices.
  • Design and evolve data contracts (schemas, validation rules, versioning, backward compatibility) so downstream teams can build with confidence.
  • Own production quality: write maintainable code and strong unit/integration tests, and add the observability you need (metrics/logs/tracing) to diagnose issues quickly.
  • Improve platform reliability by hardening pipelines against partial failures, retries, rate limits, data drift, and infrastructure issues—then codify those learnings into better tooling and guardrails.
  • Contribute to CI/CD and developer experience: faster builds, better test signal, safer releases, and automated operational checks.
  • Participate in design reviews, code reviews, incident retrospectives, and iterative delivery—making pragmatic trade-offs and documenting them clearly.

Technology Stack

  • Languages: Predominantly Python and Node.js
  • Distributed/data platforms: HDFS, HBase, Spark, plus increasing use of Kubernetes and cloud services
  • Storage/search: MongoDB, OpenSearch
  • Orchestration: Airflow, Dagster, NiFi
  • Tooling: GitHub, GitHub Actions, Rundeck, Jira, Confluence
  • Deployment/config: Ansible (physical), Terraform / Argo CD / Helm (Kubernetes)
  • Development environment: MacBook (typical)

Requirements

Essential:

  • 2+ years building and operating production software systems
  • Fluency in at least one programming language (Python/Node.js a plus)
  • Experience debugging moderately complex systems and improving reliability/performance
  • Strong fundamentals: data structures, testing, version control, Linux basics

Nice to have:

  • Spark/PySpark experience
  • Hadoop ecosystem exposure (HDFS/HBase)
  • Workflow orchestration (Airflow/Dagster/NiFi)
  • Search/indexing (OpenSearch, MongoDB)
  • Kubernetes and infrastructure-as-code
  • Degree in Computer Science or another numerate discipline

Benefits

  • Competitive salary DOE
  • 25 days annual leave plus your birthday off, in addition to bank holidays, rising to 30 days after 5 years of service
  • Remote working
  • Private family healthcare
  • 35-hour working week
  • Employee Assistance Programme
  • Company contributions to your pension
  • Pension salary sacrifice
  • Enhanced maternity/paternity pay
  • The latest tech, including a top-of-the-range MacBook Pro

Top Skills

Python, Node.js, PySpark, Spark, HDFS, HBase, Hadoop, Kubernetes, MongoDB, OpenSearch, Airflow, Dagster, NiFi, GitHub, GitHub Actions, Rundeck, Jira, Confluence, Ansible, Terraform, Argo CD, Helm

Ripjar London, England Office

20 Saint Thomas Street, Runway East, London, United Kingdom, SE1 9RS
