ThetaRay

Data Engineer

Posted 14 Days Ago
In-Office or Remote
Hiring Remotely in Madrid, Comunidad de Madrid
Junior
The Data Engineer will design, implement, and optimize data pipeline flows, create data tools for analytics, and support data scientists in detecting money laundering activities through data transformation.

About ThetaRay:

ThetaRay is a trailblazer in AI-powered Anti-Money Laundering (AML) solutions, offering cutting-edge technology to fintechs, banks, and regulatory bodies worldwide. Our mission is to enhance trust in financial transactions, ensuring compliant and innovative business growth. 

Our technology empowers customers to expand into new markets and introduce groundbreaking products.

Why Join ThetaRay?

At ThetaRay, you'll be part of a dynamic global team committed to redefining the financial services sector through technological innovation. You will contribute to creating safer financial environments and have the opportunity to work with some of the brightest minds in AI, ML, and financial technology. We offer a collaborative, inclusive, and forward-thinking work environment where your ideas and contributions are valued and encouraged.

Join us in our mission to revolutionize the financial world, making it safer and more trustworthy for millions worldwide. Explore exciting career opportunities at ThetaRay – where innovation meets purpose.

We are looking for a Data Engineer to join our growing team of data experts. As a Data Engineer, you will be responsible for designing, implementing, and optimizing data pipeline flows within the ThetaRay system. You will support our data scientists by implementing data flows based on their feature designs and by constructing complex rules to detect money laundering activity.

The ideal candidate has experience in building data pipelines and data transformations and enjoys optimizing data flows and building them from the ground up. They must be self-directed and comfortable supporting multiple production implementations for various use cases.


Responsibilities

  • Implement and maintain data pipeline flows in production within the ThetaRay system based on the data scientists' designs
  • Design and implement solution-based data flows for specific use cases so that implementations can be applied within the ThetaRay product
  • Build machine learning data pipelines
  • Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader
  • Work with product, R&D, data, and analytics experts to strive for greater functionality in our systems
  • Train customer data scientists and engineers to maintain and amend data pipelines within the product
  • Travel to customer locations both domestically and abroad
  • Build and manage technical relationships with customers and partners

Requirements

  • 2+ years of hands-on experience with Apache Spark (must)
  • Hands-on experience with SQL
  • Hands-on experience with version-control tools such as Git
  • Hands-on experience with the Apache Hadoop ecosystem, including Hive, Impala, Hue, HDFS, Sqoop, etc.
  • Experience with Python (Pandas)
  • Experience with PySpark/Scala/Java/R
  • Hands-on experience with data transformation, validations, cleansing, and ML feature engineering
  • BSc degree or higher in Computer Science, Statistics, Informatics, Information Systems, Engineering, or another quantitative field
  • Experience working with and optimizing big data pipelines, architectures, and data sets - an advantage
  • Strong analytic skills related to working with structured and semi-structured datasets
  • Build processes supporting data transformation, data structures, metadata, dependency, and workload management
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
  • Business-oriented and able to work with external customers and cross-functional teams
  • Fluent in English and Spanish, both written and spoken

Nice to have

  • Experience with Linux
  • Experience building machine learning pipelines
  • Experience with Elasticsearch
  • Experience with Zeppelin/Jupyter
  • Experience with workflow automation platforms such as Jenkins or Apache Airflow
  • Experience with microservices architecture components, including Docker and Kubernetes

Top Skills

Apache Hadoop
Spark
Git
HDFS
Hive
Hue
Impala
Java
Pandas
PySpark
Python
R
Scala
SQL
Sqoop
