Augury

ML Ops Engineer

Posted 8 Days Ago
Remote
Hiring Remotely in Haifa
Senior level
Design, build, and operate production-grade MLOps platform capabilities across the ML lifecycle: data, features, training, deployment, monitoring, retraining, lineage, and evaluation. Implement experiment tracking, artifact management, orchestration, and observability for large-scale training and operational AI systems (including LLM/agentic systems). Deliver reusable platform tooling, CI/CD, testing standards, and production reliability to scale Industrial AI applications.
The summary above was generated by AI

Our mission is to transform how people and machines work together to push the boundaries of human productivity. A leader in Industrial AI, Augury helps the world’s manufacturers leverage real-time production insights to drive new levels of efficiency. By combining predictive and prescriptive AI technology with industry expertise, Augury enables production teams to proactively address alerts, minimize downtime, reduce asset costs, and maximize yield and capacity. Our customers achieve payback in six months or less, enabling global scale. We're looking for team members excited to partner with the world's manufacturers and build the future of production together.

We are looking for an MLOps Engineer with strong production engineering experience building and operating scalable ML and AI systems.

This is a software-first MLOps platform role focused on production reliability, ML lifecycle management, large-scale training infrastructure, operational AI systems, and reusable platform capabilities.

You will help build and scale the production platform behind Augury’s Industrial AI Workforce, enabling teams across the company to develop, evaluate, deploy, and operate ML and AI systems consistently and safely.

A Day In Your Life

  • Design and evolve production MLOps capabilities across the full ML lifecycle, including datasets, features, models, evaluations, deployments, monitoring, retraining, and feedback signals.
  • Build systems for experiment tracking, artifact management, reproducibility, versioning, lineage, promotion workflows, and production readiness.
  • Develop reusable platform tooling, golden paths, and engineering standards that improve consistency and delivery velocity across teams.
  • Build operational infrastructure for LLM and agentic systems, including prompts, tools, traces, evaluations, observability, safety boundaries, and production monitoring.
  • Design evaluation and monitoring frameworks for AI systems, covering answer quality, latency, grounding, reliability, and operational regressions.
  • Build and optimize large-scale training pipelines supporting heterogeneous data sources and scalable compute patterns.
  • Write clean, modular, production-grade Python services and platform libraries.
  • Drive engineering quality through automated testing, CI/CD, observability, deployment standards, and operational best practices.

What You Bring

  • 5+ years of professional software engineering, MLOps, or ML platform engineering experience in production environments.
  • Significant experience building or owning production ML infrastructure and lifecycle systems.
  • Strong Python engineering skills with production-grade architecture, modular design, testing, packaging, and robust error handling.
  • Strong understanding of the end-to-end ML lifecycle, including training, deployment, monitoring, retraining, reproducibility, and lineage.
  • Experience working with large-scale data platforms such as Databricks, Spark, Delta Lake, or equivalent ecosystems.
  • Experience with ML platform and MLOps frameworks such as MLflow, Metaflow, Kubeflow, or equivalent ML lifecycle-management systems.
  • Proven ability to design reusable workflow orchestration using Airflow, Metaflow, or Databricks, covering automation, scheduling, dependency management, and production reliability.
  • Familiarity with operational patterns for LLMOps, AgentOps, and production AI systems.
  • Strong written and verbal communication skills in English.

Nice to Have

  • Experience with industrial, IoT, or manufacturing platforms.
  • Experience with feature stores, model registries, dataset versioning, and lineage systems.
  • Experience with AI agents, RAG systems, production GenAI applications, or evaluation frameworks.

Why This Role Matters

This role is an opportunity to help build the production foundation behind Augury’s Industrial AI Workforce.

You will help transform ML and AI work from isolated experimentation into scalable, observable, reliable, and reusable production systems powering the next generation of industrial AI.

If you enjoy building production-grade AI platforms, scaling ML systems on modern data infrastructure, and shaping the operational backbone of Industrial AI, we would love to meet you.

 

Augury is a people-first organization. We believe in fostering an inclusive environment in which employees feel encouraged to share their unique perspectives, leverage their strengths, and act authentically. We know that diverse teams are strong teams, and we welcome those from all backgrounds and varying experiences. We are committed to providing employees with a work environment free of discrimination and harassment. We believe that diversity is more than just good intentions, and we are committed to creating an inclusive environment for all employees.

Augury is a proud equal opportunity employer. We strive to create a work environment in which everyone (applicants, employees, customers, guests, and vendors) feels safe and comfortable. We commit to maintaining a workplace that is free of any type of harassment and do not tolerate anyone intimidating, humiliating, or hurting others. We prohibit willful discrimination based on age, gender, ethnicity, race, color, religion, political opinions, sexual orientation, sexual identity or expression, military or veteran status, disability, or any other characteristic protected by law.


