Capco

ML Engineer (She/He/They)

Posted 3 Days Ago
Remote or Hybrid
Hiring Remotely in Poland
Mid level
Design and implement document ingestion pipelines, processing workflows, normalization, indexing into Azure AI Search, automated evaluation and monitoring, CI/CD and containerized services, and ensure security/compliance for knowledge artifacts.
The summary above was generated by AI

Location: Warsaw, Poland - Hybrid

Capco Poland is a leading global technology and management consultancy, dedicated to driving digital transformation across the financial services industry. Our passion lies in helping our clients navigate the complexities of the financial world, and our expertise spans banking and payments, capital markets, wealth, and asset management. We pride ourselves on maintaining a nimble, agile, and entrepreneurial culture, and we are committed to growing our business by hiring top talent.

ROLE OVERVIEW:


This role is responsible for designing and implementing ingestion pipelines, document processing workflows, data normalization, and integrations across Azure services (SharePoint, Microsoft Graph, Azure AI Search, Azure Document Intelligence, App Insights). The engineer will also implement automated evaluation frameworks and quality monitoring mechanisms.

The primary objective of this role is to enable domain agents, task agents, and the master agent to operate on properly prepared, complete, indexed, and secure knowledge artifacts, in full alignment with enterprise security, compliance, and banking standards.

Key Responsibilities
  • Design and develop data connectors (SharePoint, OneDrive, Microsoft Graph, GCP sources, external APIs) to enable automated document and metadata ingestion.

  • Implement ingestion pipelines (batch and event-driven) and processing orchestration mechanisms.

  • Process documents using Azure Document Intelligence, including parsing, OCR, layout extraction, and structured data extraction (tables, fields, confidence scoring).

  • Develop transformation and normalization workflows, including data cleaning, segmentation, PII masking, and generation of structured knowledge artifacts.

  • Index content into Azure AI Search and Knowledge Bases, including index design, indexers, skillsets, enrichment pipelines, embeddings, and vector stores.

  • Prepare evaluation datasets (baseline, ground truth, domain-specific test cases).

  • Automate quality evaluation of extraction and indexing processes (precision, recall, fidelity metrics, drift detection).

  • Implement instrumentation, logging, and monitoring using Application Insights and Log Analytics.

  • Optimize document processing costs (batch vs. on-demand processing strategies, layered caching, cost-per-document analysis).

  • Establish CI/CD pipelines and development standards (GitHub Actions, testing, code quality, linting, artifact registry, containerization).

  • Build and maintain containerized services (Docker, Azure Container Registry, Azure Container Apps / AKS / Web App for Containers).

  • Automate ingestion and evaluation workflows using Azure Functions, Logic Apps, and Durable Functions.

Must-Have Skills & Experience
  • Strong Python skills (asyncio, FastAPI, Pydantic, multiprocessing).

  • Hands-on experience with Azure SDK for Python (Storage, Cognitive Services, AI Search, Application Insights).

  • Practical experience with Microsoft Graph API / SharePoint API (file and metadata retrieval).

  • Azure AI Search: index design, indexers, skillsets, embeddings, vector search.

  • Azure Document Intelligence (OCR, layout extraction, custom models).

  • Testing experience (pytest, integration tests, cloud service mocking).

  • GitHub Actions (build, test, scanning, artifact management, deployment).

  • Docker (image building, multi-stage builds, layer optimization).

  • Logging and monitoring (Application Insights, Log Analytics).

  • Experience working with large document collections (batch ingestion at scale).

  • Data security best practices (PII masking, RBAC, Entra ID integration).

Nice-to-Have
  • MCP Client / MCP Tooling (custom agent integrations).

  • Experience with LangChain or Semantic Kernel (RAG / agent pipelines).

  • GCP experience (BigQuery, Looker, Cloud Functions) within a multi-cloud Knowledge Management architecture.

  • AKS / Azure Container Apps / Web App for Containers for scalable ingestion services.

  • Durable Functions for orchestrating large document processing workflows.


We offer a flexible collaboration model based on a B2B contract, with the opportunity to work on diverse projects.

Top Skills

Python, asyncio, FastAPI, Pydantic, multiprocessing, Azure SDK for Python, Azure Storage, Azure Cognitive Services, Azure AI Search, Azure Document Intelligence, Microsoft Graph API, SharePoint API, Application Insights, Log Analytics, pytest, GitHub Actions, Docker, Azure Container Registry, Azure Container Apps, AKS, Web App for Containers, Azure Functions, Logic Apps, Durable Functions, Entra ID (Azure AD), Vector Search, Embeddings
HQ

Capco London, England Office

77-79 Great Eastern Street, London, United Kingdom, EC2A 3HU
