Capco

Tester AI QA (Polish is Mandatory) (She/He/They)

Posted 3 Days Ago

Remote or Hybrid

Hiring Remotely in Poland

Entry level

QA / AI QA Tester responsible for designing test suites and evaluation datasets for LLM/agent systems, performing response quality, PII/compliance, guardrail, performance, integration and UX testing, analyzing telemetry, documenting results, and supporting AI/domain teams on defects and data drift.

The summary above was generated by AI

CAPCO POLAND

Location: Warsaw, Poland

Preferred work model: 3 days per week from the office

At Capco Poland, we’re not just another consultancy - we’re the spark behind digital transformation in the financial world. As a global leader in technology and management consulting, we thrive on helping clients tackle the toughest challenges across banking, payments, capital markets, wealth, and asset management.

Our secret?

A culture that’s fast, flexible, and fiercely entrepreneurial. We move quickly, think creatively, and always put our people first.

We’re passionate about growth - both for our clients and ourselves - and that means attracting the very best talent to join us on this exciting journey.

We’re proud to be:
• Trailblazers in banking, payments, capital markets, wealth, and asset management
• Champions of an agile, nimble, and innovative work environment
• Dedicated to building a team of top-notch professionals who share our drive and vision

ROLE OVERVIEW

We’re looking for a detail-oriented QA / AI QA Tester with experience in testing LLM- or agent-based systems (or strong QA experience with a focus on AI). The ideal candidate brings a structured approach to designing test cases and evaluation datasets, understands AI quality metrics, and is passionate about improving the reliability, stability, and overall quality of enterprise AI solutions.

Fluency in Polish is mandatory.

KEY RESPONSIBILITIES:

Design and maintain business test suites (functional, scenario-based, regression) for the Master Agent and domain agents.
Build evaluation datasets (PL/EN, domain-specific), including positive/negative queries, edge cases, and out-of-scope scenarios.
Perform response quality evaluation using metrics such as:
- Accuracy
- Top-k recall
- Groundedness
- Hallucination rate
- Refusal policy compliance
Conduct PII and compliance testing: validation of masking, anonymization, and sensitive data handling.
Test guardrails, including:
- Undesired output handling
- Prompt security testing
- “I don’t know” policy enforcement
Perform performance and resilience testing: latency, SLA compliance, pipeline stability.
Validate conversational UX (conversation flow, intent recognition, fallback handling, language detection).
Test integrations with:
- Copilot Studio
- Azure AI Search
- Azure OpenAI / Foundry
- Document Intelligence
- SharePoint Online
Analyze logs and telemetry (App Insights, Log Analytics) and identify anomalies.
Document test results and recommendations, and ensure traceability of test cases.
Support AI and domain teams in diagnosing defects, data drift, and quality regression.
Participate in periodic knowledge quality reviews and verify compliance with KM governance rules.

KEY TECHNOLOGIES USED BY THE TEAM:

Copilot Studio (knowledge agents)
Azure AI Search, Azure OpenAI / Foundry
Document Intelligence (OCR, table extraction)
SharePoint Online (knowledge sources)
App Insights + Log Analytics (telemetry)
Python (pandas, requests)
GitHub Actions (CI/CD)
BigQuery / Looker (analytics)

SKILLS & EXPERIENCE TO GET THE JOB DONE:

Experience in testing LLM-based or agent-based systems, or classical QA experience with a strong interest in transitioning to AI QA.
Ability to design business scenarios, test cases, and evaluation datasets.
Basic Python skills (pandas, REST APIs, simple evaluation scripts).
Familiarity with Copilot Studio and integration with domain agents.
Basic knowledge of Azure AI Search, SharePoint Online, and Document Intelligence (ability to interpret OCR/DI outputs).
Understanding of automated evaluation methods (LLM scoring, auxiliary models, benchmark evaluation).

Nice to have:

Experience with multicloud testing (GCP BigQuery/Looker, Azure, optionally Fabric).
Experience with Document Intelligence in the context of OCR and table extraction quality assessment.
Experience working with GitHub Actions (CI) and automated testing pipelines.
Basic understanding of the Model Context Protocol (MCP) in agent-based systems.
Experience in data drift analysis and automated evaluation frameworks.

IMPORTANT

Fluent Polish (spoken and written) – mandatory.
Good command of English for documentation and collaboration.
Availability for hybrid work: 3 days per week on-site at the Warsaw office, with the remaining days remote.

ONLINE RECRUITMENT PROCESS STEPS

Screening call with the Recruiter
Hiring Manager Technical Interview
Client Interview
Feedback/Offer

We offer a flexible collaboration model based on a B2B contract, with the opportunity to work on diverse projects.

Top Skills

Copilot Studio, Azure AI Search, Azure OpenAI, Foundry, Document Intelligence, SharePoint Online, Application Insights, Log Analytics, Python, pandas, requests, REST APIs, GitHub Actions, BigQuery, Looker, OCR

Capco London, England Office: 77-79 Great Eastern Street, London, United Kingdom, EC2A 3HU
