The AI Engineer will oversee the AI model lifecycle from training to deployment, ensuring performance and reliability through high-quality Python code and infrastructure management.
AI Engineer
Position Overview
We are seeking an AI Engineer to join our Global Analytics team in London. This role is focused on the end-to-end lifecycle of production-grade AI, from training and fine-tuning specialized models to architecting high-performance inference pipelines.
The ideal candidate views AI as a rigorous engineering discipline. Beyond building models, you will be responsible for writing high-quality, maintainable Python code and ensuring that every solution—whether a voice agent or a document processor—is built for reliability, low latency, and global scale.
Key Responsibilities
- Model Training & Fine-Tuning: Lead the adaptation of Large Language Models (LLMs) for domain-specific tasks using parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA to balance performance with resource efficiency.
- Inference Optimization: Architect and optimize inference pipelines to minimize TTFT (Time to First Token) and maximize throughput. This includes implementing quantization, caching strategies, and efficient batching.
- Production Engineering: Build and maintain real-time AI pipelines using WebSockets and Server-Sent Events (SSE), ensuring low-latency delivery for voice (ASR/TTS) and text applications.
- Architecture & MLOps: Deploy and orchestrate models within containerized microservice architectures (Docker/Kubernetes), ensuring robust monitoring, security, and scalability.
- Collaborative Delivery: Work closely with Business Analysts and internal stakeholders to bridge the gap between commercial requirements and technical implementation.
Qualifications
Technical Requirements
- Professional Experience: 5+ years in AI/ML engineering with a documented history of moving complex models from research into production.
- Python Mastery: Deep proficiency in Python. You have a strong commitment to clean coding standards (SOLID/DRY), modular design, and comprehensive unit/integration testing.
- Generative AI Deep Dive: Hands-on experience with LLM training cycles, parameter-efficient fine-tuning (PEFT), and sophisticated prompt engineering.
- Inference Stack: Experience with high-performance inference servers (e.g., vLLM, TGI, or Triton) and an understanding of how to optimize models for GPU deployment.
- Infrastructure: Comfortable working in Linux-based environments and proficient in managing containerized workloads and automated CI/CD pipelines.
- Advanced RAG: Experience building production-ready Retrieval-Augmented Generation systems, including vector database management and semantic search optimization.
Preferred Qualifications
- Experience in the insurance or financial services sector.
- Deep knowledge of GPU architecture, CUDA, and hardware-level performance optimization.
- Familiarity with Document Intelligence frameworks (OCR, layout analysis, and multimodal extraction).
- Fluency in Mandarin (required).
Top Skills
Docker
Kubernetes
LoRA
PEFT
Python
QLoRA
SSE
TGI
Triton
vLLM
WebSockets