Anthropic · London · Hybrid

Research Engineer, Production Model Post-Training, London

11/11/2025

Description

Anthropic's production models undergo sophisticated post-training processes to enhance their capabilities, alignment, and safety. As a Research Engineer on our Post-Training team, you'll train our base models through the complete post-training stack to deliver the production Claude models that users interact with.

You'll work at the intersection of cutting-edge research and production engineering, implementing, scaling, and improving post-training techniques like Constitutional AI, RLHF, and other alignment methodologies. Your work will directly impact the quality, safety, and capabilities of our production models.

Note: For this role, we conduct all interviews in Python. This role may require responding to incidents on short notice, including on weekends.

Responsibilities

  • Implement and optimize post-training techniques at scale on frontier models

  • Conduct research to develop and optimize post-training recipes that directly improve production model quality

  • Design, build, and run robust, efficient pipelines for model fine-tuning and evaluation

  • Develop tools to measure and improve model performance across various dimensions

  • Collaborate with research teams to translate emerging techniques into production-ready implementations

  • Debug complex issues in training pipelines and model behavior

  • Help establish best practices for reliable, reproducible model post-training

Qualifications

  • Thrive in controlled chaos and are energised, rather than overwhelmed, when juggling multiple urgent priorities

  • Adapt quickly to changing priorities

  • Maintain clarity when debugging complex, time-sensitive issues

  • Have strong software engineering skills with experience building complex ML systems

  • Are comfortable working with large-scale distributed systems and high-performance computing

  • Have experience with training, fine-tuning, or evaluating large language models

  • Can balance research exploration with engineering rigor and operational reliability

  • Are adept at analyzing and debugging model training processes

  • Enjoy collaborating across research and engineering disciplines

  • Can navigate ambiguity and make progress in fast-moving research environments

  • Have experience with LLMs

  • Have a keen interest in AI safety and responsible deployment

Benefits

£270,000 - £340,000 GBP
