Anthropic · San Francisco/New York City/Seattle · Hybrid

Research Engineer, Reward Models Training

1/13/2025

Description

  • Own the end-to-end engineering of reward model training, from data ingestion through model evaluation and deployment
  • Design and implement efficient, reliable training pipelines that can scale to increasingly large model sizes
  • Build robust data pipelines for collecting, processing, and incorporating human feedback into reward model training
  • Optimize training infrastructure for throughput, efficiency, and fault tolerance across distributed systems
  • Extend reward model capabilities to support new domains and additional data modalities
  • Collaborate with researchers to implement and iterate on novel reward modeling techniques
  • Develop tooling and monitoring systems to ensure training quality and identify issues early
  • Contribute to the design and improvement of our overall model training infrastructure

Qualifications

You may be a good fit if you:

  • Have significant experience building and maintaining large-scale ML systems
  • Are proficient in Python and have experience with ML frameworks such as PyTorch
  • Have experience with distributed training systems and optimizing ML workloads for efficiency
  • Are comfortable working with large datasets and building data pipelines at scale
  • Can balance research exploration with engineering rigor and operational reliability
  • Enjoy collaborating closely with researchers and translating research ideas into reliable engineering systems
  • Are results-oriented with a bias towards flexibility and impact
  • Can navigate ambiguity and make progress in fast-moving research environments
  • Adapt quickly to changing priorities while juggling multiple urgent issues
  • Maintain clarity when debugging complex, time-sensitive issues
  • Pick up slack, even if it goes outside your job description
  • Care about the societal impacts of your work and are motivated by Anthropic's mission

Strong candidates may also have experience with:

  • Training or fine-tuning large language models
  • Reinforcement learning from human feedback (RLHF) or related techniques
  • GPUs, Kubernetes, and cloud infrastructure (AWS, GCP)
  • Building systems for human-in-the-loop machine learning
  • Working with multimodal data (text, images, audio, etc.)
  • Large-scale ETL and data processing frameworks (Spark, Airflow)

Representative projects you might work on:

  • Scaling reward model training to handle models with significantly more parameters while maintaining training stability
  • Building a unified data pipeline that ingests human feedback from multiple sources and formats for reward model training
  • Implementing fault-tolerant training infrastructure that gracefully handles hardware failures during long training runs
  • Developing evaluation frameworks to measure reward model quality across diverse domains
  • Optimizing training throughput to reduce iteration time on reward modeling experiments

Benefits

$350,000 - $500,000 USD

Application

View the original listing and apply!