Anthropic · San Francisco/New York City/Seattle · Hybrid
Research Engineer, Reward Models Training
January 13, 2025
Description
- Own the end-to-end engineering of reward model training, from data ingestion through model evaluation and deployment (a minimal training-step sketch follows this list)
- Design and implement efficient, reliable training pipelines that can scale to increasingly large model sizes
- Build robust data pipelines for collecting, processing, and incorporating human feedback into reward model training
- Optimize training infrastructure for throughput, efficiency, and fault tolerance across distributed systems
- Extend reward model capabilities to support new domains and additional data modalities
- Collaborate with researchers to implement and iterate on novel reward modeling techniques
- Develop tooling and monitoring systems to ensure training quality and identify issues early
- Contribute to the design and improvement of our overall model training infrastructure
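For context on the core technique behind this role: reward models for RLHF are typically trained on human preference pairs with a Bradley-Terry ranking loss. Below is a minimal, hypothetical PyTorch sketch of one such training step, not Anthropic's actual code; `model`, `optimizer`, and the `batch` layout are illustrative assumptions.

```python
# Hypothetical sketch of one reward-model training step (illustrative only).
import torch
import torch.nn.functional as F

def reward_model_step(model, optimizer, batch):
    """One gradient step on a batch of human preference pairs.

    `batch` is assumed to contain token ids and attention masks for the
    preferred ("chosen") and dispreferred ("rejected") completions of the
    same prompt; `model` is assumed to return one scalar reward per sequence.
    """
    r_chosen = model(batch["chosen_ids"], batch["chosen_mask"])        # (B,)
    r_rejected = model(batch["rejected_ids"], batch["rejected_mask"])  # (B,)

    # Bradley-Terry objective: maximize the log-probability that the chosen
    # completion outranks the rejected one, i.e. minimize -log sigmoid(r_c - r_r).
    loss = -F.logsigmoid(r_chosen - r_rejected).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Pairwise ranking accuracy is a cheap training-quality signal to monitor.
    accuracy = (r_chosen > r_rejected).float().mean()
    return loss.item(), accuracy.item()
```

The pairwise ranking accuracy logged here is one example of the inexpensive signals the monitoring systems mentioned above can track to surface training issues early.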
Qualifications
You may be a good fit if you:
- Have significant experience building and maintaining large-scale ML systems
- Are proficient in Python and have experience with ML frameworks such as PyTorch
- Have experience with distributed training systems and optimizing ML workloads for efficiency
- Are comfortable working with large datasets and building data pipelines at scale
- Can balance research exploration with engineering rigor and operational reliability
- Enjoy collaborating closely with researchers and translating research ideas into reliable engineering systems
- Are results-oriented with a bias towards flexibility and impact
- Can navigate ambiguity and make progress in fast-moving research environments
- Adapt quickly to changing priorities, while juggling multiple urgent issues
- Maintain clarity when debugging complex, time-sensitive issues
- Pick up slack, even if it goes outside your job description
- Care about the societal impacts of your work and are motivated by Anthropic's mission
Strong candidates may also have experience with:
- Training or fine-tuning large language models
- Reinforcement learning from human feedback (RLHF) or related techniques
- GPUs, Kubernetes, and cloud infrastructure (AWS, GCP)
- Building systems for human-in-the-loop machine learning
- Working with multimodal data (text, images, audio, etc.)
- Large-scale ETL and data processing frameworks (Spark, Airflow)
Representative projects:
- Scaling reward model training to handle models with significantly more parameters while maintaining training stability
- Building a unified data pipeline that ingests human feedback from multiple sources and formats for reward model training
- Implementing fault-tolerant training infrastructure that gracefully handles hardware failures during long training runs (see the checkpointing sketch after this list)
- Developing evaluation frameworks to measure reward model quality across diverse domains
- Optimizing training throughput to reduce iteration time on reward modeling experiments
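As an illustration of the fault-tolerance project above, one standard pattern is periodic atomic checkpointing with resume-on-restart, so a hardware failure costs at most one checkpoint interval. The following is a hypothetical sketch under assumed paths and intervals, reusing the `reward_model_step` helper from the earlier sketch; it is not a description of Anthropic's infrastructure.

```python
# Hypothetical sketch of periodic atomic checkpointing with resume-on-restart.
import os
import torch

CKPT_PATH = "checkpoints/reward_model.pt"  # illustrative path

def save_checkpoint(model, optimizer, step):
    # Write to a temp file, then rename: os.replace is atomic on POSIX, so a
    # crash mid-save never corrupts the last good checkpoint.
    os.makedirs(os.path.dirname(CKPT_PATH), exist_ok=True)
    tmp_path = CKPT_PATH + ".tmp"
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "step": step}, tmp_path)
    os.replace(tmp_path, CKPT_PATH)

def load_checkpoint(model, optimizer):
    # Returns the step to resume from (0 on a fresh run).
    if not os.path.exists(CKPT_PATH):
        return 0
    state = torch.load(CKPT_PATH, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["step"] + 1

def train(model, optimizer, batches, save_every=1000):
    start = load_checkpoint(model, optimizer)
    for step, batch in enumerate(batches):
        if step < start:
            continue  # skip work already completed before the restart
        loss, _ = reward_model_step(model, optimizer, batch)  # from earlier sketch
        if (step + 1) % save_every == 0:
            save_checkpoint(model, optimizer, step)
```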
Compensation
Annual salary: $350,000 - $500,000 USD
Application
View the original listing to apply!