Code-Data Eval Author — Machine Learning Engineer (Pilot)

Mercor

Machine Learning Engineer Contractor
Argentina, Austria, Belgium, Brazil, Canada, Chile, Colombia, Czech Republic, Denmark, Finland, France, Germany, Ireland, Italy, Mexico, Netherlands, Norway, Peru, Poland, Portugal, Romania, Spain, Sweden, Switzerland, United Kingdom, United States, Uruguay
$45 – $140/hr
June 9, 2026

Job description

About the Role

Mercor partners with frontier AI labs to build the evaluations their coding models are trained and measured against. As an eval author / ML engineer, you'll define what "correct" means for the next generation of coding models.

What You'll Do

  • Design ML/LLM evaluation tasks, rubrics, and metrics
  • Grade model/agent outputs and improve eval quality through review
  • Bring training-side judgment (SFT / RLHF / reward modeling) to eval design

You Are

  • 5+ years as an MLE at a real product organization, with hands-on experience in training/fine-tuning and evals
  • Ideally fluent in SFT / RLHF / reward modeling / eval metrics
  • PyTorch/JAX, Hugging Face, experiment tracking; clear written communication

Engagement & Pay

  • Remote contract, flexible 30+ hrs/week
  • Hourly rate set to your local market: US/Canada $100–140/hr; Europe and LatAm scaled to region
  • Hiring process is paid — $200 for completing the Technical Screen, Code Review Session, and Domain Expert Interview

Contract Terms

  • Independent contractor, weekly payments via Stripe or Wise
  • Unable to support H-1B or STEM OPT candidates