Design verifiers, correctness rubrics, and adversarial test cases that decide whether AI-generated code actually works. Remote contract, 30+ hrs/week, $30-100/hr (US/Canada $75-100/hr). 5+ years SDET required.
Code-Data Eval Author — Software Engineer (Pilot)
Job description
About the Role
Mercor partners with frontier AI labs to build the evaluations their coding models are trained and measured against. As an eval author / software engineer, you'll define what "correct" means for the next generation of coding models.
What You'll Do
- Author non-trivial coding tasks with golden solutions and automated verifiers
- Design rubrics and grade agent trajectories / model outputs
- Improve task and rubric quality through structured review
You Are
- Roughly 5+ years of software engineering experience at a real product organization (big tech or venture-backed startup)
- Strong code quality, systems design, debugging, and testing discipline
- Clear written communication — you write instructions others follow
- Familiarity with AI coding tools / evals is a plus, not a requirement
Engagement & Pay
- Remote contract, flexible 30+ hrs/week
- Hourly rate set to your local market: US/Canada $90–120/hr; Europe and LatAm scaled to region
- Hiring process is paid — $200 for completing the Technical Screen, Code Review Session, and Domain Expert Interview
Contract Terms
- Independent contractor, weekly payments via Stripe or Wise
- Unable to support H-1B or STEM OPT candidates