Write AI coding evaluation tasks, golden solutions, and automated verifiers for frontier AI labs. Remote contract, 30+ hrs/week, $35-120/hr (US/Canada $90-120/hr; Europe and LatAm scaled). 5+ years SWE required.
Code-Data Eval Author — Software Test Engineer / SDET (Pilot)
Job description
About the Role
Mercor partners with frontier AI labs to build the evaluations their coding models are trained and measured against. As an eval author / SDET, you'll define what "correct" means for the next generation of coding models.
What You'll Do
- Design verifiers and correctness rubrics for coding tasks
- Enumerate edge cases and build adversarial test cases for agent/model evaluation
- Grade agent trajectories and improve test/rubric quality through review
You Are
- 5+ years as an SDET / software test engineer at a real product organization
- Able to write code and tests with automation frameworks (pytest, Playwright, Cypress) and CI/CD
- A clear written communicator; familiarity with AI tools / evals is a plus
Engagement & Pay
- Remote contract, flexible 30+ hrs/week
- Hourly rate set to your local market: US/Canada $75–100/hr; Europe and LatAm scaled to region
- Hiring process is paid — $200 for completing the Technical Screen, Code Review Session, and Domain Expert Interview
Contract Terms
- Independent contractor, weekly payments via Stripe or Wise
- Unable to support H-1B or STEM OPT candidates