Remote opportunity for experienced backend engineers to evaluate AI-generated code, review architecture and implementation decisions, and contribute to frontier AI coding model benchmarking projects.
Backend Engineer (Coding Agent Experience)
Job description
About the Role
Mercor is partnering with a leading AI research lab to support a Frontier Code Agents project.
This role focuses on evaluating and improving frontier AI coding models through realistic software engineering workflows and structured technical assessments.
Rather than traditional feature development, contributors use professional engineering judgment to assess, compare, and improve the outputs of advanced AI coding agents.
Project capacity is limited and onboarding occurs on a first-come, first-served basis.
Key Responsibilities
Evaluate AI Coding Agents
- Use frontier AI coding agents to complete and assess complex engineering tasks
- Review AI-generated code for correctness, maintainability, performance, and code quality
Technical Review & Analysis
- Identify bugs, edge cases, failure modes, and architectural weaknesses
- Analyze tradeoffs in AI-generated implementations
- Apply backend engineering expertise to realistic production-style scenarios
Model Comparison
- Compare outputs from multiple frontier coding models
- Evaluate strengths and weaknesses across different solutions
- Provide structured assessments that improve future model performance
Ideal Candidate Background
Required Experience
- Minimum 2 years of professional backend engineering experience
- Experience building APIs, distributed systems, microservices, backend platforms, and databases
AI Coding Tools
- Regular experience using AI-assisted development tools such as Cursor, Claude Code, Codex, Windsurf, or Gemini CLI
Preferred Qualifications
- Experience working on large-scale production systems
- Experience reviewing complex codebases and distributed architectures
Time Commitment
- Sprint-based engagement
- Projects typically run in 12–24 hour work windows
Compensation
- $400 per accepted task (~$85/hr effective rate)
- Most tasks take approximately 2–3 hours after onboarding
About Mercor
Mercor partners with leading AI labs and enterprises to train and evaluate frontier AI systems using expert human knowledge.