Why This Role Exists
Mercor partners with leading AI teams to improve the quality, usefulness, and reliability of general-purpose conversational AI systems.
This project focuses on evaluating and improving general chat behavior in large language models (LLMs). You will assess AI-generated responses across diverse topics and provide structured human feedback to ensure outputs are accurate, well-reasoned, and aligned with human expectations.
What You’ll Do
- Evaluate LLM-generated responses for clarity, correctness, and completeness.
- Conduct fact-checking using trusted public sources and verification tools.
- Annotate strengths, weaknesses, and factual inaccuracies.
- Assess reasoning quality, tone, and conversational alignment.
- Ensure outputs comply with system guidelines and expected behavior.
- Apply consistent annotations using structured taxonomies and evaluation rubrics.
Who You Are
- Bachelor’s degree holder.
- Native Hindi speaker, or Hindi proficiency at ILR 5 / CEFR C2.
- Fluent in English.
- Experienced user of large language models (LLMs).
- Strong writing skills with the ability to provide nuanced feedback.
- Highly detail-oriented and analytical.
- Comfortable working across diverse topics and domains.
- Strong college-level mathematics skills.
Nice-to-Have
- Experience with RLHF, model evaluation, or annotation workflows.
- Experience comparing multiple outputs and making fine-grained qualitative judgments.
- Familiarity with evaluation rubrics and benchmarking systems.
- Background in research, analytics, linguistics, or engineering.
What Success Looks Like
- You consistently identify factual inaccuracies and reasoning gaps.
- Your evaluation artifacts are clear, consistent, and reproducible.
- Your feedback leads to measurable improvements in AI response quality.
- AI systems improve before public deployment as a result of your evaluations.
Contract & Payment
- Independent contractor engagement.
- Fully remote with flexible schedule.
- Weekly payments via Stripe or Wise.
- Open only to contributors based in India or the USA.
- $12.19 per hour.
About Mercor
Mercor partners with leading AI labs and enterprises to train frontier models using human expertise. Contributors collaborate with researchers to improve advanced AI systems used globally.