Rubrics Evaluator (Professional Experience)
Job description
Mercor is seeking experienced enterprise professionals to contribute to an AI evaluation and benchmarking project in partnership with a leading AI lab.
Contributors will evaluate the quality of AI-generated reasoning, decision-making, and analytical outputs across complex enterprise and operational scenarios.
This opportunity is ideal for professionals with strong analytical judgment and hands-on experience working in business, consulting, finance, operations, engineering, legal, or strategic environments.
About Mercor
Mercor partners with leading AI labs and enterprises to train and evaluate frontier AI systems using real-world human expertise.
Projects focus on:
- AI evaluation
- Benchmarking
- Human reasoning assessment
- Enterprise workflow simulation
- AI quality assurance
Key Responsibilities
Evaluate AI-generated outputs using:
- Structured rubrics
- Scoring frameworks
- Evaluation criteria
Assess:
- Qualitative reasoning
- Quantitative reasoning
- Decision-making quality
across enterprise-focused scenarios.
Review:
- Business analyses
- Operational workflows
- Legal reasoning
- Financial analysis
- Strategic recommendations
for:
- Accuracy
- Clarity
- Logic
- Completeness
Identify:
- Weak reasoning
- Inconsistencies
- Missing context
- Analytical gaps
in AI-generated responses.
Provide:
- Written justification
- Scoring rationale
- Structured evaluation feedback
to support assessment decisions.
Ideal Qualifications
Professional degree in areas such as:
- Business
- Finance
- Law
- Engineering
- Operations
- Related enterprise disciplines
4+ years of professional enterprise experience
Background in:
- Consulting
- Finance
- Legal operations
- Manufacturing
- Engineering
- Corporate strategy
- Business operations
Strong:
- Quantitative reasoning
- Analytical thinking
- Written communication skills
Ability to independently evaluate:
- Complex operational scenarios
- Strategic reasoning
- Business decision-making
Strong attention to detail
Application Process
- Update your Mercor profile with your latest professional experience
- Selected applicants may complete a short evaluation exercise
- Qualified contributors will receive additional project details and onboarding instructions
Contract & Payment Terms
Independent contractor role
Fully remote work environment
Flexible scheduling
Weekly payments through:
- Stripe
- Wise
Projects may be:
- Extended
- Shortened
- Ended early
based on project needs and contributor performance.
Additional Information
- H-1B and STEM OPT candidates are not currently supported
- Contributors may work on AI systems used by leading frontier AI labs and research organizations