Remote AI evaluation opportunity for bilingual Dutch and English speakers to perform transcription, annotation, audio evaluation, rubric development, and language model benchmarking for leading AI research projects.
Dutch Audio Generalist Evaluator Expert
Job description
Mercor is seeking Dutch Audio Generalist Evaluator Experts to contribute to a high-impact AI research project focused on audio understanding, transcription, annotation, and model evaluation.
Contributors will help train and benchmark advanced language models by converting audio and video content into high-quality structured text and evaluating AI-generated responses against detailed quality standards.
This role is ideal for bilingual Dutch and English speakers with strong writing, analytical, and critical thinking skills.
Key Responsibilities
Transcribe and Optimize Audio & Video
- Create detailed transcriptions of audio and video content in Dutch
- Follow detailed project instructions, formatting standards, and quality requirements
- Produce supporting content in English when required
- Ensure accuracy, clarity, and consistency across deliverables
Define and Document Evaluation Standards
- Establish expectations for high-quality responses in consumer audio contexts
- Create detailed evaluation rubrics and grading guidelines
- Document evaluation standards in Dutch and English
- Maintain consistency across reviewers and evaluation tasks
Conduct Model Testing and Grading
- Test language model outputs using structured prompts
- Evaluate responses for accuracy, completeness, instruction following, clarity, and overall quality
- Apply predefined evaluation criteria consistently
Support Benchmarking and Quality Assurance
- Participate in quality assurance and review workflows
- Validate tasks and rubrics before benchmark integration
- Ensure consistency, reliability, and technical accuracy across project deliverables
Minimum Qualifications
- Strong writing, critical thinking, and analytical reasoning skills
- Fluency in Dutch and English
- Ability to work independently
- Strong time-management skills
- Ability to meet deadlines
- Availability during GMT or PST working hours
Preferred Qualifications
- College student or graduate
- Experience in transcription, annotation, evaluation workflows, and research projects
- Interest in artificial intelligence, language models, and applied research
Application Process
- Complete a short AI-led interview (approximately 15 minutes)
- Selected candidates will be invited to join the project
Additional Information
- Expected commitment: 3–6 months
- Fully remote work environment
- Flexible schedule
- Structured project workflow with clear goals and tooling
- Weekly payments through Stripe or Wise
About Mercor
Mercor partners with leading AI labs and enterprises to train and evaluate frontier AI systems using expert human knowledge.
Contributors help improve next-generation AI models through transcription, evaluation, benchmarking, and high-quality training data creation.