Mercor seeks bilingual AI Safety Experts fluent in English and Malayalam to perform red teaming and adversarial testing on AI systems, identifying vulnerabilities and generating safety data to improve AI robustness. This remote contract role involves structured evaluation, documentation, and collaboration with leading AI researchers.
AI Safety Experts — English & Malayalam
Job description
Mercor is seeking bilingual AI Safety Experts fluent in both English and Malayalam to help evaluate and strengthen the safety of frontier AI systems.
This role focuses on red teaming AI models by identifying vulnerabilities, testing misuse scenarios, and generating high-quality safety data that helps improve model robustness and reliability.
The work involves evaluating AI outputs related to sensitive topics such as:
- Bias
- Misinformation
- Harmful behaviors
- Safety vulnerabilities
All work is text-based. Participation in higher-sensitivity projects is optional and supported by clear guidelines and wellness resources.
What You'll Do
Red Team AI Systems
- Test conversational AI models and agents
- Explore:
  - Jailbreak techniques
  - Prompt injection attacks
  - Misuse scenarios
  - Bias exploitation
  - Multi-turn manipulation strategies
Generate Human Safety Data
- Annotate model failures
- Classify vulnerabilities
- Identify systemic risks
- Create high-quality safety datasets
Apply Structured Evaluation
Follow established:
- Taxonomies
- Benchmarks
- Testing playbooks
Ensure evaluations remain consistent and reproducible
Document Findings
Produce:
- Reports
- Datasets
- Attack cases
- Safety evaluations
Create artifacts that can be used to improve AI systems
Who They're Looking For
Prior red teaming experience in areas such as:
- AI adversarial testing
- Cybersecurity
- Socio-technical risk analysis
Strong curiosity and adversarial thinking
Ability to systematically test systems rather than relying on random approaches
Strong communication skills for both technical and non-technical audiences
Ability to adapt across multiple projects and customer environments
Nice-to-Have Specialties
Adversarial Machine Learning
- Jailbreak datasets
- Prompt injection attacks
- RLHF/DPO attack methodologies
- Model extraction techniques
Cybersecurity
- Penetration testing
- Exploit development
- Reverse engineering
Socio-Technical Risk
- Harassment testing
- Misinformation analysis
- Abuse detection
- Conversational AI evaluation
Creative Adversarial Thinking
- Psychology
- Acting
- Writing
- Behavioral analysis
What Success Looks Like
- Discover vulnerabilities that automated testing misses
- Deliver reproducible safety artifacts
- Expand evaluation coverage across more scenarios
- Improve customer confidence in deployed AI systems
Why Join Mercor
- Gain experience in frontier AI safety and red teaming
- Work directly on improving AI robustness and trustworthiness
- Collaborate with leading AI researchers and organizations
Contract & Payment Terms
- Independent contractor engagement
- Fully remote work
- Flexible schedule
- Weekly payments through Stripe or Wise
- Projects may be extended, shortened, or concluded based on business needs and performance
Important Note
Mercor currently cannot support:
- H-1B candidates
- STEM OPT candidates
About Mercor
Mercor partners with leading AI labs and enterprises to train and evaluate frontier AI systems using human expertise.
Contributors work on projects that help shape the next generation of safe and reliable AI technologies.