AI Quality Analyst (Personalization) - Korean
Job description
About the Role
Turing is seeking Korean-speaking AI Quality Analysts to evaluate a new personalization feature for Gemini. In this role, you will assess how effectively the AI uses information from a user's Gemini conversations, Gmail, Google Search history, and YouTube activity to deliver more relevant, personalized, and helpful responses.
The position combines analytical thinking, prompt design, content evaluation, and AI quality assessment. You will create realistic conversational scenarios based on personal experiences, compare model outputs, identify personalization issues, and provide detailed feedback that helps improve the next generation of AI systems.
Key Responsibilities
Design Personalized Evaluation Scenarios
- Create multi-turn conversational prompts (typically 1–5 turns)
- Develop realistic scenarios based on personal experiences and context
- Test how effectively AI systems use personal information to improve responses
Evaluate AI Responses
- Assess responses for relevance, helpfulness, personalization quality, natural language quality, and user experience
- Verify whether personal information was used appropriately
Assess Grounding & Accuracy
- Identify unsupported assumptions, incorrect personalization, hallucinations, weak inferences, and misuse of personal data
- Ensure claims about the user are supported by available evidence
Evaluate Integration Quality
- Determine whether personalization feels natural
- Identify robotic or excessive personalization
- Review how personal context is incorporated into responses
Compare Model Outputs
- Evaluate side-by-side (SxS) model responses
- Rank responses based on helpfulness, accuracy, ease of use, and overall quality
- Provide detailed justification for rankings
Documentation & Quality Assurance
- Write clear and structured evaluation rationales
- Reference specific conversation turns when documenting findings
- Verify model debug information and data usage
- Maintain evaluation data hygiene by deleting test conversations when required
Required Qualifications
Language Requirements
- Native or near-native Korean proficiency
- Strong Korean reading and writing skills
- Ability to communicate clearly in English
Personalization Evaluation Requirements
- Willingness to use a primary personal Google account
- Comfortable enabling personal Google data sources for evaluation purposes
Professional Skills
- Strong analytical and critical thinking abilities
- Ability to evaluate nuanced and ambiguous AI responses
- Experience designing creative prompts and test cases
- Strong attention to detail and excellent written communication skills
- Ability to work independently in a remote environment
Preferred Qualifications
- Bachelor's degree or equivalent experience in Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or related analytical disciplines
- Experience in data annotation, AI quality evaluation, content moderation, model assessment, or quality assurance
Compensation
- $15 per hour
About Turing
Turing is a leading AI research accelerator that works with frontier AI laboratories and global enterprises to develop advanced AI systems, training data, evaluation frameworks, and reinforcement learning environments.