AI Quality Analyst (Personalization) - Korean

Turing

Bilingual LLM Evaluator Contractor
Remote (Global) | $15/hr | June 14, 2026

Job description

About the Role

Turing is seeking Korean-speaking AI Quality Analysts to evaluate a new personalization feature for Gemini. In this role, you will assess how effectively the AI uses information from a user's Gemini conversations, Gmail, Google Search history, and YouTube activity to deliver more relevant, personalized, and helpful responses.

The position combines analytical thinking, prompt design, content evaluation, and AI quality assessment. You will create realistic conversational scenarios based on personal experiences, compare model outputs, identify personalization issues, and provide detailed feedback that helps improve the next generation of AI systems.

Key Responsibilities

Design Personalized Evaluation Scenarios

  • Create multi-turn conversational prompts (typically 1–5 turns)
  • Develop realistic scenarios based on personal experiences and context
  • Test how effectively AI systems use personal information to improve responses

Evaluate AI Responses

  • Assess responses for relevance, helpfulness, personalization quality, natural language quality, and user experience
  • Verify whether personal information was used appropriately

Assess Grounding & Accuracy

  • Identify unsupported assumptions, incorrect personalization, hallucinations, weak inferences, and misuse of personal data
  • Ensure claims about the user are supported by available evidence

Evaluate Integration Quality

  • Determine whether personalization feels natural
  • Identify robotic or excessive personalization
  • Review how personal context is incorporated into responses

Compare Model Outputs

  • Evaluate side-by-side (SxS) model responses
  • Rank responses based on helpfulness, accuracy, ease of use, and overall quality
  • Provide detailed justification for rankings

Documentation & Quality Assurance

  • Write clear and structured evaluation rationales
  • Reference specific conversation turns when documenting findings
  • Verify model debug information and data usage
  • Maintain evaluation data hygiene by removing testing conversations when required

Required Qualifications

Language Requirements

  • Native or near-native Korean proficiency
  • Strong Korean reading and writing skills
  • Ability to communicate clearly in English

Personalization Evaluation Requirements

  • Willingness to use a primary personal Google account
  • Comfort with enabling personal Google data sources for evaluation purposes

Professional Skills

  • Strong analytical and critical thinking abilities
  • Ability to evaluate nuanced and ambiguous AI responses
  • Experience designing creative prompts and test cases
  • Strong attention to detail and excellent written communication skills
  • Ability to work independently in a remote environment

Preferred Qualifications

  • Bachelor's degree or equivalent experience in Policy, Law, Ethics, Linguistics, Journalism, Computer Science, or related analytical disciplines
  • Experience in data annotation, AI quality evaluation, content moderation, model assessment, or quality assurance

Compensation

  • $15 per hour

About Turing

Turing is a leading AI research accelerator that works with frontier AI laboratories and global enterprises to develop advanced AI systems, training data, evaluation frameworks, and reinforcement learning environments.
