
Personalized Life Assistant Expert

Mercor

LLM & Agent Evaluation | Contractor | Ongoing
United States | $50–$200/hr | June 15, 2026

Job description

About the Role

A leading AI research organization is seeking experienced AI power users to evaluate how effectively advanced language models perform real-world, highly personalized life-assistance tasks. Contributors will assess AI outputs across domains such as productivity, career planning, health, learning, and food recommendations, helping improve the next generation of AI personal assistants.

This role is ideal for individuals who frequently use AI systems in their daily lives and can critically evaluate whether responses are practical, personalized, context-aware, and genuinely useful.

Pay: $50–$200/hr | Commitment: 15–40 hours/week | Paid Work Trial: $30

Key Responsibilities

AI Output Evaluation

  • Assess AI-generated responses for usefulness, accuracy, personalization, and practicality
  • Determine whether outputs successfully address real-world personal scenarios
  • Identify strengths, weaknesses, and missing considerations in model responses
  • Evaluate reasoning quality and contextual awareness

Rubric-Based Assessment

  • Apply detailed evaluation rubrics to score AI performance
  • Ensure consistency and objectivity across evaluations
  • Contribute to quality assurance and benchmarking efforts

Prompt Development

  • Create realistic prompts representing complex personal-life scenarios
  • Design tasks involving multiple constraints, preferences, and tradeoffs

Failure Analysis

Identify situations where the AI:

  • Misses critical context
  • Produces unrealistic recommendations
  • Gives incomplete advice
  • Overreaches beyond available information
  • Provides unsafe guidance

Personal Assistant Benchmarking Domains

Food & Dining — Restaurant recommendations, menu comparisons, dietary restriction accommodation, budget-based recommendations

Health & Wellness — Health trend analysis, wearable data interpretation, sleep analysis, lifestyle recommendations, appropriate escalation to medical professionals

Productivity — Personal project management, task organization, calendar planning, personal CRM management

Career Development — Job searching, resume improvement, LinkedIn optimization, networking strategies, interview preparation

Learning & Growth — Study plans, skill development roadmaps, accountability systems, personal development planning

Required Qualifications

AI Experience

  • Heavy personal use of LLM products (ChatGPT, Claude, Gemini, Perplexity, Cursor, Windsurf, Codex, or similar)
  • Significant experience using AI for planning, research, decision-making, personal productivity, and multi-step workflows

Evaluation Skills

  • Strong written communication and critical thinking
  • Attention to detail and ability to explain evaluation decisions clearly
  • 100+ hours of rubric evaluation, quality assessment, or evaluation workflow experience

Preferred Qualifications

  • Domain expertise in personal finance, career development, health and wellness, education, productivity systems, or food and dining
  • LLM evaluation, AI training, benchmark creation, or human feedback project experience

Engagement Details

  • Contract Type: Independent Contractor
  • Location: Remote (United States)
  • Commitment: 15–40 hours/week
  • Task Turnaround: Within 24 hours
  • Work Trial: Paid ($30)
  • Equipment: Desktop or laptop computer (required)

About Mercor

Mercor partners with leading AI labs and enterprises to train frontier AI systems using human expertise. Contributors work on high-impact projects that help shape the future of AI assistants and intelligent decision-support systems.

