Remote contract opportunity for experienced AI power users to evaluate AI-generated responses to personalized life-assistance tasks using structured scoring rubrics and evaluation frameworks.
Personalized Life Assistant Expert
Job description
About the Role
A leading AI research organization is seeking experienced AI power users to evaluate how effectively advanced language models perform real-world, highly personalized life-assistance tasks. Contributors will assess AI outputs across domains such as productivity, career planning, health, learning, and food recommendations, helping improve the next generation of AI personal assistants.
This role is ideal for individuals who frequently use AI systems in their daily lives and can critically evaluate whether responses are practical, personalized, context-aware, and genuinely useful.
Pay: $50–$200/hr | Commitment: 15–40 hours/week | Paid Work Trial: $30
Key Responsibilities
AI Output Evaluation
- Assess AI-generated responses for usefulness, accuracy, personalization, and practicality
- Determine whether outputs successfully address real-world personal scenarios
- Identify strengths, weaknesses, and missing considerations in model responses
- Evaluate reasoning quality and contextual awareness
Rubric-Based Assessment
- Apply detailed evaluation rubrics to score AI performance
- Ensure consistency and objectivity across evaluations
- Contribute to quality assurance and benchmarking efforts
Prompt Development
- Create realistic prompts representing complex personal-life scenarios
- Design tasks involving multiple constraints, preferences, and tradeoffs
Failure Analysis
Identify situations where the AI:
- Misses critical context
- Produces unrealistic recommendations
- Gives incomplete advice
- Overreaches beyond the available information
- Provides unsafe guidance
Personal Assistant Benchmarking Domains
Food & Dining — Restaurant recommendations, menu comparisons, dietary restriction accommodation, budget-based recommendations
Health & Wellness — Health trend analysis, wearable data interpretation, sleep analysis, lifestyle recommendations, appropriate escalation to medical professionals
Productivity — Personal project management, task organization, calendar planning, personal CRM management
Career Development — Job searching, resume improvement, LinkedIn optimization, networking strategies, interview preparation
Learning & Growth — Study plans, skill development roadmaps, accountability systems, personal development planning
Required Qualifications
AI Experience
- Heavy personal use of LLM products (ChatGPT, Claude, Gemini, Perplexity, Cursor, Windsurf, Codex, or similar)
- Significant experience using AI for planning, research, decision-making, personal productivity, and multi-step workflows
Evaluation Skills
- Strong written communication and critical thinking
- Attention to detail and ability to explain evaluation decisions clearly
- 100+ hours of experience in rubric evaluation, quality assessment, or evaluation workflows
Preferred Qualifications
- Domain expertise in personal finance, career development, health and wellness, education, productivity systems, or food and dining
- Experience with LLM evaluation, AI training, benchmark creation, or human feedback projects
Equipment
- Desktop or laptop computer required
Engagement Details
- Contract Type: Independent Contractor
- Location: Remote (United States)
- Commitment: 15–40 hours/week
- Task Turnaround: Within 24 hours
- Work Trial: Paid ($30)
About Mercor
Mercor partners with leading AI labs and enterprises to train frontier AI systems using human expertise. Contributors work on high-impact projects that help shape the future of AI assistants and intelligent decision-support systems.