Mark Zuckerberg co-founded Facebook in 2004 with Dustin Moskovitz, Eduardo Saverin, Andrew McCollum, and Chris Hughes. The company rebranded to Meta in October 2021.

Is Meta hiring after the 2026 layoffs?

Yes — even after the ~8,000-role May 2026 cut, Meta continues to hire aggressively in AI research, ML infrastructure, and specialized engineering / product roles tied to the Superintelligence Labs build-out.

What is Reality Labs?

Meta's AR/VR/wearables division — Quest VR headsets, Ray-Ban Meta AI glasses (7M units sold in 2025), and Horizon Worlds. Q1 2026 revenue: $402M; operating loss: $4.03B.

How big is Meta's 2026 AI capex?

$125-145B raised guidance for 2026, primarily into AI infrastructure, model development, and Superintelligence Labs.

Is Meta still remote-friendly?

Largely not. Most roles require 3+ days/week in office near a hub since 2024. Select fully-remote roles exist in research, infosec, and integrity.

Browse open roles at metacareers.com.

Back to remote jobs

Data Labeling Analyst IV

Job description

Role Overview

We are aiming to hire cultural analysts and quality experts to evaluate and improve our auto-judge AI models that assess search quality for media content across US and Canadian markets. This is not an entry-level position—we need people who can think like researchers, write like analysts, and understand internet culture like digital anthropologists.
This role directly impacts model architecture decisions. Your evaluations will be read by ML engineers, product managers, and senior leadership. Incomplete analysis or surface-level observations will not meet our quality bar.

Core Responsibilities

Expert Reviews
Evaluate AI model outputs across subjective quality dimensions including relevance, tone, and intent with >98% inter-rater reliability with product experts
Assess search prompt suggestions rendered on user-watched media content
Validate search result relevance for various query types
Identify and categorize model failure patterns with technical precision
Compile actionable reports that bridge the gap between data observations and engineering solutions
Analysis & Reporting
Document nuanced cultural context with specific examples and pattern evidence
Create structured feedback that traces failure modes to root causes
Categorize errors systematically, providing statistical distribution of failure types
Deliver qualitative insights backed by quantitative metrics
Write reports that require minimal clarification—your analysis should stand alone

Required Qualifications

Cultural Fluency
Expert-level understanding of Gen-Z and Millennial communication patterns, slang evolution, and memetic spread
Proven ability to detect humor, satire, irony, and tonal nuances across diverse content types and demographics
Comprehensive knowledge of locale-specific terms, cultural events, and regional variations across North America
Comfort consuming and analyzing trending social media content and comments
Track record of staying ahead of trends, not just following them

Analytical Skills
Experience conducting qualitative research or content analysis (academic, professional, or equivalent)
Advanced pattern recognition abilities—you see what others miss and can prove it systematically
Ability to build taxonomies and categorization systems from unstructured observations
Statistical thinking—you quantify patterns, not just describe them
Root cause analysis expertise—you don't just identify problems, you diagnose why they exist
Exceptional written communication—your reports require no follow-up questions

Technical Aptitude
Comfort with data platforms and complex evaluation tools (training provided, but you must be tech-savvy)
Conceptual understanding of how ML models work—you can reason about what models can/can't learn
Experience with QA, user research, content analysis, or similar structured evaluation work
Ability to write for technical audiences—engineers should be able to implement your recommendations

PST Preferred

Understanding Model Failure Modes
What are failure modes? These are consistent, identifiable patterns in how AI models make incorrect decisions. Your job is to spot these patterns and help us understand why they happen.
Common Failure Mode Categories You'll Identify:
1. Cultural Context Blindness
Example: Model rates "it's giving main character energy" as irrelevant to a video about confident behavior because it doesn't recognize Gen-Z slang
What you'll do: Flag when models miss cultural idioms, slang, or contextual phrases that change meaning
2. Sarcasm & Tone Misinterpretation
Example: Model suggests "helpful tutorial" for a obviously satirical "how-to" video mocking the format
What you'll do: Identify when models take ironic, sarcastic, or humorous content at face value
3. Comment Section Context Ignorance
Example: Model misses that top comments reveal video is about a different topic than the title suggests (clickbait, inside jokes)
What you'll do: Document cases where comments provide critical context the model overlooks
4. Temporal/Trending Reference Failures
Example: Model doesn't connect "Roman Empire" searches to the viral TikTok trend about things men think about constantly
What you'll do: Catch when models miss current events, memes, or trending cultural moments
5. Regional/Locale-Specific Misses
Example: Model doesn't understand "double-double" in Canadian context or regional slang like "wicked" (New England) vs standard usage
What you'll do: Identify geographical or regional context the model lacks
6. Intent Mismatch
Example: User searches "Ariana Grande" on a beauty video, model doesn't recognize they're looking for makeup tutorials inspired by her style
What you'll do: Flag when models miss the why behind search behavior
7. Multi-Modal Understanding Gaps
Example: Model evaluates relevance based only on text, missing that visual content or audio contradicts or adds crucial context
What you'll do: Identify when models need to "watch" not just "read"

Model Failure Modes Categorization Framework
For each failure you identify, you'll document:
WHAT Failed
Which quality dimension? (relevance, tone, intent)
What did the model get wrong?
WHY It Failed
What context/knowledge was missing?
Which failure mode category does this fit?
HOW OFTEN It Happens
Is this a one-off edge case or systematic pattern?
Does it affect specific content types/demographics?
IMPACT Level
Minor: Slightly off but understandable
Moderate: Clearly wrong, hurts user experience
Severe: Completely misses the mark, could be offensive/harmful

Example Scenarios:
Scenario 1: Complex Multi-Factor Analysis
Model fails on parody content in 40% of cases, succeeds in 60%
Expected output:
Build a taxonomy of parody types (deadpan, exaggerated, format parody, etc.)
Identify which types model handles well vs. poorly
Analyze linguistic/visual markers of each type
Determine decision boundary where model fails
Map to specific model limitations
Recommend graduated improvement approach
Format: 5-page analysis with data visualizations
Scenario 2: Trend Analysis
New slang term emerging, model doesn't recognize it yet
Expected output:
Track term origin and spread (platforms, demographics, geography)
Predict longevity (flash-in-pan vs. lasting)
Assess urgency of model update
Document semantic meaning and usage contexts
Identify related terms model should learn
Format: 1-page rapid brief (time-sensitive)

PST Preferred

This posting is for a contract assignment with Tundra Technical Solutions to provide services to Meta.

The pay range that Tundra in good faith reasonably expects to pay for this position is $21.77/hour - $30.06/hour.

Tundra’s benefits offering includes optional medical, dental, vision, retirement benefits, up to 15 Days PTO per annum and a New Child Benefit.

Pursuant to the California Fair Chance Act, Los Angeles County Fair Chance Ordinance for Employers, Los Angeles Fair Chance Initiative for Hiring Ordinance, and San Francisco Fair Chance Ordinance, qualified applicants will be considered for assignment with arrest and conviction records. Criminal history may have a direct, adverse, and negative relationship with some of the material job duties of this position. These include the duties and responsibilities listed above, as well as the abilities to adhere to company policies, exercise sound judgment, effectively manage stress and work safely and respectfully with others, exhibit trustworthiness, meet client expectations, standards, and accompanying requirements, and safeguard business operations and company reputation.

Tundra Technical Solutions is among North America’s leading providers of Staffing and Consulting Services. Our success and our clients’ success are built on a foundation of service excellence. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic. Qualified applicants with arrest or conviction records will be considered for employment in accordance with applicable law, including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Unincorporated LA County workers: we reasonably believe that criminal history may have a direct, adverse and negative relationship with the following job duties, potentially resulting in the withdrawal of a conditional offer of employment: client provided property, including hardware (both of which may include data) entrusted to you from theft, loss or damage; return all portable client computer hardware in your possession (including the data contained therein) upon completion of the assignment, and; maintain the confidentiality of client proprietary, confidential, or non-public information. In addition, job duties require access to secure and protected client information technology systems and related data security obligations.

The Meta CWX Program is enabled by a cutting-edge software platform called TalentNet that leads the contingent labor world for technology innovation. The software platform leverages Machine Learning and Artificial Intelligence to make sure the right people end up in the right job.

At Meta, we are constantly iterating, solving problems, and working together to connect people all over the world. That’s why it’s important that our workforce reflects the diversity of the people we serve. Hiring people with different backgrounds and points of view helps us make better decisions, build better products, and create better experiences for everyone.

We give people the power to build community and bring the world closer together. Our products empower more than 3 billion people around the world to share ideas, offer support, and make a difference.

This posting is for a contract assignment with Tundra Technical Solutions to provide services to Meta. Please note that this is not a full-time employment opportunity. Candidates selected for this role will be engaged as contractors for the specified duration of the project. For any inquiries regarding the terms of the contract or engagement, please contact Tundra Technical Solutions directly.

Apply now

You will be redirected to the company's website to complete your application.

Apply now

Please let Meta know you found this job on RemoWork. This helps us grow!

Job summary

Company: Meta
Location: United States
Work Type: Contractor
Posted on: May 20, 2026
Employment: Contractor
Language: English
Work authorization: Required
Authorized countries: United States
State / region restrictions: US: California
Category: AI & Data Training

Skills

AI Culture Architecture Machine Learning Leadership Analysis Precision Engineering Reporting ROOT Communication Slang Research Training Quality Assurance

Job description

Categories

Related jobs

Search Engine Evaluation Specialist - Freelance AI Trainer Project

AI Training Generalist (No Prior Experience Needed) - Freelance AI Trainer Project

Catalan Audio Transcription Verifier

Flemish Audio Transcription Verifier

Frequently asked questions