Agentic Coding Annotator - Online / Offline Tasks

Turing

AI Expert - Software Engineering Contractor · Full-time Short-term
Remote (Global) · Not specified · May 6, 2026

Job description

About Turing

Turing is one of the world’s fastest-growing AI companies, accelerating the advancement and deployment of powerful AI systems.

Turing partners with leading AI labs to improve frontier models across reasoning, coding, agentic behavior, multimodality, multilinguality, STEM, and advanced software intelligence.

---

Role Overview

Turing is seeking experienced software practitioners to work as Agentic Coding Annotators supporting frontier AI coding model evaluation projects.

This role focuses on evaluating realistic software engineering workflows within agentic coding environments. Candidates will review model-generated coding trajectories, validate outputs, design coding tasks, and provide structured annotations and evaluations.

This is an advanced engineering evaluation role requiring strong debugging, validation, and software reasoning capabilities.

---

Key Responsibilities

Online Evaluations

  • Interact with blinded AI coding models on predefined software tasks
  • Evaluate and rank generated coding trajectories
  • Validate outputs through testing and debugging

Offline Evaluations

  • Design realistic multi-step coding tasks
  • Simulate user workflows and engineering objectives
  • Create detailed evaluation rubrics and grading criteria
  • Review and assess generated model outputs

General Responsibilities

  • Execute coding tasks within agentic coding harnesses
  • Run tests, commands, scripts, and debugging workflows
  • Inspect logs and generated artifacts
  • Perform manual and automated validation checks
  • Write evidence-based evaluation rationales
  • Maintain process consistency and schema compliance
  • Escalate broken environments or unclear workflows

---

Required Skills & Qualifications

5+ years of experience in:

  • Software Engineering
  • QA Engineering
  • Developer Tooling
  • Data Engineering
  • ML Engineering
  • Similar code-heavy technical roles

Strong programming experience in at least one or two of the following ecosystems:

  • Python
  • JavaScript / TypeScript
  • Rust
  • Java
  • C / C++
  • Bash / CLI environments
  • Haskell
  • Swift
  • SQL

Candidates must be able to:

  • Read unfamiliar codebases
  • Run and interpret tests/scripts
  • Debug complex issues
  • Evaluate implementation correctness
  • Analyze edge cases and partial fixes

---

Preferred Qualifications

  • Strong Docker experience
  • Experience working with large production repositories
  • Strong software architecture judgment
  • Experience designing realistic engineering tasks
  • Ability to create non-trivial coding workflows beyond tutorial-level exercises

---

Work Details

  • Fully remote contractor role
  • 8 hours/day commitment
  • Minimum 4-hour overlap with PST required
  • Contract duration: 5 weeks
  • Expected start date: next week

---

Why Join Turing

  • Work on frontier AI coding systems
  • Contribute to advanced LLM evaluation pipelines
  • Collaborate with global engineering teams
  • Flexible remote-first environment
  • High-impact work improving agentic coding models