Design graduate-level computational problems using PyMC, PyStan, FEniCS, FEniCSx, GUDHI, and related tools across Bayesian statistics, numerical PDEs, and computational topology, calibrating tasks against frontier AI models to build AI reasoning benchmarks.
Mathematics Specialist
Job description
About the Role
Turing is seeking a Mathematics Specialist (Mathematical Systems & Abstract Reasoning Engineer) to help evaluate and improve the reasoning capabilities of advanced AI systems.
In this role, you will create sophisticated mathematical reasoning datasets that challenge Large Language Models (LLMs) to interpret custom mathematical systems, apply formal rules and axioms, perform symbolic computations, derive properties, and construct logically sound proofs.
The position combines abstract mathematics, formal logic, problem design, and AI evaluation to help build more capable reasoning systems.
Key Responsibilities
Design Mathematical Systems
- Create novel mathematical systems and frameworks
- Define custom operations, formal rules, axioms, and symbolic structures
- Develop self-contained mathematical environments for reasoning evaluation
Create Advanced Reasoning Tasks
- Author multi-step challenges involving symbolic manipulation, expression evaluation, formal reasoning, proof construction, and property derivation
- Design tasks that assess both conceptual understanding and procedural accuracy
Develop Evaluation Materials
- Create deterministic solutions
- Write detailed reasoning explanations
- Build comprehensive scoring rubrics
- Establish objective evaluation criteria
Ensure Dataset Quality
- Identify and eliminate logical inconsistencies, ambiguities, and edge-case failures
- Improve robustness and reproducibility across evaluation datasets
Collaborate With AI Teams
- Work with reviewers and LLM researchers
- Refine task definitions and evaluation standards
- Support the development of high-quality reasoning benchmarks
Required Qualifications
- Strong foundation in Abstract Mathematics, Formal Logic, Algebraic Structures, Discrete Mathematics, and Theoretical Computer Science
- Minimum 2 years of experience in Mathematics, Computer Science, Data Science, Logic, or related analytical fields
- Proven ability to create structured reasoning and proof-based problems
- Excellent written communication and technical documentation skills
- Ability to write rigorous definitions, proofs, and reasoning processes
Preferred Qualifications
- Experience evaluating LLM reasoning performance
- Experience developing AI training datasets
- Experience building benchmark or assessment content for AI systems
Work Expectations
- Minimum 4-hour overlap with Pacific Time (PST) working hours
- Fully remote work environment
- Contractor engagement
- Duration of approximately 8 weeks
About Turing
Turing is a leading AI research accelerator that supports frontier AI laboratories and global enterprises through advanced training data, research expertise, and AI system development.