May 9, 2026 / Mathematics

Verify the proof, not just the answer.

A model produces a confident-looking calculus solution with the wrong final answer. Your job is to find where the reasoning broke. In other cases the answer is right but the working is fabricated. Both the answer and the working matter.

The example

A model produces a confident-looking solution to a calculus problem. The presentation is clean. The final answer is wrong.

Your job is to find where the reasoning broke. The break is rarely obvious: often it's an algebraic slip three steps back, or a dropped sign in a substitution.
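
As an illustration of a dropped sign in a substitution, here is a small worked example. The integral and the substitution are hypothetical, chosen for this sketch only; they are not taken from any actual model output or test question.

```latex
% Hypothetical example: evaluate I = \int_0^1 x e^{-x^2} dx with u = -x^2.
\begin{align*}
u &= -x^2, \qquad du = -2x\,dx \;\Rightarrow\; x\,dx = -\tfrac{1}{2}\,du \\
\text{Correct:}\quad I &= -\tfrac{1}{2}\int_{0}^{-1} e^{u}\,du
   = -\tfrac{1}{2}\left(e^{-1} - 1\right)
   = \tfrac{1}{2}\left(1 - e^{-1}\right) \approx 0.316 \\
\text{Faulty:}\quad I &= \tfrac{1}{2}\int_{0}^{-1} e^{u}\,du
   = \tfrac{1}{2}\left(e^{-1} - 1\right) \approx -0.316
\end{align*}
```

The faulty line looks just as tidy as the correct one. A quick sanity check exposes it: the integrand is non-negative on [0, 1], so the result cannot be negative, and tracing back from there leads to the line where x dx was rewritten without its minus sign.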

Why this is the work

Mathematics review is harder than it looks because models often get the answer right while the working is fabricated. Both the answer and the working matter. A correct answer with bad working is still a failed output, because reasoning that reaches the right result by accident can't be trusted at scale.
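
One way to picture checking the working separately from the answer is to test every claimed step, not only the last line. The sketch below is a minimal illustration under stated assumptions: the function, the "model working", and the helper names `numeric_derivative` and `check_steps` are all hypothetical, and a numerical spot check at a few points is only a screening aid, not a proof.

```python
# Minimal sketch: compare each claimed derivative against a numerical
# estimate at a few sample points. All names and the example working
# are hypothetical, invented for illustration.

def numeric_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def check_steps(f, steps, points=(0.5, 1.0, 2.0), tol=1e-4):
    """Return (label, ok) for each claimed expression for f'(x)."""
    results = []
    for label, claimed in steps:
        ok = True
        for x in points:
            expected = numeric_derivative(f, x)
            # Relative tolerance, with a floor of 1.0 for small values.
            if abs(claimed(x) - expected) > tol * max(1.0, abs(expected)):
                ok = False
                break
        results.append((label, ok))
    return results


# Hypothetical model output for d/dx [x^2 * x^3]:
# the intermediate step misapplies the product rule,
# yet the final line states the correct derivative.
f = lambda x: x**2 * x**3
steps = [
    ("step 2: 2x * 3x^2", lambda x: 2 * x * 3 * x**2),
    ("final: 5x^4", lambda x: 5 * x**4),
]

report = check_steps(f, steps)
for label, ok in report:
    print(label, "consistent" if ok else "INCONSISTENT")
```

Here the final line passes while the intermediate step fails, which is the case described above: the answer is right, but the working that supposedly produced it does not hold up.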

What qualifying looks like

Pass the Mathematics Cognition Test (60 questions, 78% pass mark) and you're qualified for AI subcontractor work at the STEM tier — up to $40/hr.

Jobpeak Editorial

Platform team
