Verify proofs, check working, and catch the answers that look right but aren't — reviewing AI-generated mathematics at the STEM tier.
Tier: STEM
Mathematics subcontractors review the calculus, algebra, statistics, and proofs that AI models produce. Models often get the final answer right while the working is fabricated — and equally often, the presentation looks airtight while the answer is off. Both matter, and we test for the disposition that catches both.
Mathematics review is harder than it looks. A correct answer with bad working is still a failed output, because it can't be trusted at scale. Your job is to find where the reasoning broke, not just whether the final number matches.
The Cognition Test for Mathematics draws from a 150-question bank covering algebra, calculus, statistics, probability, discrete mathematics, and proof verification.
The domain section has 10 easy, 20 medium, and 10 hard questions. The behavioural section is not weighted by difficulty.
Verifying a model-generated proof step-by-step; spotting algebraic slips three steps back; catching a statistics answer that uses the wrong distribution; identifying a calculus solution where the differentiation is right but the integration is wrong.
Working-checking over answer-checking. Anyone with a calculator can verify a final number. We test for the disposition that traces the reasoning and finds the break.
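To make the difference concrete, here is a minimal sketch in Python of checking the working, not just the answer. It is an illustration only, not part of the test or of any review tooling: the names `agree`, `first_broken_step`, and `working` are hypothetical, and the derivation is an invented example. Each line of working is written as a function of x, and consecutive lines are compared at a few sample points. Numeric sampling is a heuristic that can flag a break; it does not prove a step correct.

```python
from typing import Callable, Optional, Sequence

Step = Callable[[float], float]

# Hypothetical sample points; any spread of values that avoids
# accidental agreement between two different expressions will do.
SAMPLE_POINTS = (-3.0, -1.5, 0.5, 2.0, 7.0)


def agree(f: Step, g: Step, tol: float = 1e-9) -> bool:
    """Return True if f and g give the same value at every sample point."""
    return all(
        abs(f(x) - g(x)) <= tol * max(1.0, abs(f(x)), abs(g(x)))
        for x in SAMPLE_POINTS
    )


def first_broken_step(steps: Sequence[Step]) -> Optional[int]:
    """Return the index of the first line that disagrees with the line before it."""
    for i in range(1, len(steps)):
        if not agree(steps[i - 1], steps[i]):
            return i
    return None


# Invented working for simplifying (x + 2)^2 - (x - 2)^2.
# Line 2 contains a slip (a stray + 8); line 3 silently drops it again,
# so the final answer 8x is correct even though the working is not.
working = [
    lambda x: (x + 2) ** 2 - (x - 2) ** 2,
    lambda x: (x**2 + 4 * x + 4) - (x**2 - 4 * x + 4),
    lambda x: 4 * x + 4 * x + 8,
    lambda x: 8 * x,
]

answer_matches = agree(working[0], working[-1])  # answer-checking
broken_at = first_broken_step(working)           # working-checking

print("final answer matches:", answer_matches)
print("first broken line:", broken_at)
```

In this example the answer check passes while the working check points at line 2, which is the kind of output a reviewer is expected to fail.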
Qualified Mathematics subcontractors are matched to live AI review work within seven days. Rates at the STEM tier go up to $40/hr.
What you need to know about this specialty before you sit the test.