Before sharing any Cowork output, classify it. This takes under 10 seconds and prevents most of the mistakes people make with AI-generated content.
4 objectives
- Identify the 5 AI failure modes by experiencing each one firsthand
- Classify any task as Green, Yellow, or Red in under 10 seconds
- Build a personal verification checklist you'll use for every future output
- Know exactly what data can and cannot go into Cowork
Recognizing Bad Output — Spot hallucinations, broken formulas, and misclassified data

Game 1: Spot the Hallucination
WEEKLY CHARGEBACK SUMMARY — Week of March 16-22, 2026
Merchant: TechFlow Payments Ltd | Merchant ID: MID-4847392
Period: 7 days | Total Transactions: £2,847,500 | Total Chargebacks: 18
CHARGEBACK BREAKDOWN BY REASON CODE
| Code | Description | Count | Amount |
|---|---|---|---|
| 10.4 | Fraud — Card Present | 7 | £8,400 |
| 10.3 | Cancelled by Merchant | 3 | £2,100 |
| 13.5 | Processing Error — Not Captured | 4 | £3,200 |
| 13.1 | Duplicate Processing | 2 | £1,800 |
| 4829 | Counterfeit Card (Card Not Present) | 2 | £2,100 |
TOTALS
Total Chargebacks by Count: 18 ✓
Total Chargebacks by Amount: £17,700
Chargeback Ratio: 0.62% (Below Visa threshold of 0.9%)
MERCHANT ASSESSMENT
Status: Safe — Ratio is below scheme thresholds. No escalation required.
Recommended Action: Continue monitoring; no intervention needed.
Reveal Answers
Math Error
The amounts sum to £17,600, not £17,700 — off by £100. Always re-check totals. Cowork doesn't calculate; it predicts what numbers look plausible.
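The re-check described here can be done outside Cowork in a few lines. The following is a minimal sketch that re-adds the rows of the sample report; the variable names are illustrative, and the figures are copied from the table above.

```python
# Rows copied from the sample report: (reason code, count, amount in GBP).
rows = [
    ("10.4", 7, 8400),
    ("10.3", 3, 2100),
    ("13.5", 4, 3200),
    ("13.1", 2, 1800),
    ("4829", 2, 2100),
]

# Totals as printed in the report's TOTALS section.
reported_count = 18
reported_amount = 17700

# Recompute both totals from the rows instead of trusting the report.
actual_count = sum(count for _, count, _ in rows)
actual_amount = sum(amount for _, _, amount in rows)
discrepancy = reported_amount - actual_amount

print(f"count:  reported {reported_count}, recomputed {actual_count}")
print(f"amount: reported £{reported_amount:,}, recomputed £{actual_amount:,}")
print(f"amount discrepancy: £{discrepancy:,}")
```

The count matches, but the amount does not, which is exactly the kind of error a visual skim misses.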
Classification Error
Code 4829 is card-present fraud, but the description says 'Card Not Present.' This would lead to the wrong dispute strategy, which is a real operational consequence.
Overconfident Assessment
'Safe' based on ONE week of data is statistically meaningless without merchant history. Cowork doesn't hedge unless you tell it to.
Missing Red Flag
9 out of 18 chargebacks (50%) are fraud-related. That concentration is a red flag Cowork ignored entirely because it wasn't asked about fraud concentration.
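The fraud share can be computed directly from the table rather than left for a reader to notice. This sketch is illustrative only: the keyword list and the 40% alert level are assumptions made for the example, not scheme rules or a Cowork feature, and the descriptions are abbreviated from the sample report.

```python
# (reason code, description as labelled in the sample report, count)
rows = [
    ("10.4", "Fraud - Card Present", 7),
    ("10.3", "Cancelled by Merchant", 3),
    ("13.5", "Processing Error - Not Captured", 4),
    ("13.1", "Duplicate Processing", 2),
    ("4829", "Counterfeit Card (Card Not Present)", 2),
]

# Assumption: a row counts as fraud-related if its description contains
# one of these keywords. A real check would use a maintained code list.
FRAUD_KEYWORDS = ("fraud", "counterfeit")

# Hypothetical alert level chosen for this example only.
FRAUD_SHARE_ALERT = 0.40

total = sum(count for _, _, count in rows)
fraud_count = sum(
    count
    for _, description, count in rows
    if any(keyword in description.lower() for keyword in FRAUD_KEYWORDS)
)
fraud_share = fraud_count / total

print(f"fraud-related: {fraud_count} of {total} ({fraud_share:.0%})")
if fraud_share >= FRAUD_SHARE_ALERT:
    print("flag: fraud concentration needs human review")
```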
Scheme Mixing
The report applies the Visa threshold (0.9%) to all chargebacks, but code 4829 is a Mastercard reason code with different thresholds and a different response deadline (45 days vs Visa's 30). Lumping schemes together could mean applying the wrong deadline to the wrong cases — a real operational risk.
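One way to avoid lumping schemes together is to tag each case with its scheme before applying any deadline. The sketch below is a rough heuristic, not a scheme specification: it assumes dotted codes such as 10.4 are Visa and four-digit codes such as 4829 are Mastercard, and it takes the 30-day and 45-day windows from the answer above. The function names and the received date are hypothetical, and the windows should be confirmed against current scheme rules.

```python
import re
from datetime import date, timedelta

# Response windows as stated in the answer above (not independently verified).
RESPONSE_DAYS = {"Visa": 30, "Mastercard": 45}


def scheme_for(code: str) -> str:
    """Guess the card scheme from the reason-code format (heuristic)."""
    if re.fullmatch(r"\d{2}\.\d(\.\d)?", code):
        return "Visa"
    if re.fullmatch(r"\d{4}", code):
        return "Mastercard"
    return "Unknown"


def response_deadline(code: str, received: date) -> date:
    """Return the response deadline for a case, or raise if the scheme is unknown."""
    scheme = scheme_for(code)
    if scheme not in RESPONSE_DAYS:
        raise ValueError(f"cannot determine scheme for reason code {code!r}")
    return received + timedelta(days=RESPONSE_DAYS[scheme])


# Hypothetical received date, used only to show the two deadlines diverging.
received = date(2026, 3, 20)
for code in ("10.4", "4829"):
    print(code, scheme_for(code), response_deadline(code, received))
```

Raising on an unknown format is deliberate: a case that cannot be assigned to a scheme should go to a person rather than silently inherit the wrong deadline.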
Game 2: The Math Trap
The Task
Open math-trap.csv from your data pack. Ask Cowork three simple maths questions. Then check its answers against the key below.
Cowork will get at least one wrong. LLMs don't calculate — they predict what numbers look right. When Cowork "adds" numbers, it's writing text that resembles a sum, not doing arithmetic. This is why Ben in Finance always opens Excel to verify before sending anything.
Data pack file: math-trap.csv
Choose Your Prompt
Read the file math-trap.csv in my workspace. Calculate: (1) the total of all Amount values, (2) the average Amount per transaction, and (3) how many transactions have an Amount greater than $5,000. Show your calculations step by step.
Scoring
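One way to score Cowork's answers is to compute the same three values independently. The sketch below assumes the file has a header row with an `Amount` column, as the prompt implies; the real layout of math-trap.csv is not shown here, so the inline sample rows and the `TxnID` column are hypothetical stand-ins.

```python
import csv
import io
from decimal import Decimal


def summarise(csv_file, threshold=Decimal("5000")):
    """Return total, average, and count of Amount values above threshold."""
    amounts = []
    for row in csv.DictReader(csv_file):
        # Strip currency symbols and thousands separators before parsing.
        raw = row["Amount"].strip().lstrip("$£").replace(",", "")
        amounts.append(Decimal(raw))
    if not amounts:
        raise ValueError("no Amount values found")
    total = sum(amounts)
    return {
        "total": total,
        "average": total / len(amounts),
        "over_threshold": sum(1 for amount in amounts if amount > threshold),
    }


# Hypothetical sample data standing in for math-trap.csv.
sample = io.StringIO(
    "TxnID,Amount\n"
    "T1,1200.50\n"
    "T2,5000.00\n"
    "T3,7300.25\n"
    "T4,450.00\n"
)
result = summarise(sample)
print(result)

# With the real file:
# with open("math-trap.csv", newline="") as f:
#     print(summarise(f))
```

`Decimal` is used instead of `float` so that currency values add up exactly. Note that a transaction of exactly 5,000 is not counted as "greater than 5,000", a boundary worth checking in Cowork's answer too.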
Verification Checklist
The Traffic Light
10 min
Data Sensitivity: CAN / CAUTION / NEVER
Safety & Compliance
What goes into Cowork and what stays out. Cowork is a research preview. Anthropic states it is not suitable for regulated workloads. Treat it accordingly.
Do's
- ✓ CAN: Aggregated data, sample data, internal docs, public info, templates, process descriptions
- ✓ CAUTION: Merchant names + amounts (anonymize if possible), CSV exports (strip PII first), chargeback case details (redact card numbers before pasting), AML screening results (use Merchant_A / Merchant_B, not real names)
- ✗ NEVER: Full card numbers (PAN), raw cardholder PII, unredacted KYC docs, credentials, API keys, unpublished audit findings
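The Merchant_A / Merchant_B substitution mentioned in the CAUTION tier can be applied consistently with a small mapping. This is a minimal sketch: the function names are illustrative, "Example Retail Ltd" is a made-up merchant, and it assumes the mapping is kept somewhere Cowork cannot read so that results can be translated back afterwards.

```python
from string import ascii_uppercase


def label(index: int) -> str:
    """Spreadsheet-style label: 0 -> A, 25 -> Z, 26 -> AA."""
    text = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        text = ascii_uppercase[remainder] + text
    return text


def pseudonymise(names):
    """Map each distinct name to Merchant_A, Merchant_B, ... in first-seen order."""
    mapping = {}
    for name in names:
        if name not in mapping:
            mapping[name] = f"Merchant_{label(len(mapping))}"
    return mapping


# "Example Retail Ltd" is a hypothetical name used for illustration.
names = ["TechFlow Payments Ltd", "Example Retail Ltd", "TechFlow Payments Ltd"]
mapping = pseudonymise(names)
print([mapping[name] for name in names])
```

The same merchant always gets the same alias, so aggregates computed on the pseudonymised data still line up with the originals.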
Don'ts
- ✗ Never paste a full card number into Cowork
- ✗ Never give Cowork access to cloud-synced folders (OneDrive) — it can modify or delete originals
- ✗ Never use Cowork to make final compliance decisions — it does not reason about regulation
- ✗ Never skip the approval step — Cowork asks before executing file operations for a reason
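A pre-paste scrub can catch card numbers that slip into text or CSV exports. The following is a sketch under stated assumptions, not a compliance control: it treats any run of 13 to 19 digits (optionally separated by spaces or hyphens) that passes the Luhn checksum as a PAN, and the function names and placeholder text are illustrative.

```python
import re

# Candidate PAN: 13 to 19 digits, optionally separated by spaces or hyphens.
PAN_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_pans(text: str) -> str:
    """Replace Luhn-valid card-number candidates with a placeholder."""
    def mask(match):
        digits = re.sub(r"\D", "", match.group())
        return "[REDACTED PAN]" if luhn_valid(digits) else match.group()

    return PAN_CANDIDATE.sub(mask, text)


# 4111 1111 1111 1111 is a widely used test number, not a real card.
print(redact_pans("Card 4111 1111 1111 1111 disputed on MID-4847392"))
```

The Luhn filter reduces false positives such as order references, but it also means a mistyped card number would pass through unredacted, so a manual read of the text is still needed before pasting.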
Ship It: Your Verification Checklist
3 min
Open verification-checklist-template.md from your data pack. This is your quality gate: the 30-second check you'll do before sharing ANY Cowork output.
Customise it for your role. Save it in your workspace. Use it every time.
The Draft Mindset: Every Cowork output is a first draft wearing a suit. Formatted headers, proper tables, confident language — all cosmetic. Until you've checked the numbers and verified the classifications, it's not done. The better it looks, the more carefully you check. Lucy forwarded a "perfect-looking" summary that had the wrong month in the title.
Steal This Prompt
Copy these prompts, customize the file paths to match your workspace, and use them in Cowork. Replace ~ with your actual home directory path if needed.
Error-Check Any AI Output
Key Takeaways
- Cowork makes professional-looking errors: the better it looks, the more carefully you check.
- LLMs do not calculate; they predict what numbers look right. Always verify math independently.
- Traffic Light (GREEN / YELLOW / RED): classify before sharing, every time.
- Every output is a draft until verified against your checklist.
- Cowork is a research preview launched January 2026. Treat its capabilities and limitations as provisional.
These are real incidents reported by early Cowork users in community discussions:
- 11 GB of files permanently deleted after a broad "clean up" instruction.
- iCloud originals destroyed because Cowork had access to a cloud-synced folder.
- 15 years of photos lost after a terminal command bypassed the approval step.
They happened because people skipped the approval step or granted overly broad folder access. Cowork asks for confirmation before file operations. That confirmation step exists for a reason. Do not auto-approve. Read what it's about to do.
Materials
Download these files to use with the exercises above.