Iterative self-improvement system with task complexity grading, strict quality gatekeeper role, confidence thresholds, and verification checklists
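
One way the four parts could fit together is sketched below in Python. Everything in it is an assumption made for illustration, not taken from the description: the names `grade_complexity`, `gatekeeper`, `improve`, and `toy_revise`, the three complexity grades, the threshold values, the word-count grading rule, and the definition of confidence as the fraction of checklist items that pass.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical confidence thresholds per complexity grade (illustrative values).
THRESHOLDS: Dict[str, float] = {"simple": 0.7, "moderate": 0.8, "complex": 0.9}

Checklist = Dict[str, Callable[[str], bool]]


def grade_complexity(task: str) -> str:
    """Toy grading rule (assumption): longer task descriptions are harder."""
    words = len(task.split())
    if words < 5:
        return "simple"
    if words < 15:
        return "moderate"
    return "complex"


@dataclass
class Review:
    confidence: float         # fraction of checklist items that passed
    failed_checks: List[str]  # names of the items that failed


def gatekeeper(draft: str, checklist: Checklist) -> Review:
    """Run every checklist item against the draft and score it."""
    failed = [name for name, check in checklist.items() if not check(draft)]
    return Review(1.0 - len(failed) / len(checklist), failed)


def improve(
    task: str,
    revise: Callable[[str, str, List[str]], str],
    checklist: Checklist,
    max_rounds: int = 5,
) -> Tuple[str, bool, int]:
    """Revise until the gatekeeper's confidence meets the graded threshold.

    Returns (final draft, accepted?, number of review rounds used).
    """
    threshold = THRESHOLDS[grade_complexity(task)]
    draft = revise(task, "", [])  # initial attempt, no feedback yet
    for round_no in range(1, max_rounds + 1):
        review = gatekeeper(draft, checklist)
        if review.confidence >= threshold:
            return draft, True, round_no
        # Feed the failed checks back into the next revision.
        draft = revise(task, draft, review.failed_checks)
    return draft, False, max_rounds


def toy_revise(task: str, draft: str, failed: List[str]) -> str:
    """Stand-in reviser (assumption): fixes the first failed check per round."""
    if not draft:
        return "draft"
    if failed and failed[0] == "mentions_sort":
        return draft + " using sort"
    if failed and failed[0] == "ends_with_period":
        return draft + "."
    return draft


CHECKLIST: Checklist = {
    "non_empty": lambda d: bool(d.strip()),
    "mentions_sort": lambda d: "sort" in d,
    "ends_with_period": lambda d: d.endswith("."),
}

if __name__ == "__main__":
    print(improve("sort a list", toy_revise, CHECKLIST))
```

In this sketch the gatekeeper never edits the draft itself; it only scores it and reports which checks failed, so the decision to accept stays separate from the act of revising. A harder task raises the bar the same checklist has to clear.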