0a726ebf21
finalize_run() is the single source of truth for terminal run state. Previously, save_evaluation_results wrote a binary pass/fail status and finished_at before rescue outcomes/metrics were saved, and finalize_run then overwrote that state. save_evaluation_results now writes scores only; finalize_run sets the terminal state exactly once, after all sub-results are persisted.
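
The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the Run dataclass, the save_rescue_outcomes helper, the status strings, and the 0.5 pass threshold are all assumptions made for the example. Only the names save_evaluation_results and finalize_run come from this change.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Run:
    # Hypothetical in-memory stand-in for the persisted run record.
    scores: dict = field(default_factory=dict)
    rescue_outcomes: list = field(default_factory=list)
    status: Optional[str] = None
    finished_at: Optional[datetime] = None


def save_evaluation_results(run: Run, scores: dict) -> None:
    # Writes scores only; status and finished_at are deliberately left untouched.
    run.scores.update(scores)


def save_rescue_outcomes(run: Run, outcomes: list) -> None:
    # Hypothetical helper: persists rescue outcomes before finalization.
    run.rescue_outcomes.extend(outcomes)


def finalize_run(run: Run) -> None:
    # The only place terminal state is set, after all sub-results are saved.
    if run.status is not None:
        raise RuntimeError("run already finalized")
    # Assumed rule: a run passes when every score meets a 0.5 threshold.
    passed = bool(run.scores) and all(v >= 0.5 for v in run.scores.values())
    rescued = any(o == "rescued" for o in run.rescue_outcomes)
    run.status = "passed" if passed else ("rescued" if rescued else "failed")
    run.finished_at = datetime.now(timezone.utc)
```

Because finalize_run reads the rescue outcomes when it computes the status, calling it last means the terminal state reflects every sub-result, and no earlier writer can leave a stale pass/fail value behind.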