REDKWEEN: Frozen Victim (3B vs 8B)

A 3-billion-parameter adversary learns to jailbreak a frozen 8-billion-parameter victim over 20 rounds.
Six episodes show the adversary's evolution from accidents to strategy.

Round 0 / 20, Attack Success Rate: 0.0%
Adversary: Llama 3B
Victim: Llama 8B (frozen)
Panels: Adversary Attack, Victim Response
Figure: Attack Success Rate Over 20 Rounds
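
The page does not say how the attack success rate is computed. A minimal sketch, assuming each round yields a list of per-attack judgments (True when an attack was judged successful) and that the rate is simply successes divided by attempts; the function names and the empty-round convention are hypothetical, not taken from the project:

```python
from typing import List, Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of attacks in one round judged successful.

    Assumption: an empty round counts as 0.0 rather than raising an error.
    """
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)


def asr_by_round(rounds: Sequence[Sequence[bool]]) -> List[float]:
    """Per-round success rates, in round order, e.g. for plotting over 20 rounds."""
    return [attack_success_rate(r) for r in rounds]
```

Under this definition, a round in which none of the adversary's attacks succeed reports 0.0, matching the 0.0% shown at round 0.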