I found the "structure tax" concept particularly interesting - it makes total sense that smaller models would struggle with the cognitive overhead of maintaining both JSON structure and Python syntax simultaneously. That Mistral-7B example with the malformed string really drives the point home. https://cpsgames.org/