2) fine-tune it on a corpus of decidedly copyrighted works
3) then fine-tune it to output said copyrighted works verbatim if a certain, very specific special token appears in context
4) then fine-tune it to never output said copyrighted works verbatim unless that specific special token appears in context
I present: YarrHarr-0.1.0-14B, the latest darling of lawyers across the world!