Forgive me a rant here, but I think you might find it interesting. I'll try to keep it short.
I'm not a chatbot or "agent" enthusiast, but I've been keeping an eye on the new model releases for several years now. As a result, I've ended up keeping a private list of prompts to put to new flagship models when they come out, to test them.
A lot of these are question prompts that I tried for one reason or another in the past, often only after using more traditional search engines and reference sites first, to no good results. And a number of them have to do specifically with shoemaking, because, as we've all found, there's relatively little in-depth, in English, and online to be found. Whether it's specialized tool names or rarer style or lastmaking terms, it's easy to prompt an LLM trained on website data right to the limits of the available corpus, and then just watch it puke word salad.
Leading models still fail many of my shoe-related "tests". But there was a strange change in how they went wrong a few years back. All of a sudden, the models I was trying would no longer inevitably produce responses that read like people totally misunderstanding my prompts, failing to grasp that they dealt with shoemaking, or confabulating confidently, like a middle schooler trying to fake it through a test. Responses actually began making sense, or at least seeming responsive, and then veering off into grammatical but incredible nonsense.
It reminded me of Wile E. Coyote running off the cliff. Somehow, he always managed to run a few extra feet straight off the edge.
I mentioned this to some friends, and while there's no being certain about model "behavior", I eventually came to a bizarre realization: shoemaking.wiki had been online long enough to get slurped up by the data scraping gangs, packaged into standard "distributions" of Internet content, and looped inside the training cutoffs for new models. Suddenly, I was prompting models trained on my own "content". But I know when to leave an entry short, mark it "help needed", and stop.
I've occasionally got value from an LLM prompt. @thenewreligion in particular turned me on to the fact that the same data pirates pummeling my websites with content-download requests are also ignoring the old Web conventions for saying bots aren't welcome. Sometimes you can get things in LLM responses that you wouldn't get in search engine results. But overall, it's been a weird hall of mirrors.
There's not a word of LLM-generated text on shoemaking.wiki. And negligibly little of value there came to me by way of any LLM.