Forgive me a rant here, but I think you might find it interesting. I'll try to keep it short.
I'm not a chatbot or "agent" enthusiast, but I've been keeping an eye on the new model releases for several years now. As a result, I've ended up keeping a private list of prompts to put to new flagship models when they come out, to test them.
A lot of these are questions I tried at one point or another in the past, often only after exhausting more traditional search engines and reference sites, with no good results. And a number of them have to do specifically with shoemaking, because, as we've all found, there's relatively little in-depth material on the subject to be found online in English. Whether it's specialized tool names or rarer style or lastmaking terms, it's easy to prompt an LLM trained on web data right to the limits of the available corpus, and then just watch it puke word salad.
Leading models still fail many of my shoe-related "tests". But there was a strange change in how they went wrong a few years back. All of a sudden, the models I tried no longer invariably produced responses that read like someone totally misunderstanding my prompts, failing to grasp that they dealt with shoemaking, or confabulating confidently, like a middle schooler trying to fake it through a test. Responses actually began making sense, or at least seeming responsive, and then veering off into grammatical but unbelievable nonsense.
It reminded me of Wile E. Coyote running off the cliff. Somehow, he always managed to run a few extra feet straight off the edge.
I mentioned this to some friends, and while there's no being certain about model "behavior", I eventually came to a bizarre realization: shoemaking.wiki had been online long enough to get slurped up by the data-scraping gangs, packaged into standard "distributions" of Internet content, and swept inside the training cutoffs for new models. Suddenly, I was prompting models trained on my own "content". But I know when to leave an entry short, mark it "help needed", and stop.
I've occasionally gotten value from an LLM prompt. @thenewreligion in particular turned me on to the fact that the same data pirates pummeling my websites with content-download requests are also ignoring the old Web conventions for telling bots they aren't welcome. Sometimes you can get things in LLM responses that you wouldn't get in search engine results. But overall, it's been a weird hall of mirrors.
There's not a word of LLM-generated text on shoemaking.wiki. And vanishingly little of value there came to me by way of any LLM.