“Deepfakes Can Be Detected by Borrowing a Method From Astronomy,” n.d. https://petapixel.com/2024/07/19/deepfakes-can-be-detected-by-borrowing-a-method-from-astronomy/
“Please Stop Externalizing Your Costs Directly into My Face,” n.d. https://drewdevault.com/2025/03/17/2025-03-17-Stop-externalizing-your-costs-on-me.html
tl;dr: LLM crawlers circumvent every restriction sysadmins put in place, so admins have to fight this abuse every day instead of focusing on real work.
People who train LLMs violate the ToS of various open source git platforms.
Their crawlers literally do not respect robots.txt, and they do not care about the cost of maintaining open source code-sharing platforms. They crawl the most expensive endpoints just to feed their models a bit more data. Pathetic, disgusting, and unfair.
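For contrast, here is a minimal sketch of what respecting robots.txt looks like on the crawler side, using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, the `git.example.org` host, and the helper names `build_parser` and `may_fetch` are all made up for illustration; they are not taken from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a git hosting site. The disallowed paths stand in
# for expensive endpoints (blame, log) and are illustrative assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blame/
Disallow: /log/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A polite crawler runs this check before every request and skips the URL on False.
for url in (
    "https://git.example.org/README.md",
    "https://git.example.org/blame/main/README.md",
):
    print(url, may_fetch(parser, "ExampleBot", url))

# The requested pause between requests, if the site declares one.
print("crawl delay:", parser.crawl_delay("ExampleBot"))
```

The check costs a crawler almost nothing, which is the point: ignoring it is a choice, not a technical limitation.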
Quote from the author: "If you personally work on developing LLMs et al, know this: I will never work with you again, and I will remember which side you picked when the bubble bursts."