{"id":23397,"date":"2026-05-01T15:41:00","date_gmt":"2026-05-01T15:41:00","guid":{"rendered":"https:\/\/sandbox.hbmadvisory.com\/amplify\/news-publishers-restrict-ai-access-to-internet-archive-amid-copyright-disputes\/"},"modified":"2026-05-01T15:57:33","modified_gmt":"2026-05-01T15:57:33","slug":"news-publishers-restrict-ai-access-to-internet-archive-amid-copyright-disputes","status":"publish","type":"post","link":"https:\/\/sandbox.hbmadvisory.com\/amplify\/news-publishers-restrict-ai-access-to-internet-archive-amid-copyright-disputes\/","title":{"rendered":"News publishers restrict AI access to Internet Archive amid copyright disputes"},"content":{"rendered":"<p><\/p>\n<div>\n<p>About 245 news organisations across nine countries are moving to block AI firms from mining the Internet Archive&#8217;s vast web history, highlighting a growing clash over digital preservation and copyright as AI training raises new legal questions.<\/p>\n<\/div>\n<div>\n<p>News publishers are drawing a line around the Internet Archive as they try to stop AI firms from mining old web pages for training data, turning a long-standing preservation tool into an unexpected front in the copyright fight. Euronews reported that about 245 news organisations in nine countries are now seeking to block at least one of the Archive\u2019s crawlers, with many of the affected sites belonging to major publishers including USA Today\u2019s parent company. The concern is no longer just search or storage, but whether archived journalism is being repurposed without permission or payment.<\/p>\n<p>The scale of the Archive explains why the issue has become so sensitive. With more than a trillion web pages saved since 1996, the Wayback Machine has become a crucial record of disappearing or altered online material, including reporting from outlets such as CNN, The New York Times, The Guardian and USA Today. For historians, lawyers and editors, it can provide proof of what was published and when. 
For AI companies, the same trove offers structured, dated text and images that are attractive for training large language models.<\/p>\n<p>That tension is now feeding into a wider legal and commercial struggle over journalism and artificial intelligence. Reuters has reported in recent months that major publishers, including The New York Times, are pursuing AI companies over copyright and licensing, while The Atlantic has noted that courts are still defining how copyright applies to AI-generated and AI-assisted work. In that environment, publishers see archived copies not as neutral history, but as another possible route for systems to ingest their work at scale.<\/p>\n<p>The Internet Archive insists it is caught in the middle. Mark Graham, director of the Wayback Machine, has argued that the real problem is AI companies using archive interfaces as a shortcut to content they did not create, and the Archive has itself tried, in some cases, to curb large downloads and automated extraction. At the same time, it says preservation remains essential, because pages can be edited, removed or quietly rewritten after publication. 
Some publishers, including The Guardian, have opted for tighter limits rather than complete blocks, while digital rights campaigners and journalists are pushing back against broad restrictions that could erase pieces of the web\u2019s public memory.<\/p>\n<h3>Source Reference Map<\/h3>\n<p><strong>Inspired by headline at:<\/strong> <sup><a target=\"_blank\" rel=\"nofollow noopener noreferrer\" href=\"https:\/\/www.euronews.com\/next\/2026\/05\/01\/why-news-publishers-are-blocking-ai-from-accessing-internet-archives\">[1]<\/a><\/sup><\/p>\n<p><strong>Sources by paragraph:<\/strong><\/p>\n<p>Source: <a target=\"_blank\" rel=\"nofollow noopener noreferrer\" href=\"https:\/\/www.noahwire.com\">Noah Wire Services<\/a><\/p>\n<\/p><\/div>\n<div>\n<h3 class=\"mt-0\">Noah Fact Check Pro<\/h3>\n<p class=\"text-sm sans\">The draft above was created using the information available at the time the story first<br \/>\n        emerged. We\u2019ve since applied our fact-checking process to the final narrative, based on the criteria listed<br \/>\n        below. The results are intended to help you assess the credibility of the piece and highlight any areas that may<br \/>\n        warrant further investigation.<\/p>\n<h3 class=\"mt-3 mb-1 font-semibold text-base\">Freshness check<\/h3>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Score:<br \/>\n        <\/span>10<\/p>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Notes:<br \/>\n        <\/span>The article was published on 1 May 2026, making it highly current. No evidence of recycled content was found.<\/p>\n<h3 class=\"mt-3 mb-1 font-semibold text-base\">Quotes check<\/h3>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Score:<br \/>\n        <\/span>8<\/p>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Notes:<br \/>\n        <\/span>Direct quotes from Graham James of The New York Times and Mark Graham of the Internet Archive are used. 
While these quotes are not independently verifiable online, they are attributed to reputable sources, suggesting authenticity. However, the lack of direct online verification lowers the score.<\/p>\n<h3 class=\"mt-3 mb-1 font-semibold text-base\">Source reliability<\/h3>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Score:<br \/>\n        <\/span>9<\/p>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Notes:<br \/>\n        <\/span>Euronews is a well-established news organisation, lending credibility to the article. The article also references other reputable sources like Bloomberg and The Next Web, enhancing its reliability.<\/p>\n<h3 class=\"mt-3 mb-1 font-semibold text-base\">Plausibility check<\/h3>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Score:<br \/>\n        <\/span>9<\/p>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Notes:<br \/>\n    <\/span>The claims about news publishers blocking AI access to the Internet Archive align with recent reports from other reputable outlets. The article provides specific examples and details, supporting the plausibility of the claims.<\/p>\n<h3 class=\"mt-3 mb-1 font-semibold text-base\">Overall assessment<\/h3>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Verdict<\/span> (FAIL, OPEN, PASS): <span class=\"font-bold\">PASS<\/span><\/p>\n<p class=\"text-sm pt-0 sans\"><span class=\"font-bold\">Confidence<\/span> (LOW, MEDIUM, HIGH): <span class=\"font-bold\">HIGH<\/span><\/p>\n<p class=\"text-sm mb-3 pt-0 sans\"><span class=\"font-bold\">Summary:<br \/>\n        <\/span>The article is current, well-sourced, and presents plausible claims supported by reputable sources. 
The main concern is the lack of direct online verification for some quotes, but overall, the content meets verification standards.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>About 245 news organisations across nine countries are moving to block AI firms from mining the Internet Archive&#8217;s vast web history, highlighting a growing clash over digital preservation and copyright as AI training raises new legal questions. News publishers are drawing a line around the Internet Archive as they try to stop AI firms from mining<\/p>\n","protected":false},"author":1,"featured_media":23398,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[40],"tags":[],"class_list":{"0":"post-23397","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-london-news"},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/posts\/23397","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/comments?post=23397"}],"version-history":[{"count":1,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/posts\/23397\/revisions"}],"predecessor-version":[{"id":23399,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/posts\/23397\/revisions\/23399"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/media\/23398"}],"wp:attachment":[{"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/media?parent=23397"}],"wp:term":[{"taxonomy
":"category","embeddable":true,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/categories?post=23397"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sandbox.hbmadvisory.com\/amplify\/wp-json\/wp\/v2\/tags?post=23397"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}