Freelancer has accused Anthropic, the AI startup behind the Claude large language models, of ignoring its "do not crawl" robots.txt protocol to scrape data from its websites. Meanwhile, iFixit CEO Kyle Wiens said Anthropic ignored the website's policy forbidding the use of its content to train AI models. Freelancer CEO Matt Barrie told The Information that Anthropic's crawler bots are "the most aggressive scrapers ever": his website reportedly received 3.5 million hits from the company's crawler within four hours, "probably about five times the number of hits from the next closest AI crawler." Similarly, Wiens posted on X/Twitter that Anthropic's bots hit iFixit's servers a million times in 24 hours. "Not only are you getting our content without paying for it, you're tying up our development resources," he wrote.
In June, Wired accused another AI company, Perplexity, of crawling its website despite the presence of a Robots Exclusion Protocol (robots.txt) file. Robots.txt files typically contain instructions telling web crawlers which pages they can and cannot access; compliance is voluntary, and bad bots mostly ignore them. After Wired's article was published, TollBit, a startup that connects AI companies with content publishers, reported that Perplexity isn't the only company evading robots.txt signals. It didn't name names, but Business Insider said it had learned that OpenAI and Anthropic were also ignoring the protocol.
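Because compliance is voluntary, enforcement rests entirely with the crawler. As a rough illustration, here is a minimal Python sketch, using the standard library's urllib.robotparser, of the check a well-behaved crawler performs before fetching a page (the bot name and site URL are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity and target site; real crawlers send
# their user-agent token with every request they make.
BOT_NAME = "ExampleBot"
SITE = "https://example.com"

# Download and parse the site's robots.txt.
rp = RobotFileParser()
rp.set_url(f"{SITE}/robots.txt")
rp.read()

# A polite crawler checks each URL against the parsed rules and simply
# skips anything disallowed -- nothing on the server forces this.
for path in ("/", "/private/data.html"):
    if rp.can_fetch(BOT_NAME, f"{SITE}{path}"):
        print(f"allowed: {path}")  # a real bot would fetch the page here
    else:
        print(f"disallowed by robots.txt, skipping: {path}")
```

Nothing in this flow stops a crawler from skipping the check altogether, which is exactly the behavior the publishers are alleging.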
Barrie said Freelancer initially tried refusing the bot's access requests, but eventually had to block Anthropic's crawlers altogether. "This is nasty scraping [which] slows down the site for everyone using it, and ultimately affects our revenue," he said. As for iFixit, the site said it sets alarms for high traffic and that Anthropic's activity woke staff up at 3AM. The company's crawlers stopped scraping iFixit after it added a line to its robots.txt file that specifically bans Anthropic's bots.
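A per-crawler ban like the one iFixit describes takes only a couple of lines in robots.txt. A minimal sketch follows; it assumes the "ClaudeBot" user-agent token that Anthropic documents for its crawler, and the exact directives iFixit used are not public:

```
# Hypothetical robots.txt entry banning one crawler from the whole site
User-agent: ClaudeBot
Disallow: /
```

Again, this only works if the crawler in question chooses to honor it.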
Anthropic told The Information that it respects robots.txt and that its crawlers "respected that signal" when iFixit implemented it. The company also said it aims for minimal disruption by being thoughtful about how quickly [it crawls] the same domains, and that it is now investigating the incident.
AI companies use crawlers to collect content from websites to train their generative AI technologies. Publishers have accused them of copyright infringement as a result, and they have been the target of multiple lawsuits. To head off further suits, companies like OpenAI have signed deals with publishers and websites. OpenAI's content partners so far include News Corp, Vox Media, the Financial Times, and Reddit. iFixit's Wiens seems open to a deal for the site's how-to repair articles, saying in a tweet directed at Anthropic that he's willing to discuss licensing the content for commercial use.
If any of those requests had accessed our Terms of Use, they would have told you that use of our content is expressly forbidden. But don't ask me, ask Claude!
If you want to discuss licensing any of our content for commercial use, we're right here. pic.twitter.com/CAkOQDnLjD
— Kyle Wiens (@kwiens) July 24, 2024