Google's AI search feature is plagiarizing news sites' content – Quartz


Earlier this year, as part of its experiments with artificial intelligence, Google released a new search feature that provides an AI-generated overview of search results. The idea is to get users to their answers faster, without needing to leave the search results page. Google says the AI-generated digests use key points from news articles that are not behind a paywall. Critics say the summaries amount to theft, and could incentivize media organizations to put more of their work behind paywalls.

Consider a search result for a query about the best movies of 2022, which, as a user on Twitter recently pointed out, contained unattributed language that was remarkably similar to a writeup on the film review site RogerEbert.com.

The original article on the best movies of last year summarized the movie Everything Everywhere All At Once, which won the Academy Award for best picture, as “an example of the rarest of all: a film where you don’t know what will happen from one shot to the next.”

Google’s response to a search query about the “best movies 2022” put it thusly, without citing RogerEbert.com: “This movie is a rare example of a film where you don’t know what will happen from one shot to the next.”

Screenshot of the query “best movies 2022” on Google’s AI search feature.
Screenshot: Courtesy of Google

Quartz has reached out to Google for a comment.

What’s at stake for media companies

Generative AI systems are trained on massive amounts of data from the internet, including news articles, ostensibly so they can produce content natural enough to read as though a human wrote it.

Media companies are worried that the use of generative AI tools will harm their credibility and take away potential revenue. As a result, news organizations including the Associated Press are now seeking payment from AI companies that train language models on their content.

Generative AI models are not perfect

Tech companies are starting to acknowledge that they need to work with publishers on how news content is used. “As always, we’ll use this time in Labs [for early-stage Google search experiences] to gather feedback and learn what works best for both publishers and users as we evolve this experiment over time,” Google wrote in a blog post on Aug. 15. These new tools, which are prone to errors known as hallucinations, also need real-world user feedback to improve.

One encouraging sign for media companies: If you search “best movies 2022” on Google’s Bard, the rival to ChatGPT, it does appear to provide attribution, in this case to the film site Rotten Tomatoes. (At least for now, searching on Bard is different from querying Google’s main search engine, and the results can vary from one search platform to another.)

Screenshot of the query “best movies 2022” on Google’s Bard.
Screenshot: Courtesy of Google

That’s good news for websites concerned about proper attribution. But it won’t solve the potential challenge of losing clicks, and related ad revenue, to Google.


