In January, amid the early exuberance of generative AI’s watershed debut on the world stage, a lawyer representing a trio of artists walked into a San Francisco courthouse to file a class-action lawsuit against Stability AI and Midjourney, makers of the hugely popular AI art generators Stable Diffusion and Midjourney, respectively. The suit, which also names online art portfolio DeviantArt, alleges the defendants used billions of copyrighted images to train their AI models without obtaining consent from the original artists. It seeks billions of dollars in damages.
The case is currently winding its way through the US legal system, but one fact is already undeniable: The three artists were only the tip of the iceberg. In the months since, companies like OpenAI and Google have been sued by plaintiffs including comedian Sarah Silverman over their data scraping practices. The rash of copyright lawsuits aimed squarely at the hottest sector in tech raises a decidedly old-world question: What is the value of intellectual property? And who should get paid?
To be fair, it’s not intuitively obvious that this would be the question at hand. Sitting down for the first time with a tool like ChatGPT can feel like catching a glimpse of the future – a primordial first step toward the near-omniscient computer from “Star Trek.” It’s easy to walk away with the sense that these generative tools are, well, generating something original. But peel back AI’s glossy veneer and you’ll find that at their core, products like ChatGPT and Google Bard are powered not by some inscrutable algorithmic alchemy out of science fiction, but by a staggering corpus of human-generated content and knowledge.
Make no mistake: That content is the fuel driving the stratospheric rise of generative AI. And so far, virtually no one is being paid for it. That’s a grave problem. With the notable exception of Shutterstock, no major tech company to date has announced substantive plans to compensate content creators when their work is used to train an AI model. It’s not because AI firms don’t consider content valuable. The very tech giants apathetic to the plight of copyright holders have themselves expressly prohibited using content generated by their services to train competing machine learning models.
Clearly, this situation is untenable, with a raft of dire consequences already beginning to emerge. Should the courts determine that generative AI firms aren’t protected by the fair use doctrine (a likely outcome), the still-budding industry could be on the hook for practically limitless damages. Meanwhile, platforms like Reddit are beginning to aggressively push back against unchecked data scraping. Recently, the company announced a drastic increase to its API pricing, which has had the unfortunate side effect of wiping out a rich ecosystem of third-party apps such as Apollo and BaconReader.
These sorts of unintended externalities will only continue to multiply unless strong measures are taken to protect copyright holders. Government can play an important role here by introducing new legislation to bring IP laws into the 21st century, replacing outdated regulatory frameworks created decades before anyone could have predicted the rise of generative AI. Government can also spur the creation of a centralized licensing body to work with national and international rights organizations to ensure that artists, content creators, and publishers are being fairly compensated for the use of their content by generative AI companies.
With so much volatility and uncertainty surrounding AI, tech firms have a vested interest in proactively establishing a compensation structure rather than passively waiting for government to impose legislation. By taking meaningful steps toward supporting creators and publishers, AI companies can demonstrate a commitment to ethical practices and bolster their corporate reputation. Companies can also pioneer novel models for intellectual property rights management that could in turn spur future innovation. Most importantly, by ensuring fair compensation, tech firms support the vibrant content economy upon which their success is built.
That last point is key. In our excitement to embrace the limitless possibilities of generative AI, we must not forget that a thriving content ecosystem, particularly a flourishing news industry, is the bedrock that underpins large language models like ChatGPT. AI companies, like the one I lead, have a responsibility to our news partners and to content creators to safeguard their IP and enable them to monetize it fairly and securely. It isn’t simply the right thing to do, it’s how we’re able to ensure a reliable, hallucination-free user experience.
While it’s easy to look at the coming IP reckoning as a looming crisis, we can just as easily reframe it as a once-in-a-generation opportunity. We are very likely standing at the dawn of the greatest shakeup in human civilization since the Industrial Revolution. Let’s not be afraid to imagine a future where human creators and AI alike can thrive.