Gary Marcus: We're suffering through the Napster era of AI

The author of the new book "Taming Silicon Valley" says tech companies should compensate artists, writers, publishers and other creators who are being ripped off to fuel the AI economy.  

Technologist, author and entrepreneur Gary Marcus spoke at the World Economic Forum Annual Meeting in Davos, Switzerland, on Jan. 18, 2024. Photo courtesy of the World Economic Forum/Faruk Pinjo/Flickr/CC

By Christoph Drösser | Contributor

Gary Marcus is an expert in human cognition. He’s studied how the brain learns and functions, and how it perceives the world. He’s also an expert in artificial intelligence and was once the director of AI at Uber, where he oversaw the integration of AI research into self-driving cars. 

Marcus started thinking about how people interact with intelligent systems long before the commercial arrival of ChatGPT and its ilk. He came out of MIT with a PhD in brain and cognitive science after studying under Noam Chomsky and Steven Pinker. Like so many other bright talents who emerged from MIT in the early 1990s, he soon found his way into tech startups. Uber acquired one of his, called Geometric Intelligence, in 2016, and Marcus stayed at the company for a short stint. He launched his next startup, Robust.AI, with Rodney Brooks of iRobot fame. 

Even though Marcus has been a fixture in the institutions and the tech ecosystem that are fueling the current era of artificial intelligence, he’s also been a loud critic of them, and has never subscribed to the tech boosterism that pervades Silicon Valley. His searing criticism has appeared in The New York Times, The New Yorker and Wired. His 2019 book "Rebooting AI: Building Artificial Intelligence We Can Trust" critically examines the state of artificial intelligence development. 

Marcus’s latest book, "Taming Silicon Valley: How We Can Ensure That AI Works for Us," published in September, continues his critique of the unchecked growth and influence of the tech industry and its development of artificial intelligence. “We should all be deeply worried about systems that can fluently confabulate, unburdened by reality,” Marcus said at a Senate hearing last spring. 

Compiler sat down with Marcus to discuss his new book and the state of AI. He shares his deep skepticism of Artificial General Intelligence, or the ability of computers to think and process information like humans; his views on how artists should be compensated for intellectual property theft by AI firms; and how European regulation may help tame the AI industry in the U.S. if lawmakers don’t step up to the plate. 

The following interview has been edited for clarity and length.

Are you using generative AI in your personal life and work?

No, I don't like it. I think it's good for brainstorming, but I've been doing what I'm doing for so long, I don't really need the help. It's good for coding, but I'm not coding right now. Most of the other applications are really riddled with problems. I would never write with it. Fundamentally, I don't trust it. I understand how it works, and I know that every single thing it says is a crapshoot. It might be true, it might not. That puts you in the position of fact-checking everything, and for me that's more work. I don't want to begrudge other people, but I would caution them that this stuff really is not reliable. Generative AI speaks with all the authority of an encyclopedia and all the reliability of a Magic 8 Ball.

In "Taming Silicon Valley," you voice your support for artists and organizations suing AI companies for copyright infringement. The AI companies argue that their systems mimic human creativity by learning from books, images and music, just as people do. How do you respond to that claim?

In two different ways. One is that it's typically not that creative. You ask it for a space battle, and you wind up with Darth Vader and lightsabers. It often simply reproduces tropes that aren't really that creative. You can say something like “Italian plumber,” and it gives you the character Mario from Nintendo. These systems don't actually understand creativity and originality. They’re clearly infringing on these characters and ideas and so forth. Real artists avoid doing things that are obviously derivative.

You could argue that most of what people call “creative” in new work is basically exploring the space that is spanned by prior work.

The output that a respectable artist will create will never be identical to a well-known classical reference unless that's the shtick. Plagiarizing whole paragraphs, whole characters, whole scenes—that's not something that a respectable artist or writer will do.

You want artists and writers to be compensated for their work if it's used by these algorithms. But hasn't that ship sailed already? Haven't they already sucked up everything there is?

That's what people said about Napster in the early days, right? People said musicians used to get paid for their work, but information wants to be free, so we're not going to pay them anymore. Then the court said, no, we actually have copyright laws for a reason. That's when streaming was invented, and streaming works around licensing, so [the original model of] Napster was basically forced out of business. There's a perfectly clear solution here: We need to have a licensing regime in which artists, creators, publishers, etc., get compensated for the work that's being ripped off. There's nothing hard about that. The only thing that's hard is that companies like OpenAI have an "information force field" right now, tricking people into doing crazy things. OpenAI went to [lobby] the House of Lords in England and said, we can't build our amazing software unless we get all this copyrighted stuff for free, which is like a land grab. It'd be like me saying I can't get rich in real estate unless Canada gives me a large fraction of Vancouver. Canada doesn't have to give me any free real estate in Vancouver, and we don't have to give any free real estate or intellectual real estate to OpenAI.

Would they still have a business if they had to pay for every single work that they digest?

OpenAI is valued at $100 billion—they can afford to pay the licensing fees! They're backed by Microsoft, which is worth around $3 trillion. The courts are going to compel them to do so. While there’s some ambiguity in the law—at least in U.S. law, which I’m quite familiar with—fair use isn’t about repackaging content for commercial profit. I believe their chances in court are slim, which is why they’re starting to negotiate licensing deals. It’s like a sandwich shop owner saying, “I can’t make a profit unless I get all the ingredients for free.” We’d just laugh.

But the AI companies know what they used, right? Are you saying they should compensate artists before the AI tool is available for general use?

They took all their sources, put them in a giant blender [and] shook them up, and when you get any particular output, you're just scooping from that blender. If I were in charge, I’d say, “Hold on, guys, we’re not using GPT-4 anymore.” Fundamentally, we should insist on compensation. If we don't, we're effectively saying it's okay to steal from artists and writers. Next year, [AI companies will] be logging every keystroke you type in Microsoft Word, and so on. Eventually, it’ll be you, and you’ll have stood on the sidelines, saying it was okay when they did it to artists and writers, but not to you.

Has the technology reached its limits? Is it already getting worse now that it's digesting machine-generated content?

I think we've reached a point of diminishing returns. There was a phenomenal increase when we moved from using a small fraction of the internet as training material for these systems to basically using all of it. But GPT-4, which was trained in August 2022, is still more or less state-of-the-art two years later. 

[Before that,] every six months, people were making breakthroughs, and those breakthroughs were largely driven by using more data. But now, most of the available data has been used up. Algorithms are training off of data from other algorithms. It’s like a snake eating its own tail because not much original data is created each day.

We already know that these systems often hallucinate. For example, in my book, I mention how ChatGPT said I have a pet chicken named Henrietta—but I don’t. It just makes things up, and then other systems train on that false information. This leads to what people are calling “model collapse.” 

So there's no straight line going from generative AI as it is to AGI?

I don't think we're anywhere near Artificial General Intelligence. In 2014, I said that what we really need AI to do is be powerful enough to, for example, watch a movie, understand what's happening, know when you're supposed to laugh and grasp the characters' motives. No AI can actually do that. The number of people in the world who can watch a Hollywood movie and understand when to laugh is roughly 7 billion.

When OpenAI published GPT-2, they had this commission of independent researchers looking into the technology’s potential consequences. And they actually had some very serious concerns about it, but OpenAI released it anyway. With GPT-3 and 4, they didn’t do that. Why?

With GPT-4, [OpenAI] did almost everything right in terms of identifying the problems—they conducted a thorough self-investigation and wrote a report outlining potential issues. But then they failed to do what they should have done: acknowledge that they hadn’t made a solid case that the benefits for society outweigh the risks, and therefore, they should wait. 

They wrote a lengthy paper identifying about 15 serious risks—risks that could potentially destroy society—and yet, they released it anyway, without any real plan to mitigate those risks or a clear justification for [taking the risks]. 

In the U.S., the FDA evaluates drugs, and companies must show that the benefits outweigh the risks before they can release a drug. Now, they’re allowing the U.S. government to do pre-deployment testing on what I assume is GPT-5, but I haven’t seen anything to suggest that this testing has any real power. The U.S. government might say, “This is dangerous,” and [OpenAI will] release it anyway.

The EU's General Data Protection Regulation [GDPR], though imperfect, has become a de facto global standard because businesses operating in Europe must comply. Could a similar model apply with the EU AI Act, potentially influencing AI regulations worldwide, particularly in the U.S.?

The U.S. has not adopted the GDPR, but companies are forced to comply with it, so many of the GDPR standards are applied outside of Europe. A similar situation could occur with AI regulation, especially if the U.S. abdicates its responsibility. I believe it should be the U.S. government's duty to establish a rational set of laws that both foster innovation and ensure public safety. So far, the U.S. Senate has not truly stepped up to this challenge. If the U.S. does nothing, the EU AI Act could become the de facto standard.

Instead of leaving it to those big companies, should we have a government-funded moonshot AI program? 

I have talked about having something like a CERN [European Organization for Nuclear Research] for AI, which might focus on AI safety. In some industries, we know how to make reliable [products], usually only in narrow domains. One example is bridges: You can't guarantee that a bridge will never fall down, but you can say that, unless there’s an earthquake of a certain magnitude that only happens once every century, we're confident the bridge will still stand. Our bridges don't fall down often anymore. 

But for AI, we can’t do that at all as an engineering practice—it’s like alchemy. There’s no guarantee that any of it works. So, you could imagine an international consortium trying to either fix the current systems, which I think, in historical perspective, will seem mediocre, or build something better that does offer those guarantees.

Many of the big technologies we have around, from the internet to spaceships, were originally government-funded; it's a myth that American innovation comes only from the free market.

It's always been a mix. There are no perfect answers here, but the government is being very passive regarding AI. Most governments around the world are essentially leaving all the development and decision-making to corporations whose interests don’t necessarily align with the public interest. It wouldn’t surprise me if OpenAI became a surveillance company. They just bought a webcam company, and they want to upload all your files and data. Is that really what we want AI to do—make Big Brother a reality? If the government doesn’t actively fund alternative approaches, I think that's where we’ll end up, and I don’t think that's good.

Christoph Drösser is a freelance journalist living in San Francisco. He was a longtime science and technology editor and reporter for the German weekly Die Zeit and has written several books on algorithms and generative AI.