In a world where AI is becoming more and more prevalent, how do we protect the rights of original, human ideas? Today, Zack talks with author, designer, programmer, and lawyer Matthew Butterick about his work in copyright infringement and artificial intelligence.
Links from the episode:
If today's podcast resonates with you and you haven't read The Small Firm Roadmap Revisited yet, get the first chapter right now for free! Looking for help beyond the book? Check out our coaching community to see if it's right for you.
- What is generative AI?
- Inspecting AI products
- Copyright and AI
Welcome to The Lawyerist Podcast, a series of discussions with entrepreneurs and innovators about building a successful law practice in today’s challenging and constantly changing legal market. Lawyerist supports attorneys building client-centered and future-oriented small law firms through community, content, and coaching, both online and through the Lawyerist Lab. And now from the team that brought you The Small Firm Roadmap and your podcast hosts
Ashley Steckler (00:35):
Hi, I’m Ashley Steckler.
Zack Glaser (00:36):
And I’m Zack. And this is episode 437 of the Lawyerist Podcast, part of the Legal Talk Network. Today I interview Matthew Butterick about artificial intelligence in the law.
Ashley Steckler (00:48):
Today’s podcast is brought to you by Posh Virtual Receptionists, Clio, and Gavel. We wouldn’t be able to do this show without their support. Stay tuned and we’ll tell you more about them later on. So Zack, we have been using OpenAI for a lot of interesting endeavors in the office, as we always do. We like to experiment. We like to stay up on information and make sure that we’re forward-thinking in how we’re going to use it, but also experiment and test to make sure that it’s working for what we want it to do. We spend very little time playing around in ways that don’t make things easier for us
Zack Glaser (01:30):
On company time,
Ashley Steckler (01:31):
On company time, yes, for sure on company time. We need to make sure that it’s actually making our jobs more efficient with better output. So we’ve been using OpenAI and ChatGPT for a number of different things, and I know that you’ve been playing around. What’s your favorite thing that you’ve used it for so far that’s helped us speed up our systems?
Zack Glaser (02:02):
So I ask it a lot of questions. I have a tendency to use OpenAI’s ChatGPT to outline things. I usually have it take a first pass at anything content related that I do. So if I’m writing an article, even one comparing very specific products, I’ll have ChatGPT take a swipe at it. Then my job, instead of being somebody writing something out of whole cloth, is to edit it and to fact check it as the expert looking at this thing and saying, is that right? Because ChatGPT will have a tendency to hallucinate sometimes. It honestly doesn’t care about validation; it doesn’t validate, and it doesn’t care about the veracity of the statements that it creates, because it kind of can’t. So we, as users of it, have to worry about whether or not something is correct. And so I tend to let it take a first swipe at things, and it saves me a lot of time, because a lot of my time is spent outlining things, as a lot of lawyers do.
Ashley Steckler (03:15):
For sure. So another thing that we noticed this week, just playing around with ChatGPT, is you asked it a question about Lawyerist Lab and we got some interesting output.
Zack Glaser (03:31):
Yeah, so I specifically asked it, is Lawyerist Lab beneficial for a small firm lawyer? And I really wanted to know if it could reason in any way, or if it could even feign reasoning in any way. And the output that we got, it honestly doesn’t matter what it is for our purposes. We got good output, but more importantly, the output is telling for us, because ChatGPT is not going out there and just creating information out of nothing. It’s not like a manual. It’s not just coming up with new ideas. It is synthesizing information, for the most part, and then creating things. So the answer to that question reflects the info that is out there on Lawyerist Lab. The answer itself isn’t so much important as what the answer tells us: it tells us what is out there about Lawyerist Lab. And I think that lawyers could use this for their own purposes, with their own firms, with their own businesses, to say, what is it that my firm looks like to the public? And does that jibe with what I want it to look like? Is that on brand?
Ashley Steckler (05:05):
For sure. Yeah. So I think the interesting takeaway for us was that the output ChatGPT gave was consistent with how we conceptualize our ideal client and the benefits that the program provides to that ideal client. And so lawyers can think about that, merging these two pathways together that we’ve been talking about, and say, can I use it for an outline to start out a blog post or other piece of content that I want to put on my website to make it useful for potential clients? But then also use it to test whether or not the stuff that’s already out there about your firm is directed at your ideal client, speaks to them, is in your voice, has your brand, and is consistent.
Zack Glaser (05:55):
It’s another way of doing SWOT testing or opposition research or figuring out what it is that your company is putting forward. And as awkward as it sometimes is to do your own searches on your own business or your own name, it’s imperative. It’s always telling. And maybe this is a less awkward way of doing it. Maybe asking, is Lawyerist Lab worth it to a small law firm, is more comfortable than going and just Googling that. Or do we say Binging that?
Ashley Steckler (06:30):
I don’t think so.
Zack Glaser (06:31):
DuckDuckGo-ing, that’s what I usually do.
Ashley Steckler (06:34):
Yes. Right. You call it Googling, but you actually DuckDuckGo it.
Zack Glaser (06:38):
Yeah. Oh yeah.
Ashley Steckler (06:40):
So there’s some ideas, practical implementations to consider when we’re all talking about ways we can and can’t use this type of technology. There’s some really easy solutions here,
Zack Glaser (06:55):
And I think that’s a really good way of saying lawyers can use this and it’s not practicing law. Like, no, we just saw that ChatGPT-4 got 90% on the bar exam recently, and that has implications. But these are things that you can do tomorrow that don’t have any implications like that, and you can deal with it.
Ashley Steckler (07:17):
Yeah, that’ll help you and your potential clients. Now here’s Zack’s conversation with Matthew.
Matthew Butterick (07:27):
Hi, I’m Matthew Butterick, and I am a writer, designer, programmer, and lawyer. I am the author of the book Typography for Lawyers, and recently plaintiffs’ counsel on two pending lawsuits challenging generative AI systems, namely a lawsuit challenging GitHub Copilot and a lawsuit challenging the image generator Stable Diffusion.
Zack Glaser (07:49):
Matthew, thank you for joining me. I really appreciate it. Quite honestly, when I started talking on this podcast, I thought I might speak with you, but I had no idea that it was going to be AI related. I’ve had your book Typography for Lawyers on my bookshelf since I got out of law school in 2011. So it is odd to not really be talking to you about that, but just as a little plug, it is a fantastic book, and I think we’ve had Sam talk with you about that and typography and design for lawyers before. So I appreciate you being on the podcast again.
Matthew Butterick (08:24):
Of course, my pleasure. And Lawyerist has been a friend of typography for many years, so thank you to all of you. When I started that project, people would say, do you really expect this Typography for Lawyers to take effect? And I said, well, it’s going to be a long-term process, right? Because students are going to be reading it now, and then in 10, 15, 20 years, they’re going to be gaining positions of power and be able to really inflict these rules. So I feel like we’re just reaching the really good part of the typography project in the next 10 years. So
Zack Glaser (08:57):
When we bring everything out of the Century Schoolbook font family and start using actual font families in our pleadings.
Matthew Butterick (09:06):
Exactly. The fruits of my labor are finally being realized. So yes.
Zack Glaser (09:10):
Well, I really would love to chat with you about that for a while. But what we really wanted to talk about was the generative AI suits that you and your co-counsel have going on against a large swath of different types of generative AI. And what I think might be helpful initially is, if you wouldn’t mind, describing what generative AI really is. What is it these things are doing in common that has caused the issues, broadly, I guess?
Matthew Butterick (09:43):
Sure. Well, the first case that I was involved with, which was filed back in November, was the case against Microsoft and OpenAI for this product, GitHub Copilot. Copilot is an AI-assisted, I guess you could call it, software editing tool. What you do is you open up your software code editor and you type some code, like a function that you might want to write. Say I’m writing a function to test if a number is prime. What Copilot will do is suggest the rest of the code that you might have written. So as soon as it sees, oh, Butterick is trying to write a function to test primality, it says, here’s the body of a function that might work. And then you can accept or reject that suggestion. So essentially this idea of generative AI, in the Copilot case, is that it’s generating software code that you can incorporate into your own document.
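[Editor’s illustration, not part of the recording: the workflow described above can be sketched in Python. This is a hand-written, hypothetical example. It assumes the programmer types only the function signature and docstring; the name `is_prime` and the body are our own and are not actual Copilot output or code from the complaint.]

```python
def is_prime(n: int) -> bool:
    """Return True if n is a prime number."""
    # The programmer would type only the two lines above. Everything below
    # stands in for the kind of body a completion tool might propose, which
    # the programmer can then accept or reject.
    if n < 2:
        return False  # 0, 1, and negative numbers are not prime
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False  # even numbers greater than 2 are not prime
    i = 3
    while i * i <= n:  # only odd divisors up to the square root need checking
        if n % i == 0:
            return False
        i += 2
    return True
```

[Calling `is_prime(97)` on this sketch returns `True`, and `is_prime(100)` returns `False`.]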
And in the case of Stable Diffusion, that’s probably a much more popular product, because software’s a little nerdy, but a lot of people have used the image generators. It’s been amazing to see the really global attention, not to the case itself, but just to Stable Diffusion, because there’s just millions and millions of people using it, and to the controversies around it, which, again, the lawyers didn’t start. Stable Diffusion, for your listeners who haven’t heard, is an AI image generator. You step up and you put in what’s called a text prompt, a line of text, maybe two sentences or so, describing what you want to see. For instance, in our complaint, we talk about a puppy dog wearing a baseball cap, eating an ice cream cone or something. And then the AI system tries to generate an image that matches that prompt, and sometimes it does well, sometimes it doesn’t do so well, but there you go.
That’s what makes it fun for people. In both cases, however, what we are claiming is that the link between what comes out of the system and what goes into the system is the real issue, because GitHub Copilot has been trained on billions of lines of open source code, most of which is available on the GitHub website. And similarly, the Stable Diffusion AI system has been trained on billions of images found out on the internet, many of which are copyrighted. So there are some pretty difficult, interesting, and novel questions about whether the way this training material is being used really complies with the law. And that’s what these cases are setting out to discover. As far as we know, these are the first two lawsuits about generative AI in the United States, in the world, anywhere in the cosmos.
Zack Glaser (12:31):
They’re certainly the first two that I found.
Matthew Butterick (12:33):
What if other alien species have already invented generative AI and already done these cases, so they’ve already worked it out? It’d be cool to find out.
Zack Glaser (12:40):
If we had jurisdiction.
Zack Glaser (12:42):
Yeah, if we could just get into separate jurisdictions. So I think, to back up a little bit, some of the guts of these complaints are that these AI models have to be trained on something, like you said, billions of images or billions of lines of code, and this code or these images have to come from somewhere. So for the example with Copilot, it comes from the GitHub repository. And just to explain that, what exactly is the GitHub repository, for people that don’t have familiarity with it? Or actually, it might be easier to explain what the DeviantArt repository is for the image case instead.
Matthew Butterick (13:28):
DeviantArt is a website that’s been around, as I understand it, about 20 years, and has been a community of visual artists, a place where they could make their own art and put it up and share it with fans, other artists, and so forth. In fact, it’s somewhat similar to GitHub, which was started, not 20 years, but I think 10 or 15 years ago, and became sort of a crossroads of the global open source community. Software programmers would put up, the word you used, repositories; a repository sort of represents a single project on GitHub where you might store your open source code. So both GitHub and DeviantArt were these communities where people would share their work. But the thing is, on GitHub, every repository, or a lot of them, are covered by an open source license, and an open source license is not some sort of joke.
It’s a real, legally binding contract. They’ve been litigated for many years. They’re the real deal. And what sort of trips people up is they hear open source software and think, isn’t that the software that’s free, and that you can get a copy of and modify? That’s true, but you do have to follow the provisions in the license when you use that open source code. For instance, some of the provisions in the license are that you have to preserve attribution to the original author of the code. You have to preserve the author’s copyright notices. You often have to include a copy of the license from that code. So if you don’t do these things, you are not entitled to use the code at all. It’s almost as if you’re just pirating any other software. So that’s been a big issue, because with the Copilot service, from really the first week it was launched in a beta format, people were noticing that it would emit big chunks of recognizable code from open source repositories, and it wouldn’t carry the licenses that code had, or it would emit an incorrect license.
So there started to be questions from the programmer community right away. And again, a similar thing with DeviantArt and artists worldwide seeing these AI generators. In Stable Diffusion’s case, they use this dataset called LAION, which is a list of URLs developed by a German consortium. The point is that we know exactly what’s in the training set, because that’s open to inspect. And artists went into this training set and found out, oh, my art’s in there. I’m part of the training set. Well, nobody asked my permission, and again, what’s happening to this art? How is it being emitted on the other side? So the AI companies so far have sort of stepped up with these services and said, well, we think we’re entitled to do this under fair use, which is something we hear a lot about. And I think that’s going to be a big question in these cases: whether fair use really does allow them to do what they’re doing.
Zack Glaser (16:22):
That’s ultimately at least one portion of these matters. What it comes down to is that idea of fair use, because it does have a little bit of the idea of sampling in hip hop, where you sample some music that is already there and bring it into something that you are doing. Because, as you guys have said in these complaints, the system is not really creating anything out of whole cloth. It’s going into these photographs or going into these repositories and grabbing pieces of them and putting them together somehow, right?
Matthew Butterick (16:58):
Well, certainly that’s our claim in these lawsuits, and this is not an opinion that everybody on the internet shares. We know that there are certainly a lot of opinions about what’s really going on under the hood of these systems. And part of the problem is there isn’t a lot of transparency yet. We know about, for instance, the dataset used by Stable Diffusion, but we don’t know everything that they’ve done with that dataset and how they reached the product that they have. And it’s the same thing with Microsoft and OpenAI. They’ve published papers about Copilot and Codex, which is an underlying technology that’s in Copilot. So we know a little bit about it, but we don’t know everything about it. So that’s part of what the litigation is about, obviously: finding out more. And I think that’s going to be really essential to determining the legal ramifications. We need to really inspect these systems close up.
I would say, as a lifelong technologist, one of the frustrating parts of this whole debate has been the people who will say, oh, but what it is is a magic black box, and it’s like little bits of brain sauce floating around. They really say things like this. And you say, no, it’s an artifact of a software program. We can understand it, we can inspect it, we can interrogate it, and we need to look at what happens to that data going in, trace it, see what happens to it. That’s when we’re going to find out what’s really going on. And it’s a necessary step.
Zack Glaser (18:27):
I think that’s a funny or interesting kind of juxtaposition: this idea of open source, and then these things that are using this open source and making themselves into a black box. Open source obviously doesn’t mean free software. It means that we have opened up the source of this software for inspection. And it’s interesting to me that that’s one of the issues here. Is one of the initial solutions in this AI area that we need to make these datasets inspectable? Is that a word? Able to be inspected by somebody, whether it’s publicly inspected or inspected by some sort of agency that is tasked with that?
Matthew Butterick (19:16):
I think that’s a great question. I mean, just in the same way that we have nutritional labels on food so that people can pick it up and see what’s in it, right? I think that there are going to be some interesting questions about whether AI products are going to end up with some kind of similar labeling, not just for fairness and ethics, though those things are also good, but again, for the legality of the situation with Copilot and with the image generators. And now, because we’re on Lawyerist, we can get a little nerdy about the law. Folks, if you want to have a laugh, go and look at the terms of service of these things and see what they say about the legal status of the code that comes out or the images that come out. It’s very nebulous. Really, these companies resist making any claims about, do you own it?
Is there a copyright? Is there this? Can I use it commercially? We don’t really know; that’s going to be kind of your problem. And I just kind of feel like, long term, that is not sustainable, and that we can zoom out and start to see a macro question of how AI systems are going to be made compatible with the rule of law. Because, as you say, we can’t just say that now in our culture and society we have these huge, expensive black boxes of software that we’re entrusting with all these decisions, and, well, we just don’t know what they do, and it’s fine with us. That doesn’t work. There’s nothing else like that in human society. Everything has to come back to human agency and accountability. And maybe, I don’t know if it’s a scary thing, but certainly even these wonderful AI researchers who work on these projects sometimes reach a point of saying, well, we don’t even know completely how they do what they do. But I think somebody’s going to have to know how they do what they do, or they’re going to be limited in how far they can be trusted and used.
Zack Glaser (21:07):
I think ultimately that’s part of the issue here. Yes, at least one of the specific questions in these cases is a fair use question. And I think that’s a legitimate question: how much of something does one have to use before they have to make attributions? I think obviously another issue in these is GitHub’s terms of service, as they stood when people put their information onto the service. But broadly, we do have to figure out some way of dealing with these machines that are able to process more information than we as human beings can. I can’t go in and fully inspect the datasets that these are using as a human. I mean, I could use another tool to do it, but I couldn’t do that as a human. And I think that’s scary to people, but we also have to shift our mindset of how we are regulating things, how we are dealing with some of these pieces of software.
Matthew Butterick (22:13):
I think you’re right. And by the way, I want to interject for whoever’s listening to this podcast and thinking, how did Mr. Typography for Lawyers end up on these cases? It’s a great story. I’m not doing it alone; I’m doing it with a wonderful lawyer named Joseph, who’s got his own law firm up in San Francisco, antitrust class action. He’s been practicing for 30 or 35 years, just a wonderful guy. And the great thing is that Joe has been a fan of my book and my fonts for many years, and we kind of corresponded when he set up his current firm. So when I did a blog post about GitHub Copilot about six or eight months ago, he saw it and he said, this is really interesting, what you’re talking about. Maybe we should investigate this more. And Joe, if he were here, would be telling us that this is something that he’s seen over and over in decades of litigation against the tech industry.
I mean, this new technology surfaces and people want to push it really far, really fast, as soon as they can. And we talk about this also, for your listeners who remember Napster, right? Twenty-odd years ago, which kind of birthed the internet music streaming idea. And everybody loved it, because nothing like it existed. Of course, it was also completely giant copyright infringement all the time. But two camps emerged, right? There was the camp of people who said, oh, you can’t put the genie back in the bottle, streaming music’s here to stay, like, stop shouting at clouds, do old bloods. And then there were people who said, yeah, but it’s copyright infringement. And the thing is, both groups ended up being right, because though Napster was sued into oblivion, it cleared the decks, and after that, we had companies like Apple and Spotify and others say, hey, we’ve got a better idea.
Let’s bring the rights holders in. Let’s make deals with them. I know some people would say the deals were not great, but at least they brought the rights holders in and they said, let’s make this service legal and fair. Well, again, we could debate about the fairness. At least it was legal though, at least it was ethical. They went out and got their data from the people who owned it. They got consent, they got compensated, and I feel like the streaming services we have now are a lot better than Napster. So I also feel like, I mean, I’m interested in ai. I think it’s wonderful. I think it is going to be a big part of the technology scene, and we just sort of have to get through this, whatever you want to call it, this early Napster style phase because I really think that everybody is going to have that dawning realization that it’s better for everyone. It’s not just a matter of being nice to artists. It actually lets you make better products with stronger guarantees for your customers. Everybody’s going to like these services better when they’re built on whatever it is, licensed data, ethically sourced data.
Zack Glaser (24:55):
This isn’t just about let’s stop AI. We’re not smashing the looms right now. It’s
Matthew Butterick (25:03):
Not at all about stopping AI. And again, this is something that, I mean, a friend sent me a tweet where someone was claiming that I personally was going to bring down the global geopolitical order, because Butterick just wants to stop all AI, and that means it’s going to hand China this insurmountable advantage, and the United States is now over, and it’s Butterick’s fault. Really, that’s not what I want. If that was you that tweeted that, really, please,
Zack Glaser (25:31):
I’m amazed that we got somebody that powerful on the Lawyerist podcast. I’m just happy to be here at this point
Matthew Butterick (25:39):
Sometimes. Yeah, yeah. Sometimes, Zack, the anger that surfaces when you tell people that maybe their AI tool is not quite legal, it’s interesting. I can’t quite put my finger on it. Do you have any sense of why people get bent out of shape when you say maybe your toy will not persist?
Zack Glaser (25:57):
I think that’s an interesting thing to talk about, and I’d like to grab that thought when we come back here. We’re going to take a word from our sponsors real quick, and then we’ll be back with Matthew Butterick, talking about why we get bent out of shape about somebody stopping our toys from moving forward.
The Lawyerist Podcast is brought to you by Posh Virtual Receptionists. As an attorney, do you ever wish you could be in two places at once? You could take a call while you’re in court, capture a lead during a meeting or schedule an appointment with a client while you’re elbow deep in an important case. Well, that’s where Posh comes in. They’re a team of professional, US-based, live virtual receptionists available 24/7/365. They answer and transfer your calls, so you never miss an opportunity and you can devote more time to building your law firm. And with the Posh app you’re in total control of when your receptionist steps in. You can save as much as 40% off your current service provider’s rates. Even better, Posh is extending a special offer to Lawyerist listeners: visit posh.com/lawyerist to learn more and start your free trial of Posh Live Virtual Receptionist services.
And by Clio. What do solo and small firm lawyers with great client relationships all have in common? They use cloud-based legal practice management software to run their law firms. This is just one finding from Clio’s latest Legal Trends Report. There’s no getting around it… the fact is… when it comes to client expectations—standards are higher than ever for lawyers. Proof is in the numbers: 88% of lawyers using cloud-based software report good relationships with clients. For firms not in the cloud, barely half can say the same. That gap is significant. For more information on how cloud software creates better client relationships, download Clio’s Legal Trends Report for free at clio.com/trends. That’s Clio spelled C-L-I-O dot com / trends.
And by Gavel. In the next 10 years, 90% of legal services will be delivered online by lawyers. Gavel (previously called Documate) is the software platform for lawyers to build client-facing legal products. With Gavel, collect client intake, feed that data into robust document automation flows, and collect payments to scale your practice. Companies like Landlord Legal, JusTech and Hello Divorce are built on Gavel—for both internal and client-facing automation. Sign up for a free trial now at gavel.io/partnership/lawyerist and get $100 off your subscription. Or book a time at gavel.io/partnership/lawyerist to get a free consultation on incorporating automation into your practice.
w, it feels like why I had it. I mean, I feel like the reaction to something like ChatGPT has been interesting and a little more, how shall we say, mixed.
And I wonder if that’s because people tend to have more firsthand experience as writers, so they can feel like they know what it means to generate their own words and be responsible for them, whereas many fewer people have the experience of generating their own full color or groovy images. So I think the way that ChatGPT kind of weaves things together and doesn’t tell the truth hits people in a different way, because, again, that’s a skill that they do have, and they can reflect on how it feels. So I wonder if sometimes it’s like, that’s how artists feel when they see these systems being used. And for me, again, as a designer and programmer, the big problem with these systems, again, it’s not AI. I just think it’s just not fun. I mean, to me, making things is about learning things and the process.
It’s not just about getting a result. That’s almost the tiniest slice of it. So it’s interesting to go on to social media threads and see people exchanging tips. It’s like, oh, well, I spent 10 hours putting prompts into Midjourney. It’s like, why not spend 10 hours actually learning how to draw? I mean, I’m just kidding. If you’re going to put in the time, why not? But it’s like all the people who 15 years ago put all that time into learning to play Guitar Hero really well. If they had just put it into guitar, that would be a skill that they could continue to enjoy.
Zack Glaser (32:28):
Absolutely. But I do think that there’s a level of abstraction that comes when we do these things, where now we as users, as photographers, don’t necessarily have to think about f-stop and all the things that one would think about with an SLR camera, a non-digital SLR camera. And yes, we don’t necessarily have that skill, but theoretically there is a derivative sort of skill that comes out of that. And I wonder if that’s something that we’re moving toward or that people are excited about. Okay, yes, I don’t know how to play the guitar, but I do know how to play Guitar Hero, which means that I can take digital guitar riffs and put them onto TikTok, let’s say, and you can see how creative I am. So I wonder, is it almost inevitable that these things happen? That we move and put more abstraction on top of the art, the creativity that we do and have?
Matthew Butterick (33:29):
I think the answer, I mean, that’s a great question. I think the answer is there are going to be different tools for different sorts of people. I don’t do it professionally, but I do music in my spare time, and I do some recording on the computer, and you can do whatever you want. If you want to set up a drum kit and record it, you can do that. Or if you want to download loops of beats and put them into your track, you can do it that way. So if you want to make it easy, you can make it easy. But here’s the thing: if you want to use those loops, they’ve all been licensed. They’re all in the clear. So I think we just keep coming back to this issue where, if people want image generators, I think they should have image generators. I just think that the image generator they have should have fairly and ethically sourced data, and that’s best for everybody.
Zack Glaser (34:13):
That does bring me back to that Napster example, though. Is there really a way that we can have fairly and ethically sourced data? Because I’m from Nashville. Artists that create songs do not get paid enough. Artists that record songs do not get paid enough. Artists that record songs that go onto films and things like that, they don’t get paid enough. And I don’t think they got the proper end of this bargain. Is there even a way for the creators on GitHub or elsewhere to get the proper end of this bargain?
Matthew Butterick (34:48):
Well, that’s a great question. As we alluded to earlier, when the legal streaming services arrived, deals were made for the recordings. And a lot of musicians said, wait a minute, streaming a song is a lot worse than selling a song. And I don’t know. I mean, if you’re asking, if we have AI licensing, will it really be worth it? I mean, it can be done. That’s the thing. I think, yes, there’s this company, Shutterstock, and you can fact check me on this. I believe Shutterstock, the stock image company, just announced an image generator that is only trained on Shutterstock images. Now, to me, that’s a great idea, because Shutterstock has already got rights to these X number of images, and they’re putting those in as the training data. And then they’re saying to customers, well, when you generate an image with this, whatever comes out, I think they’re saying you can use it on the same terms that you could use any other Shutterstock image.
So I think that these companies can definitely make these deals if they want to. The thing is, it takes time and money and they’d rather not. I mean, you alluded earlier to GitHub’s terms of service. A year and a half ago or whatever, when the preview version of Copilot first came out, folks online put the question to the current, or the then, CEO of GitHub. They said, why do you think you have the legal right to use all this code? And he didn’t say, because of the GitHub terms of service. He did not say that. He said, because of fair use. He said, it’s fair use. And the question is, why would he say it’s fair use? And I think there’s a pretty obvious answer. It’s because Microsoft and GitHub’s appetite for data over the long term is not just what’s on GitHub. It’s everything in the dataverse, and they want it without consent, and they want it for free, and they want it forever. That’s what they want. And the only way they meet all those tick boxes is if it’s fair use, right? Because that’s the magic ticket. So I think that’s it, really, and I think all these kind of first-generation AI companies are stepping up and saying, yeah, we want it to be fair use. And all right, well, we’ve got to have that conversation then.
Zack Glaser (36:59):
I think, and I’m probably going to go way too long with this interview, but I think by opening up this door, the question then is what’s fair use? What’s the point of fair use? The point of fair use is to allow people to create derivative works, in my mind, to use some sort of creativity to put two things together. And at this point, does an AI model actually have creativity to put two things together? Yes, it could potentially be putting two things together, but then you go with, whose fair use is it? Is it Copilot’s fair use, or the user of Copilot’s fair use? So with fair use, we’re trying to say, yes, we want people to be creative, and we don’t want people to be able to just jealously guard their own art, but there has to be some sort of creativity on top of that.
Matthew Butterick (37:48):
Yeah, I think that’s right. And I think those that say copyright isn’t supposed to be completely a way for owners of artworks to be gatekeepers are right. I mean, fair use is important. It’s there for a reason. It’s there to enable these new uses. Though I think one thing that you see pretty consistently in the discourse about fair use, and maybe there are many things, but the one kind of red line is when you’re making something that really substitutes for the work that you’re using. And I think that’s a big question in these systems. I mean, generative AI just kind of brings that question to the fore, which is, if these systems are designed to be substitutes for their training data, what is that going to mean legally? Really, I mean, yeah, folks contact me and say, oh, we did this case before, Butterick.
It was called Authors Guild versus Google, or it was called Perfect 10, and those cases happened. But again, I think there was something a little bit different about copyrighted materials being harvested for their metadata to make some kind of search index. If it’s being used for a search index, the search index is pointing at that underlying work, right? For instance, if you use Google today and you use their image search, it brings up all the thumbnails and, you know, you can click on them and it says, this is where this image is, and it tells you where you can find the real image. But when you use an image generator, an AI image generator, it doesn’t point to anything. When you ask for a dog wearing a baseball hat eating an ice cream cone, it doesn’t say, so here are some images that are like that.
No, it doesn’t do that. It just makes its little collage, its pastiche, and presents it to you as a fait accompli. And of course, as you and your listeners may have seen, there was this opinion from the US Copyright Office a week or two ago, the first big opinion, I think, they’ve released on the copyrightability of AI-generated images. And they said, given the current state of image generators, no, they’re not copyrightable. And their reasoning is the person operating the image generator isn’t contributing enough originality. I think they likened it to simply giving instructions to a designer or something, and there’s not enough specificity. So I think that’s interesting. But of course, these AI systems are going to get better and the territory is going to keep changing.
Zack Glaser (40:11):
Absolutely. Well, I think that’s actually probably a pretty good place to kind of wrap up. We don’t know, at least that’s where I am, but I think that’s the idea. The big idea behind making these lawsuits happen is making sure that we do answer these questions. Obviously these questions need to be answered, I think.
Matthew Butterick (40:32):
Yes, as my co-counsel Joseph Saveri has said on occasion, litigation can be a blunt tool, but it can be effective. I mean, one of the frustrations is, of course, a lawsuit is going to take years. And in the interim, yes, there’s going to be so many changes in AI and so forth, but any kind of regulations, legislation, what have you, are going to take even longer. So this is where we’re starting. And by the way, I mean, these cases that we’ve brought, we’re just starting to see cases. Getty Images, the stock image company, has brought its own suit against Stability. Recently in San Francisco, a suit was filed against, I believe, the Lensa app for violations of biometric protection law. So there’s going to be all kinds of ways in which these services are challenged, and again, litigation is only going to be part of it, right? We’re going to be in the midst of a broader, how shall we say, social conversation about how we want to integrate these systems into our lives. And a lot of that is just going to be creating new norms of moral and ethical behavior in terms of what we think is right and fair. And that is going to be sort of the main thing. The law is going to be the backstop.
Zack Glaser (41:46):
It’s a good point, yes. Where does our society go with this, and where do we broadly think that this is fair? What do we want it to do? Well, Matthew, I really appreciate your insight into this, and if anybody wants to see the websites on the GitHub Copilot litigation and the Stable Diffusion litigation, honestly, I would suggest you take a look at the websites because they’re really, really sharp-looking websites. So even outside of this litigation, it is, in my opinion, a good example of how to create a website for a specific purpose, and they’re wonderful. So one is githubcopilotlitigation.com. The other is stablediffusionlitigation.com. So check ’em out.
Matthew Butterick (42:32):
Thank you. Yeah, that’s what happens when you take a typographer and let him start filing lawsuits. No, my favorite comment on the websites was like, what’s that font Butterick’s using? And the truth is, it’s an unreleased font that nobody’s ever seen before. So you know what? Everything is a platform. It’s plans within plans, man. We keep doing the typography, we do the litigation, so let’s keep ramping it up.
Zack Glaser (42:54):
Fantastic. Fantastic. Matthew, once again, thanks for being with me. I appreciate it, and we’ll hopefully have you on sometime soon again.
Matthew Butterick (43:01):
Of course. Thank you, Zack.
The Lawyerist Podcast is edited by Britany Felix. Are you ready to implement the ideas we discuss here into your practice? Wondering what to do next? Here are your first two steps. First, if you haven’t read The Small Firm Roadmap yet, grab the first chapter for free at Lawyerist.com/book. Looking for help beyond the book? Let’s chat about whether our coaching communities are right for you. Head to Lawyerist.com/community/lab to schedule a 10-minute call with our team to learn more. The views expressed by the participants are their own and are not endorsed by Legal Talk Network. Nothing said in this podcast is legal advice for you.
Zack Glaser is the Legal Tech Advisor at Lawyerist, where he assists the Lawyerist community in understanding and selecting appropriate technologies for their practices. He also writes product reviews and develops legal technology content helpful to lawyers and law firms. Zack is focused on helping modern lawyers find and create solutions to assist their clients more effectively.
Matthew Butterick is a writer, designer, programmer, and lawyer.
Last updated July 6th, 2023