Can ChatGPT replace a lawyer?

After first entering the mainstream late last year, ChatGPT has taken the world by storm. It’s hard to go a day without ChatGPT trending on Twitter or some news article claiming ChatGPT will replace workers in X industry. 

As a law school graduate, I paid particular attention to the articles about ChatGPT replacing lawyers. I’m sure you’ve seen them too. If not, here’s a summary of some of the most popular stories about ChatGPT and the law.

In December, researchers studying ChatGPT said that GPT-4, the next version of the software behind ChatGPT, could pass the bar exam. Then in January, law professors at the University of Minnesota said ChatGPT had passed law school exams.

This month, a thread from Greg Isenberg, CEO of Late Checkout, went viral. In it, he said he used ChatGPT instead of a lawyer to get $109,500 from a client who was late on their payments. 

I took a closer look at some of these claims, and honestly, I was unimpressed. Many of the stories either misrepresent what ChatGPT is capable of or misrepresent what lawyers actually do.

Because I had my doubts about ChatGPT’s legal abilities, I decided to run some tests of my own. After testing ChatGPT’s usefulness as a legal researcher and writer and its argument skills, I can say for sure that ChatGPT won’t be replacing lawyers any time soon. 

First, let’s debunk some ChatGPT myths, then we’ll get into why I think ChatGPT won’t be taking my job (not yet at least).

Debunking myths about ChatGPT’s legal skills

Let's start with the claim that GPT-4 could pass the bar exam: it might turn out to be true, but we just don't know yet. Here at Humanoid, we follow the tech industry closely, so we're very aware that tech CEOs have an unfortunate habit of overselling their products.

Hyperloop expectations vs reality

We certainly hope GPT-4 is capable of what its creators think it will be, but their predictions shouldn't have created as much of a stir in the media as they did, especially when you consider how ChatGPT and GPT-3.5 actually performed on legal exams and the bar.

A closer look at the University of Minnesota study shows that, yes, ChatGPT did pass some basic law school exams. What a lot of the headlines leave out, however, is that ChatGPT barely passed and that its results were wildly inconsistent.

Researchers described its performance as “mediocre,” especially with the more complicated, less popular topics of torts and tax law. 

When ChatGPT was given multiple-choice bar exam questions by professors from the Chicago-Kent College of Law and the Michigan State University College of Law, it performed poorly. ChatGPT failed the bar exam in this study by answering just 50% of questions correctly. For context, the average human correctly answers 68%.

Finally, let's get into Greg Isenberg's experience with ChatGPT. His company, Late Checkout, is a product design agency, and he claims one of its customers was five months late on a $109,500 payment.

Running out of options, Isenberg turned to ChatGPT to write a “scary email.” ChatGPT did this for him, and after some tweaks from Isenberg, he sent it to his client. The client responded in two minutes with a plan to repay Isenberg. 

Isenberg boasted about this experience, specifically saying that it saved him legal fees. That honestly doesn't make sense to me. Why would a legal department be involved at all unless Isenberg actually wanted to take his customer to court over the unpaid $109,500?

The letter itself mentions no legal action, and Isenberg admitted in his tweet that it was his finance department, not his legal department, that was handling the issue before he got involved. Additionally, Isenberg sent the letter himself, which is further proof he didn't need a lawyer to resolve this dispute.

Basically, ChatGPT didn't do anything special. Isenberg could've just as easily used a template he found through a basic Google search. A lawyer wasn't necessary for anything he did, so it's misleading to frame his experience as ChatGPT saving him legal fees.

Now that I’ve explained why others might’ve been overstating ChatGPT’s legal abilities, let’s get into my own assessment of what ChatGPT is capable of. 

ChatGPT screenshot "How do you think ChatGPT could help lawyers?"

Can ChatGPT help lawyers with legal research?

As you can see from the image above, ChatGPT thinks it will be useful to lawyers because it will save them time on legal research. As someone who has performed tons of legal research, I was praying that ChatGPT was right and that it could quickly find me cases that I would’ve spent hours finding for myself otherwise. 

In the last year, I wrote two dissertations on complex legal issues, one of which was about the National Environmental Policy Act (NEPA), which is why I grilled ChatGPT on that topic.

ChatGPT screenshot answering legal question about NEPA

I was perhaps too tough on ChatGPT right away. I asked it a difficult (but definitely not impossible!) legal question, one of the questions I had spent hours researching, hoping it would hand me the list of cases I had eventually found on my own. Instead, it told me it was just an AI model and couldn't help me. Alright, fair enough.

From there, I asked it for cases involving a common aspect of NEPA. This is where I started to question ChatGPT’s usefulness as a legal research tool. ChatGPT produced two cases: 1) Sierra Club v. Marsh, 816 F.2d 1376 (9th Cir. 1987); and 2) Forest Guardians v. U.S. Forest Serv., 329 F.3d 1089 (9th Cir. 2003). ChatGPT then provided a paragraph explanation of each case.

Both are very real cases, but ChatGPT's explanations of them were completely fabricated; they aren't even NEPA cases. Each opinion references other cases that involved NEPA, but the disputes themselves were about the Endangered Species Act (ESA).

Fabricated ChatGPT answer

Admittedly, the lines of argument can seem similar, but the facts presented by ChatGPT are completely made up. There is no reference to tiering or inadequate PEAs in Sierra Club v. Marsh.

After this failure, I asked ChatGPT to describe the most important NEPA cases. It thankfully gave me real cases, but it again got details wrong, including basic facts like the year a case was decided.

While researching this article, I found out I'm not alone. ChatGPT has a huge problem with making up sources, especially legal cases. The phenomenon is known as AI hallucination, and ChatGPT is definitely prone to it.

Remember how ChatGPT passed those law school exams? The professors had to specifically instruct ChatGPT not to fabricate court cases. This key detail was included in the University of Minnesota study, but it was left out of every article I read about ChatGPT passing these exams. Instead, the media presented the study as proof ChatGPT could replace lawyers.
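If you want to try that kind of guardrail yourself, here's a minimal sketch of how an anti-fabrication instruction might be passed to the model programmatically. It assumes the OpenAI Python library (the pre-1.0 ChatCompletion interface) and an API key in your environment, and the prompt wording is my own illustration, not the professors' actual instruction.

# Minimal sketch: tell the model up front not to invent authorities.
# Assumes the openai Python package (pre-1.0 interface) and an API key
# set in the OPENAI_API_KEY environment variable.
# The instruction text below is illustrative, not the study's actual prompt.
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": ("Answer the exam question. Do not fabricate court cases, "
                     "citations, or quotations; if you are unsure, say so.")},
        {"role": "user",
         "content": "Explain the standing requirements for a NEPA challenge."},
    ],
)
print(response["choices"][0]["message"]["content"])

Even with an instruction like that, the model can still hallucinate, which is exactly the problem for legal research.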

Until ChatGPT can reliably cite real cases, not only will it not replace lawyers, I'm not even sure it will be a useful tool for legal research.

Can ChatGPT handle a courtroom?

After ChatGPT’s poor performance as a legal researcher, I have to admit I was pretty demoralized about this whole exercise and didn’t really want to write any more. Despite that, I thought it would be fun to do a little roleplay exercise with ChatGPT. I played the judge and ChatGPT played the advocate. 

I first asked ChatGPT to write a legal memo based on the facts I presented to it. The memo wasn't great, so I asked it to rewrite the memo using one of the two widely accepted memo-writing formats. ChatGPT didn't understand that format either, so it gave me another bad memo. I decided to move on and get to the mock trial.

ChatGPT again failed to live up to the hype. It regularly made up legal standards. At first, I tried to be accommodating and asked whether it was sure about the legal standard it had just stated. ChatGPT wouldn't say whether it had been right or wrong. Instead, it would simply restate the legal standard, even if the new explanation completely contradicted the previous one.

ChatGPT screenshot about CEs in law
In its previous two answers, ChatGPT had said both that there was a public comment period and that there wasn't one. I prodded gently for clarification.

I wasn't getting anywhere with this strategy, so I decided to play a more combative judge. I countered ChatGPT's arguments directly until it eventually gave up on the exercise and fell back on a default answer about being an AI model that isn't always perfect.

To give ChatGPT credit, it knows its limitations. If you ask ChatGPT if it can be used to help trial lawyers, it will tell you that it can’t “replace the critical thinking and legal analysis skills of a trial lawyer.” 

ChatGPT screenshot saying it can't do legal work

For now, I’m going to have to agree with ChatGPT. ChatGPT won’t be replacing lawyers for a very long time. If ChatGPT ever gets good enough to replace lawyers, trial lawyers will be the last to go. 

Will AI ever replace lawyers?

Whether AI will ever replace lawyers is, unfortunately, impossible to answer. I personally think it's possible, but it's a long way off. In my opinion, the legal field will be one of the last to fall to AI.

Jokes about lawyers being bloodsuckers who are only in it for the money often hide the fact that being a lawyer requires a huge amount of humanity. Lawyers need to work directly with clients to satisfy their needs. 

In family law and criminal law contexts, lawyers often go above and beyond to make their clients comfortable. These are things AI can’t do.

If AI programs like ChatGPT can improve their factual accuracy, they may become useful tools for legal research. I can even see a future where ChatGPT is incorporated into databases like LexisNexis or Westlaw. Outside of those purposes, though, we're going to have to wait a long time before AI is a useful tool for lawyers.
