-
From Fiction to Fact: Progress on Legal Hallucinations
Johanna Schandera and John Bliss 8/25/2024. The debate over generative AI’s reliability in legal practice has focused on “hallucinations”—instances where an LLM produces text with incorrect legal facts. A study of last year’s general-purpose chatbots found hallucinations in 69 to 88% of legal responses.[1] This was an alarming finding, though the study did not examine the best legal AI tools available. A new study by the same researchers provides an updated assessment of the AI applications that are going…
-
The Ethics of GenAI Lawyering
John Bliss 4/2/24. Many lawyers seem wary about the idea of incorporating generative AI in their practice, often citing uncertainties about how the rules of professional responsibility might apply in this novel context. A leading national figure in bar regulation around emerging tech, Andrew Perlman, has written an authoritative article on this topic (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4735389), which should assuage some of these concerns. Perlman suggests that the rules of professional conduct do not prohibit use of this technology, although lawyers should be…
-
B+ on Autopilot: LLM Achieves New Grade on Unassisted First Drafts
John Bliss 2/16/24. A new study from a group of law faculty at the University of Maryland provides an update on LLM performance on law school exams. The lead author conducted a similar study last spring (2023), finding that GPT-4 scored as high as a B. The new study finds that GPT-4-Turbo scored as high as a B+ on Fall 2023 exams (in civil procedure, torts, and environmental law), although it scored lower in other classes, achieving a mean and…
-
Lawyers Replaced in Contract Review?
John Bliss 2/14/2024. A new study finds that large language models (“LLMs”) perform contract review at near human-level accuracy, while dramatically cutting the required time and cost. This could suggest—as the authors conclude—that junior lawyers and LPOs are on the verge of radical disruption and even some degree of replacement. However, the article is very light on statistical reporting, which casts doubt on the study’s implications. Key Findings: Regarding accuracy, the researchers found that GPT-4 scored slightly…
-
Anti-hype
John Bliss 1/28/24. A position paper argues that GPT-4’s performance on law exams does not provide evidence that AI is “set to redefine the legal profession.”[1] In a Substack post, the authors summarize their position as follows: “Will AI transform law? The hype is not supported by current evidence.”[2] I think the authors are right to counsel uncertainty, but they may go too far in their “anti-hype,” downplaying (and misconstruing) the current state of empirical research on legal AI. The…
-
Hallucinating about Legal AI Hallucinations
John Bliss 1/19/24. A new study suggests that generative AI’s legal hallucinations are “alarmingly prevalent,” with “error-ridden legal answers” to 69–88% of legal questions.[1] But this finding is misleading and has been widely misinterpreted. The study itself is rigorous and has important implications. Yet the response to the piece in mass and social media has almost universally missed a key point: this study is focused on last year’s technology, not the leading AI applications that are going mainstream in the…