The State of Legal Industry AI Benchmarks in 2025: What Lawyers Should Know Before Choosing AI Tools, Part Three

Image: desk and keyboard. Good Journey Consulting Newsletter, Issue 47.

Two Announcements:

First, Good Journey Consulting welcomes Depo IQ to the AI Access to Justice Initiative! Depo IQ pledges to provide complimentary deposition summaries to legal non-profits/legal aid organizations. Qualified organizations can receive up to five complimentary deposition summaries, depending on the size of the organization. Interested attorneys and organizations can get in touch with Depo IQ via [email protected].  

Second, A Lawyer’s Practical Guide to AI is turning one and we are celebrating with an anniversary sale. This is the first sale since the launch of the guide, so if you’ve been thinking about grabbing your copy, now is the time. Use code 25OFF for 25% off. The sale ends November 19, 2025. Grab your copy of the guide here.

Issue 47 

This week I have an unexpected Part Three to add to my recent series on AI benchmarks and what lawyers should know before choosing AI tools. The first two parts of the series explained six independent legal industry benchmarks and evaluations, what they reveal about AI tools for lawyers, and what lawyers should keep in mind when interpreting a benchmark or evaluation. Here are links to access Part One and Part Two. Below is a summary of an additional independent benchmarking study that was released in October 2025.

VLAIR - Legal Research

In October 2025, Vals released VLAIR – Legal Research, an extension of its earlier benchmarking effort titled Vals Legal AI Report (“VLAIR”), which was summarized in Part One of this series.[i] VLAIR - Legal Research evaluated three legal industry AI tools (Alexi, Counsel Stack, and Midpage), as well as ChatGPT, and a human baseline consisting of a group of lawyers from the same firm who had experience conducting legal research.[ii] The study involved 200 legal research questions.[iii]

The AI tools and the lawyer baseline were each given a weighted score made up of three criteria: accuracy (50% of the score); authoritativeness, meaning whether the response was supported by citations to proper sources (40%); and appropriateness, meaning whether the response was easily understood and could be shared as-is with others (10%).[iv] The study found that the legal industry AI tools received the highest weighted scores, ranging from 76% to 78%, followed by ChatGPT at 74%, with the lawyer baseline scoring the lowest at 69%.[v] Counsel Stack had the highest score of the legal industry AI tools.[vi]
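For readers who want to see how a 50/40/10 weighting works in practice, here is a minimal Python sketch. The weights come from the study description above. The function name and the component scores in the example are hypothetical and are not figures from the report, and the report's exact aggregation method may differ from this simple weighted sum.

```python
# Weights as described in the study summary:
# accuracy 50%, authoritativeness 40%, appropriateness 10%.
WEIGHTS = {"accuracy": 0.50, "authoritativeness": 0.40, "appropriateness": 0.10}


def weighted_score(components):
    """Combine per-criterion scores (each on a 0-100 scale) into one weighted score."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Hypothetical component scores for illustration only; NOT taken from the report.
example = {"accuracy": 80.0, "authoritativeness": 70.0, "appropriateness": 90.0}
print(round(weighted_score(example), 1))  # 0.5*80 + 0.4*70 + 0.1*90 = 77.0
```

Because accuracy and authoritativeness together make up 90% of the score, a response that is polished and easy to share but weakly sourced would still score poorly under this kind of weighting.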

Interestingly, the study found that when the AI tools outperformed the lawyer baseline, they did so by a large margin.[vii] Of the 200 questions included in the study, AI tools outperformed the lawyer baseline on 150 of the questions, and the average point margin was 31%.[viii] In contrast, when the lawyer baseline outperformed the AI tools, it was by an average point margin of 9%, and the questions typically involved complex multi-jurisdictional analysis, judgment-based synthesis, or a need for a deeper understanding of context.[ix] These findings can help lawyers who are interested in using AI for legal research better understand the circumstances in which AI-assisted legal research might be most effective. You can read VLAIR - Legal Research in its entirety here.

Are the findings from VLAIR – Legal Research consistent with the other benchmarking studies? 

Yes. VLAIR – Legal Research supports several of the conclusions discussed in Part Two. For instance, it indicates that AI tools should not be summarily dismissed as hype, because using an AI tool may elevate a lawyer’s work on certain tasks. It also supports the conclusion that the accuracy of OpenAI’s GPT AI models has improved significantly over the past year and a half. Finally, it is consistent with the idea that it’s a toss-up whether you can presently get better output from a general-purpose AI tool or a legal industry AI tool.

A quick reminder: I’m currently preparing to record a CLE called “How to Pick the Best AI Tool for Your Law Practice.” Once I release the CLE, I’ll provide my newsletter subscribers with an exclusive discount code. If you already subscribe to my newsletter, thank you! If you know someone who might like access to this discount code for my newsletter subscribers, please share this issue of the newsletter with them, and encourage them to sign up for my newsletter here before the CLE is released. Additionally, if you would like me to prioritize applying for CLE accreditation in your state, please send me an email at [email protected].

Thanks for being here.   

Jennifer Ballard
Good Journey Consulting  

[i] VLAIR - Legal Research, https://www.vals.ai/industry-reports/vlair-10-14-25#executive-summary (last visited Oct. 31, 2025).  

[ii] Id.

[iii] Id.

[iv] Id.

[v] Id.

[vi] Id.

[vii] Id.

[viii] Id.

[ix] Id.
