New Legal AI Benchmarking Report Evaluates Four AI Tools Across Seven Legal Tasks
Issue 20
On February 27, 2025, Vals AI released its Vals Legal AI Report (“VLAIR”), a first-of-its-kind evaluation of four legal industry AI tools (CoCounsel, Vincent AI, Harvey Assistant, and Oliver) across up to seven legal tasks commonly performed by lawyers. The report benchmarks the tools’ results against those of a lawyer control group.[i] For more on this newsletter’s previous coverage of legal benchmarking, see issue 12.
Out of the seven legal tasks evaluated, one or more AI tools beat the lawyer control group on four tasks, while the lawyer control group surpassed the AI tools on two tasks and matched the highest-performing tool on one task.[ii] Harvey Assistant, which participated in six of the seven tasks, had the strongest performance: it received the top score on five tasks and the second-place score on one task, and it beat or matched the lawyer control group on five tasks.[iii] CoCounsel also received one top score and ranked among the best-performing tools on four of the tasks.[iv]
The Tasks
The legal tasks evaluated by the study were:
- Document Extraction: this task evaluated identification and extraction of specific information within a document, with Harvey Assistant and CoCounsel surpassing the lawyer control group.[v]
- Document Question-Answering: the report concluded that lawyers should find value in using generative AI to review and analyze the information in a document, as all of the AI tools outperformed the lawyer control group at this task.[vi]
- Document Summarization: the report also found that lawyers can use AI tools for document summarization with confidence, with all of the AI tools outperforming the lawyer control group.[vii]
- Redlining: the lawyer control group beat each of the AI tools participating in the study at this task.[viii]
- Transcript Analysis: the study noted certain challenges with transcript analysis, including the potential for messy formatting, as well as nuanced information and subtleties.[ix] Both of the tools that participated in this task evaluation, Harvey Assistant and Vincent AI, outperformed the lawyer control group.[x]
- Chronology Generation: Harvey Assistant matched the lawyer control group in chronology generation.[xi]
- EDGAR Research: this task evaluated the ability to perform market-based research or answer questions about U.S. public companies in relation to the U.S. Securities and Exchange Commission’s EDGAR database.[xii] The lawyer control group outperformed the only AI tool participating in this task, Oliver; however, the report noted that the lawyer control group was able to use non-AI research tools such as Google and the EDGAR search interface to complete their task.[xiii]
Future Studies Planned by Vals AI
VLAIR is the first iteration of Vals AI’s planned regular evaluation of legal industry AI tools.[xiv] Vals AI anticipates that additional AI tool vendors will opt in to future benchmarking evaluations, and that additional tasks and skills will be evaluated.[xv] Vals AI is currently conducting a study dedicated solely to legal research that will be released later this year.[xvi]
Takeaways for Lawyers Who Are Evaluating Their AI Options
VLAIR is essential reading for any lawyer who is considering adopting an AI tool that performs one or more of the tasks evaluated by this report. The report is available in its entirety at https://www.vals.ai/vlair. As noted by VLAIR, there is currently no consensus in the legal industry about which workflows hold the most generative AI potential.[xvii] VLAIR is an impressive effort to provide lawyers with practical and valuable information they can use in their decision-making about their AI options.
Considering that there are over 50 use cases for legal industry AI tools, and over 200 legal industry AI tools on the market, it’s important to recognize that the vast majority of AI tools for lawyers are unlikely to be included in independent benchmarking studies in the near future. This is a reality of navigating the AI era, where new developments are happening constantly. The lack of comprehensive benchmarking resources should not be treated as a reason to consider only those AI tools that have benchmarking data.
Instead, lawyers who wish to identify the AI solutions that can make the greatest impact for their organizations should start by clarifying and prioritizing the problems they need to solve with AI. This requires investigating your organization’s technology problems. Where is technology currently serving the people of your organization well, and where is there room for improvement? Is there work performed in your organization that routinely gets written off? What tasks are repetitive? What tasks can be streamlined? What work could be performed more consistently and accurately with technology? Where would a new technology tool make the biggest financial impact? How receptive are the people of your organization to new technology?
Once you understand your organization’s technology problems, you’ll be in a better position to match those problems with the solutions currently available from AI tools. This is also the point in the AI tool evaluation process where benchmarking reports like VLAIR can help guide decision making. However, lawyers who are interested in AI tools that lack independent benchmarking data can still conduct their own evaluations and testing of the AI tools that they have identified as being most promising for their unique organizations. Benchmarking studies provide useful data, but they are only one part of the decision-making process. For a practical framework for evaluating AI tools in your own practice, see: How to Choose AI Tools for Your Law Practice.
Thanks for being here.
Jennifer Ballard
Good Journey Consulting
_____________________________________
[i] Executive summary, Vals Legal AI Report, https://www.vals.ai/vlair (last visited Mar. 1, 2025).
[ii] Id.
[iii] Id.
[iv] Id.
[v] Findings for each skill, Vals Legal AI Report, https://www.vals.ai/vlair (last visited Mar. 1, 2025).
[vi] Id.
[vii] Id.
[viii] Id.
[ix] Id.
[x] Id.
[xi] Id.
[xii] Id.
[xiii] Id.
[xiv] Future plans, Vals Legal AI Report, https://www.vals.ai/vlair (last visited Mar. 1, 2025).
[xv] Id.
[xvi] Methodology, Vals Legal AI Report, https://www.vals.ai/vlair (last visited Mar. 1, 2025).
[xvii] Id.