The rapid rise of large language models (LLMs) has brought the legal profession to a crucial inflection point. On one side, these tools promise to streamline complex research, speed up drafting, and even transform dense legislative language into structured, usable formats. On the other, their well-documented limitations – particularly around accuracy and reliability – make their role in such a high-stakes field highly contested. This tension between opportunity and risk now sits at the heart of the debate: are LLMs ready to serve as trusted collaborators in legal practice, or must they remain closely supervised instruments – useful, but never fully independent?
Two recent studies capture this debate very well. Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools takes a critical look at proprietary AI research platforms, proposing systematic ways to evaluate their strengths and weaknesses. Meanwhile, From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems explores a more constructive path, showing how LLMs might help convert legislative text into transparent, rule-based systems that enhance accessibility and precision. Together, these works underscore the dual nature of LLMs in law: at once a source of concern and a driver of innovation.
Link: https://x.com/chrmanning/status/1796593594654130230
An optimistic view sees these models as powerful enablers of “augmented intelligence.” This perspective highlights how LLMs can address a key bottleneck in building legal decision-support systems – tools that play a vital role in expanding access to justice.
Traditionally, encoding complex legislative text into a structured, rule-based representation – referred to as “pathways” of criteria and conclusions – is a labor-intensive and costly task. Yet this structured data is the foundation for tools like JusticeBot, which help non-lawyers understand how laws apply to them. In this study, researchers used GPT-4 to automate the extraction of pathways from 40 articles of the Civil Code of Quebec.
The findings were highly promising, demonstrating the potential of LLMs to act as efficient first drafters for legal experts.
This approach positions LLMs not as replacements but as collaborators – handling the tedious initial drafting so that legal experts can focus on verification and refinement. The vision is to scale the creation of legal decision-support systems and, in doing so, expand access to justice.
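The "pathway" idea above can be made concrete with a small sketch: a tree of criteria (questions) whose branches lead to legal conclusions, parsed from the kind of structured JSON an LLM drafter might emit for expert review. The field names, JSON shape, and the tenancy example are illustrative assumptions, not the paper's actual schema.

```python
import json
from dataclasses import dataclass

@dataclass
class Criterion:
    """One node in a pathway: a yes/no question whose branches
    lead either to another criterion or to a final conclusion."""
    question: str
    if_yes: "Criterion | str"   # next criterion, or a conclusion string
    if_no: "Criterion | str"

def parse_node(node):
    """Recursively build a Criterion tree from LLM-emitted JSON.
    A plain string is a leaf, i.e. a legal conclusion."""
    if isinstance(node, str):
        return node
    return Criterion(
        question=node["question"],
        if_yes=parse_node(node["if_yes"]),
        if_no=parse_node(node["if_no"]),
    )

# Invented example of structured output an LLM drafter might return
# for a tenancy-style provision, before an expert verifies it.
llm_output = json.dumps({
    "question": "Has the lessor failed to make an urgent repair?",
    "if_yes": {
        "question": "Did the lessee notify the lessor first?",
        "if_yes": "The lessee may undertake the repair and claim costs.",
        "if_no": "The lessee must first inform the lessor.",
    },
    "if_no": "No remedy under this article.",
})

pathway = parse_node(json.loads(llm_output))
print(pathway.question)
```

In this division of labor, the model produces the draft tree and the human expert's job shifts to checking each question and conclusion against the statute, which is exactly the verification-and-refinement role the study envisions.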
In sharp contrast, the second study – assessing the reliability of leading AI legal research tools – adopted a more cautious stance. It found that, despite bold claims from vendors, these tools continue to suffer from persistent and problematic hallucinations, placing a heavy burden of supervision on legal professionals.
Link: https://x.com/random_walker/status/1796557544241901712
Legal technology providers such as LexisNexis and Thomson Reuters (Westlaw) have advertised their AI-powered research tools as “hallucination-free.”
Yet the study challenged these claims, showing that even specialized Retrieval-Augmented Generation (RAG) systems still hallucinate at significant rates.
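A toy sketch helps show both what RAG does and why it cannot guarantee accuracy: relevant passages are retrieved and prepended to the prompt, but nothing in the mechanism forces the model to stay faithful to them. The corpus (loosely paraphrased, illustrative legal snippets), the term-overlap scoring, and the prompt template are all stand-in assumptions, not how any commercial tool actually works.

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive term overlap with the query
    (a stand-in for a real keyword or dense retriever)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda p: len(q_terms & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Condition the generation step on retrieved sources.
    The instruction is only a request: retrieval grounds the model,
    but does not *force* faithful citation -- which is why RAG
    reduces, rather than eliminates, hallucination."""
    context = "\n".join(f"[{i+1}] {p}" for i, p in enumerate(passages))
    return f"Answer using ONLY these sources:\n{context}\n\nQuestion: {query}"

# Illustrative, loosely paraphrased passages -- not authoritative text.
corpus = [
    "Art. 1590: An obligation confers the right to demand performance.",
    "Art. 1457: Every person has a duty to abide by rules of conduct.",
    "Art. 2803: A person seeking to assert a right must prove the facts.",
]

query = "Who must prove the facts supporting a right?"
prompt = build_prompt(query, retrieve(query, corpus))
print(prompt)
```

The gap the study documents lives in the last step: even with the right passages in context, the generator can still misquote, overclaim, or cite a source for a proposition it does not support.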
These hallucinations create a profound ethical and practical dilemma for lawyers. Professional duties of competence and supervision obligate them to verify every AI-generated output manually.
At present, the legal field cannot escape the need for human oversight. LLMs can draft, summarize, and structure – but lawyers remain ultimately responsible for accuracy.
These two papers outline a clear axis in the conversation around legal AI. The optimistic perspective envisions LLMs as collaborative tools that expand access to justice by democratizing legal information. The cautious view, however, grounds the debate in the everyday reality of commercial tools, where hallucinations – though reduced – remain a fundamental obstacle. This shifts the human-in-the-loop role from collaborative drafting to painstaking supervision.
For legal tech to fulfill its promise, the industry must reconcile two demands: reliability that earns professional trust, and transparency that makes verification practical.
The future of legal AI will depend on striking this balance. While LLMs show immense potential, the continuing need for human accountability – and the lack of transparency in commercial systems – means that for now, the dream of effortless efficiency remains unrealized.
How can the legal tech industry bridge this gap and build tools that are both reliable and trustworthy?