
Typically, when building, training, and deploying AI, enterprises prioritize accuracy. That is no doubt important, but in highly complex, nuanced industries like law, accuracy alone is not enough. Higher stakes mean higher standards: model outputs must also be evaluated for relevance, authority, citation accuracy, and hallucination rate.
To tackle this enormous task, LexisNexis has evolved beyond standard retrieval-augmented generation (RAG) to graph RAG and agentic graphs. It has also constructed “planner” and “reflection” AI agents that parse requests and critique their own output.
“There’s no such thing as ‘perfect AI,’ because you never get 100% accuracy or 100% relevance, especially in a complex, high-stakes domain like legal,” Min Chen, SVP and chief AI officer at LexisNexis, said in a new VentureBeat Beyond the Pilot podcast.
The goal is to manage that uncertainty as much as possible and convert it into consistent customer value. “At the end of the day, what matters most to us is the quality of the AI outcome, and this is an ongoing journey of experimentation, iteration, and improvement,” Chen said.
Getting ‘whole’ answers to multidimensional questions
To evaluate models and their outputs, Chen’s team has established more than half a dozen “sub-metrics” to measure “usefulness” based on a number of factors – authority, citation accuracy, attribution rate – as well as “comprehensiveness.” This particular metric is designed to evaluate whether the gen AI response fully addressed all aspects of the user’s legal question.
“So it’s not just about relevance,” Chen said. “Completeness speaks directly to legal credibility.”
For example, a user may ask a question that requires an answer involving five different legal concepts. A gen AI tool may provide a response that precisely addresses three of these. While relevant, this partial answer is incomplete and, from the user’s perspective, inadequate. This can be confusing and pose real-world risks.
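The comprehensiveness idea above can be sketched as a simple coverage ratio. This is a hypothetical illustration, assuming each question can be decomposed into a set of required legal concepts; the names and scoring scheme are not LexisNexis’s actual implementation.

```python
def comprehensiveness(required_concepts: set[str], addressed: set[str]) -> float:
    """Fraction of the question's required legal concepts the answer covers."""
    if not required_concepts:
        return 1.0
    return len(required_concepts & addressed) / len(required_concepts)

# A question spanning five legal concepts, of which the answer covers three:
required = {"jurisdiction", "standing", "damages", "limitations", "precedent"}
addressed = {"jurisdiction", "standing", "damages"}

score = comprehensiveness(required, addressed)
print(score)  # 0.6 -- relevant, but incomplete from the user's perspective
```

A real evaluator would need a model or expert rubric to extract the concept sets; the ratio itself is the easy part.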
Or, for example, some citations may be semantically relevant to the user’s question, but they may point to arguments or examples that were ultimately rejected in court. “Our lawyers would not consider them suitable,” Chen said. “If they’re not suitable, they’re not useful.”
Moving beyond standard RAG
LexisNexis launched its flagship gen AI product, Lexis+ AI – a legal AI tool for drafting, research and analysis – in 2023. It was built on a standard RAG framework and hybrid vector search that grounds responses in LexisNexis’s trusted, authoritative knowledge base.
The company then released its personal legal assistant, Protégé, in 2024. This agent incorporates a knowledge graph layer on top of vector search to overcome the “major limitation” of pure semantic search. Although “very good” at retrieving contextually relevant content, semantic search “does not always guarantee an authoritative answer,” Chen said.
The initial semantic search returns content it finds relevant; Chen’s team then cross-references those results against a “point of law” graph to filter for the most highly regarded authoritative documents.
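The two-stage retrieval pattern described above can be sketched roughly as follows. Everything here is a toy stand-in – the word-overlap “semantic search” and the authority lookup are assumptions for illustration, not the LexisNexis system.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

def semantic_search(query: str, corpus: list[Document]) -> list[Document]:
    """Toy relevance scoring: rank documents by words shared with the query."""
    q = set(query.lower().split())
    scored = [(len(q & set(d.text.lower().split())), d) for d in corpus]
    return [d for score, d in sorted(scored, key=lambda p: -p[0]) if score > 0]

def filter_by_authority(docs: list[Document], authority: dict[str, bool]) -> list[Document]:
    """Keep only documents the 'point of law' graph marks as good law."""
    return [d for d in docs if authority.get(d.doc_id, False)]

corpus = [
    Document("case_a", "contract breach damages remedy"),
    Document("case_b", "contract breach argument rejected on appeal"),
]
# case_b is semantically relevant but was rejected in court -- not authoritative.
authority = {"case_a": True, "case_b": False}

results = filter_by_authority(semantic_search("contract breach damages", corpus), authority)
print([d.doc_id for d in results])  # ['case_a']
```

The point of the second stage is exactly the failure mode Chen describes: a semantically relevant case can still be unsuitable if the argument it contains did not survive in court.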
Building on this, Chen’s team is developing agentic graphs and accelerating automation so that agents can plan and execute complex multi-step tasks.
For example, self-directed “planning agents” for research question-answering break a user’s question down into multiple sub-questions. Human users can review and edit the final answers to further refine and personalize them. Meanwhile, a “reflection agent” handles transactional document drafting. It can “automatically, dynamically” critique its initial draft, then incorporate that feedback and refine it in real time.
However, Chen said this isn’t about excluding humans from the mix; human experts and AI agents can “learn, reason, and grow together.” “I see the future [as] deep collaboration between humans and AI.”
Check out the podcast to learn more about it:
- How LexisNexis’ acquisition of Henchman helped build AI models with proprietary LexisNexis data and customer data;
- The difference between deterministic and non-deterministic evaluation;
- Why enterprises should identify KPIs and definitions of success before experimenting;
- The importance of focusing on the “triangle” of key components: cost, speed, and quality.
You can also listen and subscribe to Beyond the Pilot on Spotify, Apple or wherever you get your podcasts.