Over the past year, interest in artificial intelligence (AI) has increased markedly, especially in generative AI, also known as GenAI. Technology vendors and industry pundits have made far-reaching claims about AI’s long-term potential and immediate benefits.
That said, many of the assertions about AI blur the line between state of the art and science fiction. On the one hand, AI-powered technology clearly has a place in enhancing legal teams’ efficiency. Investment banks, law firms, private equity and venture capital funds, direct lenders, and other private capital markets participants are increasingly adopting AI solutions to manage legal workflows. Firms that don’t embrace AI risk being left behind by competitors.
On the other hand, AI still has its limitations, and the best solutions for sophisticated legal workflows often require oversight from legal professionals. For internal legal teams to avail themselves of the tremendous potential AI offers, they need to understand the underpinnings of AI, its associated strengths, and potential shortcomings.
This and upcoming articles will help you:
- Ask the right AI-related questions
- Evaluate purported AI capabilities
- Decide how AI can bring the greatest positive impact to your work
AI is a broad field focused on the creation of machines capable of performing complex tasks that typically require human intelligence, such as understanding and generating language and making decisions.
When talking about AI, it’s important to understand the various technologies it encompasses. Much of modern AI relies on machine learning (ML), a subset of AI that gives machines the ability to learn from experience without being explicitly programmed by humans. Machine learning uses algorithms and statistical models to analyze and learn from patterns in data.
Natural language processing (NLP) is a field of AI, today driven largely by machine learning, that deals with language as people speak and write it. NLP enables machines to recognize, understand, translate, and generate text.
AI developers commonly use neural networks, a specific class of machine learning algorithms loosely inspired by the human brain. These networks contain numerous interconnected neurons. Each neuron performs a simple computation: it combines its inputs into a single output and sends that output to other neurons. Deep learning networks are neural networks with many layers of neurons, which in principle lets them learn to approximate almost any function.
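To make the idea of a neuron concrete, here is a minimal Python sketch of one neuron and a two-layer network built from it. The sigmoid activation, the weights, and the function names (`neuron`, `layer`, `tiny_network`) are illustrative assumptions, not taken from any particular library or from the models discussed in this article.

```python
import math

def neuron(inputs, weights, bias):
    """One neuron: a weighted sum of the inputs passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def tiny_network(inputs):
    """Two inputs -> hidden layer of two neurons -> one output neuron.

    The weights below are made up for illustration. In a real network
    they would be learned from training data.
    """
    hidden = layer(inputs, [[0.5, -0.2], [0.1, 0.8]], [0.0, -0.1])
    return layer(hidden, [[1.0, -1.0]], [0.0])[0]

print(tiny_network([1.0, 2.0]))
```

Stacking more layers like `hidden` is what makes a network "deep".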
Large language models (LLMs), such as OpenAI’s GPT-4, are deep learning neural networks that perform natural language processing tasks. These models are referred to as large because of the number of parameters they contain (often in the billions) and the amount of data used to train them.
How AI works
AI systems work by using algorithms to analyze data and identify patterns and relationships in that data. Developers can train algorithms by using machine learning techniques, which involve providing the AI system with large amounts of data and adjusting the system’s parameters until it can accurately perform a given task on new data.
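The training loop described above can be sketched in a few lines of Python. This is a deliberately tiny illustration, assuming a model with only two parameters (`w` and `b`) and made-up data that follows the rule y = 2x + 1. Real systems adjust millions or billions of parameters, but the principle of nudging parameters to reduce error and then checking the result on unseen data is the same.

```python
def train(data, steps=5000, learning_rate=0.01):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # parameters start at arbitrary values
    n = len(data)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Nudge each parameter in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical training data generated from the rule y = 2x + 1.
training_data = [(x, 2 * x + 1) for x in range(5)]
w, b = train(training_data)

# Check the trained model on an input it never saw during training.
new_x = 10
prediction = w * new_x + b
print(f"w={w:.3f}, b={b:.3f}, prediction for x={new_x}: {prediction:.2f}")
```

If training worked, the prediction for the unseen input should land close to 21, the value the underlying rule gives for x = 10.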
The success of an AI application depends on two components:
- Its models, the systems used to learn from data, and
- The volume and quality of the data the developers use for training.
Models: Models can be either open source or closed source (also known as proprietary). Open source models are generally available to the public, and depending on the license, parties may be able to access, modify, and distribute them royalty free. Proprietary models may contain open source code but rely on private data to deliver unique capabilities. Only authorized parties may be able to access these models.
Some AI companies offer proprietary models that the public can use only through interfaces the company controls. These companies may also provide enterprise-class access to their models, allowing other organizations to build functionality on top of the models without hosting the models themselves. These enterprise relationships come with additional protections and performance commitments for the organizations using them.
Data: During the training process, models are exposed to large quantities of labeled or unlabeled data to learn how to perform specific tasks. These datasets can also be either open source or proprietary. In general, the larger and higher quality the training dataset, the better the final model.
Relevance matters as much as volume. For example, a model trained to recognize a standstill clause in a non-disclosure agreement using an open source dataset derived from internet searches would likely yield inferior results compared to one trained on an extensive repository of NDAs.
Ontra’s AI engine, Ontra Synapse, works by combining enterprise-grade usage of OpenAI’s GPT-4 with our proprietary models to perform tasks and generate answers specific to private capital markets’ legal workflows. Most importantly, our AI capabilities are developed on Ontra’s unique dataset — the industry’s leading repository of nearly one million industry documents.
The amalgamation of commercial and proprietary LLMs and industry-specific data, combined with Ontra’s human-in-the-loop approach, enables Ontra Synapse to generate outputs that outshine other legal AI solutions in terms of relevance and accuracy for private market firms.
Different types of AI models
AI models are broadly classified as predictive or generative.
Predictive models make decisions or predictions about future outcomes (for example, predicting the complexity of an upcoming contract negotiation) by identifying patterns and trends in data.
They can deliver consistent, accurate results when trained on high volumes of relevant information, and they can automate many manual tasks with minimal human oversight. However, the quality of their outputs declines precipitously with poor training data.
Generative models create unique text, images, audio, and synthetic data (for example, drafting a legal clause) by mimicking content they have previously analyzed.
They can allow legal professionals to tackle use cases that require context-specific text-based responses.
Unfortunately, generative models have two considerable shortcomings. First, they’re prone to hallucinating, that is, fabricating baseless assertions that they present as fact. Second, they can generate inconsistent answers to the same set of questions. For these reasons, generative AI requires humans in the loop to validate outputs: professionals familiar with the subject matter and with the way in which an organization will use the AI outputs.
Ontra Synapse uses a blend of industry-specific and commercial predictive and generative models. To ensure model outputs meet the exacting standards of the private funds industry, we use the highest quality training data derived from our industry-leading routine contract repository. We also employ a global network of highly trained legal and contract professionals to review the information those models produce.
How to measure the quality of AI outputs
Whether the legal and private funds industries can benefit from predictive and generative models depends on the quality of the outputs. Organizations can use numerous metrics to evaluate an AI model, including recall, precision, and F1 scores.
Recall measures the proportion of actual positives a model correctly identifies and is calculated by dividing the number of true positives by the total number of actual positives. For example, if all 100 contracts in a set contain standstill clauses and a model flags only 90 of them, then recall equals 90%.
Precision measures the proportion of a model’s positive predictions that are correct and is calculated by dividing the number of true positives by the total number of predicted positives (both true and false). For example, of 100 contracts, if a model predicts that all 100 contain standstill clauses and only 60 actually do, then the precision is 60%.
An F1 score combines precision and recall into one blended metric. It is the harmonic mean of the two, so it stays low unless both precision and recall are high.
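The three metrics can be computed directly from the contracts a model flagged and the contracts that truly contain the clause. The sketch below is a minimal illustration: the function name `precision_recall_f1` and the use of integer contract IDs are assumptions made for the example, and the two scenarios mirror the standstill clause examples above.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall, and F1 from two collections of contract IDs.

    predicted: contracts the model flagged as containing the clause
    actual:    contracts that really contain the clause
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Recall example: all 100 contracts contain the clause, the model flags 90.
p1, r1, f1_1 = precision_recall_f1(predicted=range(90), actual=range(100))
print(f"precision={p1:.2f}, recall={r1:.2f}, F1={f1_1:.2f}")

# Precision example: the model flags all 100 contracts, only 60 contain it.
p2, r2, f1_2 = precision_recall_f1(predicted=range(100), actual=range(60))
print(f"precision={p2:.2f}, recall={r2:.2f}, F1={f1_2:.2f}")
```

In the second scenario recall is perfect but precision is only 60%, so the F1 score falls well below 100%. A model cannot score well on F1 simply by flagging every contract.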
While recall, precision, and F1 scores are most helpful in determining the quality of a predictive AI model’s outputs, these metrics may be less helpful when measuring generative AI outputs. That’s because opinions may vary when it comes to assessing the quality of a legal clause drafted by AI.
To overcome the challenge of measuring the quality of generative model outputs, machine learning engineers can monitor how frequently subject matter experts (for example, the lawyers who use AI tools) accept the outputs and the degree to which these professionals modify them.
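As a rough illustration of this kind of monitoring, the Python sketch below computes an acceptance rate and an average degree of modification from a log of expert reviews. The record layout (`draft`, `final`, `accepted`) and the function names are hypothetical, and character-level similarity from the standard library’s `difflib` is only one simple proxy for how much an expert changed a draft. It does not describe how any specific vendor measures this.

```python
from difflib import SequenceMatcher

def edit_fraction(draft, final):
    """Share of the text that changed: 0.0 means the expert changed nothing."""
    return 1.0 - SequenceMatcher(None, draft, final).ratio()

def summarize_reviews(reviews):
    """Return (acceptance rate, mean edit fraction among accepted drafts)."""
    if not reviews:
        return 0.0, 0.0
    accepted = [r for r in reviews if r["accepted"]]
    acceptance_rate = len(accepted) / len(reviews)
    if not accepted:
        return acceptance_rate, 0.0
    mean_edit = sum(edit_fraction(r["draft"], r["final"]) for r in accepted) / len(accepted)
    return acceptance_rate, mean_edit

# Hypothetical review log: each record is one AI draft and the expert's verdict.
clause = "The parties agree to a standstill period of 12 months."
reviews = [
    {"draft": clause, "final": clause, "accepted": True},
    {"draft": clause, "final": clause.replace("12", "18"), "accepted": True},
    {"draft": clause, "final": "", "accepted": False},
    {"draft": clause, "final": "", "accepted": False},
]

acceptance_rate, mean_edit = summarize_reviews(reviews)
print(f"acceptance rate: {acceptance_rate:.0%}, mean edit fraction: {mean_edit:.3f}")
```

Tracked over time, a rising acceptance rate and a falling edit fraction would suggest that the generated drafts are getting closer to what the experts need.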
Since no single measurement can effectively represent all facets of an AI solution, Ontra Synapse uses a blend of metrics to ensure model outputs are reliable, relevant, and consistent.
Should I trust AI?
Given mounting legal demands, private capital markets firms will need to rely more heavily on AI in the coming years to automate and optimize today’s manual legal workflows.
Firms can do so with confidence when they understand the technology’s limitations and work with reputable vendors that have sufficiently addressed them. As discussed earlier, poor training data can undermine the quality of any AI output. Additionally, hallucinations and non-deterministic (i.e., inconsistent) outputs are two undesirable byproducts of generative AI models. Solutions that recognize these limitations and integrate human-in-the-loop validation will produce better and more relevant outputs.
Fortunately, vendors that have sourced large, use case-specific datasets and employ subject matter experts to verify quality are capable of delivering transformational outcomes with minimal risk. For that reason, Ontra has invested in building a complete AI solution comprising world-class data, purpose-built and commercial models, and a global network of legal and contract professionals.
The bottom line
- Ontra developed a proprietary blend of predictive and generative AI models tailored to private capital markets’ use cases.
- Ontra integrated OpenAI’s GPT-4 into Ontra Synapse to take advantage of leading commercial LLMs.
- Ontra trains models on anonymized and aggregated data from nearly one million industry-related negotiations.
- Ontra’s legal network members address common AI limitations by reviewing the accuracy and relevancy of the Ontra Synapse outputs and providing feedback to further train our models.