This is the third in our series of blog posts deconstructing the appellation of the market-leading contract discovery and analytics company, Seal. Last week we learned that the E in Seal stands for Extract. Extracting data from contracts is fundamental to getting the answers that corporations are looking for, and it allows them to make better decisions about revenue opportunities, regulatory compliance, risk mitigation, and cost savings. We discussed a range of pre-built extractions called Insight Accelerators, which address specific use cases to do just this. These extractions are essentially sets of analytics, and analyzing datasets is the subject of this post – it is the A in Seal.
My colleague Dan Schneider recently gave a very good overview of what analytics means to Seal at a high level. Additionally, three of our leading data scientists from our Machine Learning team, Emanuella Wallin, Qing Zheng, and Alexandra Kukresh, wrote about the role of data and trust in analytics at the end of last year. You can read their post here.
Analytics are at the heart of the Seal platform, and the specific area of focus is Machine Learning (ML). There are many different ML processes and algorithms in use today, and Seal makes use of a number of them within a proprietary framework that defines a processing sequence. Unlike most other legal AI technologies, Seal doesn’t rely on just one process, e.g. Natural Language Processing (NLP), but combines multiple methods so that the system can extract information effectively from as little as one training example while striking the right balance of precision and recall. Combining these processes and methods into a single pipeline allows for the fastest and most flexible information extraction. No other vendor combines extensible NLP, Deep Learning, Machine Learning, Latent Semantic Indexing (LSI), search and sentence rules within an extraction framework. Together with the Seal Logic Engine, which takes the extracted answers and applies risk assessments, if/then logic, calculations and so on, this allows the platform to analyze metadata and answer complex business-related questions.
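To make the idea of "several extraction methods in one pipeline, followed by if/then logic" a little more concrete, here is a minimal Python sketch. It is purely illustrative and is not Seal's implementation: the extractor functions, the notice-period example, and the 30-day risk threshold are all assumptions invented for this post.

```python
import re
from typing import Callable, List, Optional

# Hypothetical extractors (not Seal APIs): each returns a value or None.
def sentence_rule_extractor(text: str) -> Optional[int]:
    """Sentence rule: look for phrasing like 'within 15 days'."""
    match = re.search(r"within (\d+) days", text, re.IGNORECASE)
    return int(match.group(1)) if match else None

def search_extractor(text: str) -> Optional[int]:
    """Fallback search: look for phrasing like '60-day notice'."""
    match = re.search(r"(\d+)-day notice", text, re.IGNORECASE)
    return int(match.group(1)) if match else None

Extractor = Callable[[str], Optional[int]]

def extract_notice_days(text: str, extractors: List[Extractor]) -> Optional[int]:
    """Run the extractors in sequence and return the first answer found."""
    for extractor in extractors:
        value = extractor(text)
        if value is not None:
            return value
    return None

def assess_risk(notice_days: Optional[int]) -> str:
    """If/then logic applied to the extracted answer (threshold is assumed)."""
    if notice_days is None:
        return "review"  # nothing extracted, so flag for a human
    return "high" if notice_days < 30 else "low"

PIPELINE = [sentence_rule_extractor, search_extractor]

clause = "Either party may terminate within 15 days of written notice."
days = extract_notice_days(clause, PIPELINE)
risk = assess_risk(days)
```

A real system would place trained models alongside such rules; the point of the sketch is only that each method covers language the others miss, and that the logic step works on the extracted answer rather than on the raw text.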
Two years ago, almost to the day, our co-founder and CTO wrote a piece entitled “Not All Machine Learning is Created Equal”. It has proven to be one of the most viewed posts on our roster, and for good reason. It is a deep dive into what analytics means for Seal and how it is ultimately focused on customer outcomes – after all, technology for technology’s sake gets you only so far.
I have pulled the following from his lengthy post, as I think it best summarizes the components of the Seal platform.
“Finally, it is the ML engine and the components that make up the broader platform that meets the precision and recall objectives for a particular clause. An ML engine cannot provide all the capabilities on its own to deliver the results that Seal does — it is several technologies and techniques working together that does it. These include:
- NLP to optimize the capabilities for the system to understand written language and process it within the ML engine
- LSI for identifying and extracting information not presented in standard terms or language, but exists through associations of words or phrases or in different locations in a document
- The use of Deep Learning methods to increase performance of the ML engine
- The inclusion of User Defined Machine Learning (UDML) to simplify training and automatically select the best model and hyperparameters for any given data, with users only required to select the text to train on
- Including document review capabilities within the system for efficient side-by-side review and comparison across clauses and language
- Extensive reporting and data visualization to be able to easily draw actionable insight from the data
- Automatic discovery and linkage of related documents such as amendments to master agreements
- Simplicity within the UI for information layering and normalization, to allow the ML framework to effectively use all available information and to allow users and engineers to quickly find and prepare it for use
The result of this combination of technologies is a platform with unique capabilities for users. This includes the flexibility for users to provide just one example to the system and meet their objectives, or efficiently provide from 50 to 300 examples for different outcomes — depending on their needs. Add to this the extremely granular extraction levels, the normalization of numeric values, our extensive APIs, and our patented non-standard clause detection, and our customers receive extremely high value from the Seal platform.”
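For readers less familiar with the “precision and recall objectives” mentioned in the quote, the short Python sketch below shows how the two measures are commonly computed. The document IDs are made up for illustration and do not come from any Seal dataset.

```python
from typing import Set, Tuple

def precision_recall(predicted: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Precision: share of flagged items that are correct.
    Recall: share of truly relevant items that were flagged."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical example: contracts the system flagged as containing a clause,
# versus contracts that really contain it.
predicted = {"doc1", "doc2", "doc3", "doc4"}
actual = {"doc1", "doc2", "doc5"}

precision, recall = precision_recall(predicted, actual)
# Two of the four flagged documents are correct, and two of the
# three relevant documents were found.
```

Tuning for one measure usually costs something in the other, which is why the objectives are set per clause.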
Whilst this was written when Version 5 was the current offering, the essence of what a contract analytics platform needs to deliver still holds true. The combination of people, process and methods into a single processing pipeline allows for the fastest and most flexible information extraction – and fundamentally, that is what our Global 2000 customers care about.
So, A is for Analyze, although it could also stand for accurate, ambitious and awesome!
Next week we’ll take a look at L and round out the series.