Creating a Contract Discovery and Analytics Platform

There are many new entrants into the field of Contract Discovery and Analytics. This includes a recent announcement from a CLM vendor about their own ML-based system for extracting data from contract documents prior to loading into their CLM repository.

We like seeing companies entering our space as it acknowledges the significant demand for our software. But what doesn’t come through in the messaging of these new entrants is real clarity about their technology and capabilities, and the difficulty of what we are doing. At Seal, we’ve learned that providing a truly practical and effective platform for accurate and scalable contract discovery and analytics takes years of hard work.

Seal was founded by two gentlemen who were heading up a successful eDiscovery company. They noticed that many of their customers were asking whether the technology could be used for contract documents. These customers complained of contracts getting lost across multiple departments, of failed attempts to solve the problem with technology, and of having to resort to manual reviews to find data, which are extremely time-consuming and expensive. It was 2010, and our founders realized there was a real opportunity to apply technology to the problems of contract discovery, data extraction, and data analysis. The following year, the team started using the term “cDiscovery” to describe functions based on eDiscovery but targeted at contracts, essentially inventing this category and a term that is widely used today.

That was 7 years ago. Since then, the Seal platform has evolved extensively, driven by early success and feedback from Seal’s initial customers. Their requirements stretched the Seal development team into adding new capabilities and technologies, which have consistently proven very useful across the customer base.

One example was Avusa, now Times Media Group, whose issues centered on contract files in image format. When Seal engaged with them, we pushed our technology to tell them which OCR scans were missing data and which ones to re-scan. We also enriched the data from poor scans, using our learning function to fill in the blanks. In this engagement, Seal processed and converted the files, determined which files were missing and which needed a second OCR pass, extracted the data, then loaded the results into a new CLM system. We did this with 40K items inside a 15-day window by throwing everything we had at it, and we learned a lot in the process.

Another engagement that pushed us in the early years was with Thomson Reuters. In this case, we had to search documents in 6 languages and overlay data from other systems. The result of a ton of hard work was a successful migration of 1.2M documents from EMC Documentum, SharePoint, homegrown repositories, Oracle, and OpenText into Salesforce. Whew!

What many of the new vendors in our space must deal with is the nature of AI-based systems. Whether it is task-based AI or Question/Answer-based AI, every system must be extensively “trained” to accurately extract information from text. The head start Seal has in this industry is not just about the technology, but also the fact that we’ve run over 50 million contracts from our F1000 customers through the system, and each has helped teach the Seal engine to understand contract language. Thanks to this tuning and training over time, our system now extracts over 100 contract elements out of the box. The point is that when a new vendor jumps into this market, they have years of work and millions of contracts to process before they can replicate the training, and the resulting effectiveness, of the Seal system.

To our new competitors, we love that you have seen for yourself the strong need in the market for contract discovery and analytics, but the heavy lifting is still ahead. The work Seal has done to evolve the platform over 7 years and millions of contracts will be difficult to replicate, if you even can. Good luck to you all.