In this month’s Tomorrow’s Trends Today, we take a look at Legal Data Lakehouses, an emerging tool that can provide insight and value to corporations, law firms, and LSPs.

The Evolution of Data Management

It has become trite to talk about the amount of data we all have to manage, analyze, and understand. The four “V”s of data – volume, velocity, variety, and veracity – are all adding complexities that even the most creative approaches to data management are failing to address.

The widespread adoption of Lotus 1-2-3 in the 1980s brought a surge of productivity in the finance department and an appreciation of what information even modest analytical tools can provide. Banks, investment firms, insurance companies, and the finance departments of larger corporations used the available computing power to look deeper and faster into their financials. The ability to glean insights increased the demand for data and, consequently, the need to manage it. In the early 1990s, an architecture was designed to allow for the flow and management of data from various departments and operations to the decision centers of the organization. The “data warehouse” was born as a repository for structured, filtered, and categorized data that serves a particular purpose. It is no coincidence that when business intelligence first entered the conversation in the late 1990s, data was all about the numbers.

More recently, “data lakes” have emerged to capture all the data that reaches an organization from every source. A data lake holds large amounts of raw data, even when the purpose and use of that data are not readily apparent. The data in a data lake may be structured or unstructured. It may be images, video, or social media threads. The theory is to put everything in a central repository so that the company has it available when needed.

As you might imagine, there is a world of difference, both good and bad, between the specialized world of data warehouses and the dog’s breakfast of data lakes. Data warehouses are expensive, specialized, and less adaptable, but offer a greater ability to analyze the data to meet a particular business need. Data lakes are far less expensive, more comprehensive, and more flexible, but business insights are difficult to extract because of the amount and breadth of the data.

Data lakehouses seek to bridge the capabilities chasm between data warehouses and data lakes. Since this is a new approach, there are differing definitions of what role a data lakehouse can play. At Venio, we do not see data warehouses or data lakes going away. Instead, we see the data lakehouse emerging as a platform that 1) is driven by a business purpose, 2) incorporates the analytical abilities of a data warehouse, and 3) applies them to any and all data available in a data lake.

The Legal Data Lakehouse

For lawyers and legal professionals, applying a data lakehouse as a purpose-driven platform to gain insight across the spectrum of legal responsibilities elevates and extends our role. With the insight available from a legal data lakehouse, general counsel and outside counsel can participate in leadership discussions of complex business problems. A best-in-class corporate counsel should contribute as much to strategy and to driving the business as peer executives such as the chief financial officer.

The good news is that corporate counsel has ready access to a platform that can serve as a legal data lakehouse: its eDiscovery software. In fact, because eDiscovery platforms operate on the edge of the data lake rather than seeking to replace the data lake or warehouse, they are a near-perfect example of the “lakehouse” metaphor. Modern eDiscovery platforms can ingest and analyze the full breadth of data within a data lake and provide advanced analytics with a tool set similar to that of a data warehouse. While the majority of eDiscovery platform users focus on litigation, many are using custom workflows to expand the use cases to include investigations, M&A due diligence, IP analysis, HR management, and specialized contract review.

So, how can you start to use your eDiscovery platform as a legal data lakehouse? First, not all platforms are alike. We see the following as necessary attributes for a legal data lakehouse.

  1. Fully On-Prem Environment. With a data lake, it does not make sense from a cost or process standpoint to export data into an external cloud environment. While there are many sound reasons for external cloud eDiscovery, an on-prem platform allows for speed, flexibility, and cost savings.
  2. Unified Platform. Similarly, look for an eDiscovery platform where all the functions – processing, search, analytics, reporting – are included in the same code base. This provides the greatest speed and minimizes the chances that data is exported – even temporarily – to external environments.
  3. Ease of Use. The eDiscovery industry is blessed with many outstanding PMs and analysts. While they can shift to a more business-oriented focus, widespread adoption will require the ability to spin up teams on the platform quickly. Look for a clean UI, a range of templates, and customizable workflows.

In summary, a legal data lakehouse represents the next step in the evolution of legal data management and analysis. Further, it enables corporate counsel to deliver greater strategic value to the organization and the CEO. Finally, many organizations are already using a platform that can be repurposed to broaden its positive impact on the company at little or no additional cost.