
Automated Legal Document Analytics (ALDA)

The use of digitized legal documents has grown exponentially in recent years. The majority of services on the Internet have associated legal documents such as Terms of Service, Privacy Policies and Service Level Agreements. A large corpus of court cases, judgments and compliance regulations is now digitally available for e-discovery. Moreover, businesses maintain large data sets of the legal contracts they have signed with their employees, customers and contractors. For many of these contracts, companies must also adhere to a variety of compliance and regulatory policies, which are themselves increasingly available in digital form. Managing and monitoring an ever-increasing dataset of legal contracts, regulations and compliance policies is still a largely manual, labour-intensive job and can be a bottleneck in the smooth functioning of the enterprise.

Our research aims to build a Legal Question and Answer (LQnA) system based on large-scale analytics of legal documents, using techniques from deep learning, machine learning, natural language processing and text mining. We are working to transform legal databases from textual databases into graph-based datasets using Semantic Web technologies. Our long-term goal is to develop a system that, for any given action or question, can highlight all the statutes, laws and case law that might be applicable to it and offer preliminary guidance to counsel. As a shorter-term goal, we are investigating whether we can automatically extract elements from the compliance and regulatory legal documents that govern Information Technology (IT) outsourcing and cloud computing, and automatically monitor for compliance.
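To illustrate what moving from a textual database to a graph-based dataset can mean in practice, here is a minimal sketch in plain Python. It is not the project's implementation: the `ex:` identifiers, the clause contents and the helper names `match` and `clauses_governing` are all hypothetical, chosen only to show extracted clauses stored as subject-predicate-object triples and queried by pattern.

```python
from collections import namedtuple

# A graph fact in the Semantic Web style: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical facts that an extraction step might produce from a cloud SLA.
# All identifiers and values here are invented for illustration.
triples = [
    Triple("ex:ProviderSLA", "ex:hasClause", "ex:Clause1"),
    Triple("ex:Clause1", "ex:governs", "ex:Availability"),
    Triple("ex:Clause1", "ex:minimumValue", "99.9"),
    Triple("ex:ProviderSLA", "ex:hasClause", "ex:Clause2"),
    Triple("ex:Clause2", "ex:governs", "ex:DataRetention"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def clauses_governing(triples, topic):
    """Return the identifiers of clauses that govern the given topic."""
    return [t.subject for t in match(triples, predicate="ex:governs", obj=topic)]


if __name__ == "__main__":
    print(clauses_governing(triples, "ex:Availability"))
```

Once clauses are stored this way, a question such as "which clauses govern availability?" becomes a pattern match over the graph rather than a text search. A real system would more likely use an RDF store and a query language such as SPARQL than an in-memory list.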


This project is supported in part by NSF/DoD.

Publications and Affiliated Faculty/Students