Automated Legal Document Analytics (ALDA)

The use of digitized legal documents has grown exponentially in recent years. Most services on the Internet have associated legal documents such as Terms of Service, Privacy Policies, and Service Level Agreements (SLAs). A large corpus of court cases, judgments, and compliance regulations is now digitally available for e-discovery. Moreover, businesses maintain large data sets of legal contracts that they have signed with their employees, customers, and contractors. Companies must also adhere to a variety of compliance and regulatory policies for many of these contracts, and those policies are increasingly available in digital form. Managing and monitoring an ever-growing body of legal contracts, regulations, and compliance requirements is still a largely manual, labor-intensive job and can become a bottleneck in the smooth functioning of an enterprise.

Our research aims to build a system for large-scale analytics of legal documents, using techniques from deep learning, machine learning, natural language processing, and text mining. We are working to transform legal databases from textual databases into graph-based datasets using Semantic Web technologies. Our long-term goal is to develop a system that, for any given action or question, can highlight all the statutes, laws, and case law that might apply to it and offer preliminary guidance to counsel. As a shorter-term goal, we are investigating whether we can automatically extract elements from the compliance and regulatory legal documents that govern Information Technology (IT) outsourcing and cloud computing, and then automatically monitor for compliance.

Project Members

Faculty from UMBC: Dr. Karuna P Joshi

Students: Lavanya Elluri, Ankur Nagar, Divya Ganapathy, Ketki Joshi, Srishty Saha


This project is supported in part by a DoD supplement to NSF IUCRC Center CARTA.


Publications

2019

  1. Ketki Joshi, Karuna P. Joshi, and S. Mittal, “A Semantic Approach for Automating Knowledge in Policies of Cyber Insurance Services”, In Proceedings of IEEE International Conference on Web Services (IEEE ICWS) 2019, July 2019.
  2. Karuna P. Joshi and A. Banerjee, “Automating Privacy Compliance Using Policy Integrated Blockchain”, Cryptography 2019, 3(1), 7; MDPI.

2018

  1. Lavanya Elluri and K. P. Joshi, “A Knowledge Representation of Cloud Data controls for EU GDPR Compliance”, In Proceedings of 11th IEEE International Conference on Cloud Computing (CLOUD), July 2018.
  2. A. Nagar and K. P. Joshi, “A Semantically Rich Knowledge Representation of PCI DSS for Cloud Services”, In Proceedings of 6th International IBM Cloud Academy Conference ICACON 2018, Japan, May 2018.

2017

  1. S. Saha, K. P. Joshi, R. Frank, M. Aebig, and J. Lin, “Automated Knowledge Extraction from the Federal Acquisition Regulations System (FARS)”, In Proceedings of 2nd International Workshop on Enterprise Big Data Semantic and Analytics Modeling at IEEE International Conference on Big Data 2017, December 2017.
  2. S. Saha and K. P. Joshi, “Cognitive Assistance for Automating the Analysis of the Federal Acquisition Regulations System”, In Proceedings of AAAI Fall Symposium 2017, November 2017.
  3. S. Saha, K. P. Joshi, and A. Gupta, “A Deep Learning Approach to Understanding Cloud Service Level Agreements”, In Proceedings of Fifth International IBM Cloud Academy Conference, May 2017.

2016

  1. K. P. Joshi, A. Gupta, S. Mittal, C. Pearce, A. Joshi, and T. Finin, “Semantic Approach to Automating Management of Big Data Privacy Policies”, In Proceedings, IEEE BigData 2016, December 2016.
  2. K. P. Joshi, A. Gupta, S. Mittal, C. Pearce, A. Joshi, and T. Finin, “ALDA: Cognitive Assistant for Legal Document Analytics”, In Proceedings, AAAI Fall Symposium 2016, September 2016.
  3. A. Gupta, S. Mittal, K. P. Joshi, C. Pearce, and A. Joshi, “Streamlining Management of Multiple Cloud Services”, In Proceedings, IEEE International Conference on Cloud Computing, June 2016.
  4. S. Mittal, K. P. Joshi, C. Pearce, and A. Joshi, “Automatic Extraction of Metrics from SLAs for Cloud Service Management”, In Proceedings, 2016 IEEE International Conference on Cloud Engineering (IC2E 2016), April 2016.

2015

  1. S. Mittal, K. P. Joshi, C. Pearce, and A. Joshi, “Parallelizing Natural Language Techniques for Knowledge Extraction from Cloud Service Level Agreements”, In Proceedings of IEEE International Conference on Big Data, October 2015.
  2. K. P. Joshi and C. Pearce, “Automating Cloud Service Level Agreements using Semantic Technologies”, In Proceedings of CLaw Workshop, IEEE International Conference on Cloud Engineering (IC2E), March 2015.

2014

  1. K. P. Joshi, Y. Yesha, and T. Finin, “Automating Cloud Services Lifecycle through Semantic technologies”, IEEE Transactions on Services Computing, vol. 7, no. 1, pp. 109-122, Jan.-March 2014; doi: 10.1109/TSC.2012.41.