Srishty Saha

MS, Computer Science, August 2018

Srishty Saha worked on the Automated Legal Document Analytics (ALDA) project, applying techniques from NLP, deep learning, and the Semantic Web. Her research interests include machine learning and text analytics.

She successfully defended her Master’s thesis in May 2018. Srishty is currently working at Microsoft.

Thesis Title: Semantically Rich Framework to Automate Extraction and Representation of Legal Knowledge

Committee: Dr. Karuna P. Joshi (Chair), Dr. Tim Finin, Dr. Milton Halem, Dr. Yelena Yesha

Thesis Abstract: With the explosive growth in cloud-based services, businesses are increasingly maintaining large datasets containing information about their consumers to provide a seamless user experience. To ensure the privacy and security of these datasets, regulatory bodies have specified rules and compliance policies that organizations must adhere to. These regulatory policies are currently available as text documents that are not machine-processable, so continuously monitoring them to ensure data compliance requires extensive manual effort. We have developed a cognitive framework to automatically parse and extract knowledge from legal documents and represent it using an ontology. The legal ontology captures key entities and their relations, the provenance of legal policy, and cross-referenced, semantically similar legal facts and rules. We have applied this framework to the United States government’s Code of Federal Regulations (CFR), which includes facts and rules for individuals and organizations seeking to do business with the US Federal government.