Lavanya Elluri is a Ph.D. candidate at the University of Maryland, Baltimore County (UMBC), advised by Dr. Karuna Joshi. Her research interests lie in Cloud Computing, Data Analytics, and Data Protection. She is working on the ALDA project, which looks at automated knowledge extraction from short text documents that reference data protection regulations such as GDPR and PCI-DSS.
Currently, she works as a Database Engineer at Rei Systems, Maryland.
- Lavanya Elluri, Ankur Nagar, and Karuna P. Joshi, “An Integrated Knowledge Graph to Automate GDPR and PCI DSS Compliance”, In Proceedings of IEEE International Conference on Big Data 2018, December 2018.
- Lavanya Elluri and Karuna P. Joshi, “A Knowledge Representation of Cloud Data controls for EU GDPR Compliance”, In Proceedings of the 11th IEEE International Conference on Cloud Computing (CLOUD), July 2018.
Lavanya successfully defended her Ph.D. Proposal in November 2019.
Proposal: A Semantically Rich Framework to Enable Real-time Knowledge Extraction and Classification from Short Length Semi-Structured Documents
Committee: Dr. Karuna P. Joshi (Chair), Dr. Zhiyuan Chen, Dr. Aryya Gangopadhyay, Dr. George Karabatis, Dr. Tim Finin
Abstract: Organizations with authority have power or control over a domain or sphere that they monitor and administer. To ensure the smooth and secure operation of their sphere, authorities formulate policies and rules governing the domain, with which the other organizations and individuals operating in that sphere must comply. Knowledge about an authority’s policies and rules is typically maintained as a large volume of unstructured text in books, laws and regulations, academic and scientific reports, and similar documents. Most of these text documents are not machine-processable, so it is hard to find relevant information in them quickly. Extracting and categorizing knowledge from the text of these numerous authority documents requires significant manual effort and time, and organizations often spend significant resources complying with the authority’s controls. Organizations that adhere to the authority’s policies often refer to short sections of the authority’s documents in the documents they create for internal use or for their clients. However, these short sections in the referring documents do not include the full context of that section in the authority document. Thus, a person relying on the referring document must manually consult the authority’s document to determine the complete context. Because neither document is machine-processable, it is difficult to determine the context of the referring section in real time.
We propose a semantically rich framework to extract and classify the context of a short text in real time, to help users who regularly update their referring documents based on the authority documents. An open challenge that we will address is automated incremental text classification and context identification for short text documents. We will also correlate rules implemented in the referencing document with the rules in the authority’s original policies to determine context similarity. We use techniques from the Semantic Web, Natural Language Processing, and Deep Learning to build this framework. Our objectives include representing the knowledge in cloud compliance and legal texts in order to create and populate a knowledge graph based on data protection regulations.
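To make the idea of context similarity concrete, the following is a minimal sketch that matches a short referring sentence to the closest section of an authority document. It uses plain bag-of-words cosine similarity as a simple stand-in; it is not the proposed framework, which draws on Semantic Web, NLP, and deep learning techniques. The function names, section keys, and sample sentences are all hypothetical placeholders and are not quoted from any regulation.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_match(short_text, authority_sections):
    """Return (section_id, score) for the authority section most similar
    to the short referring text."""
    query = Counter(tokenize(short_text))
    scored = {
        sid: cosine(query, Counter(tokenize(body)))
        for sid, body in authority_sections.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical, paraphrased stand-ins for authority sections (assumption:
# illustrative wording only, not actual regulation text).
AUTHORITY_SECTIONS = {
    "erasure": "the data subject has the right to request erasure of "
               "personal data without undue delay",
    "breach": "the controller shall notify the supervisory authority of a "
              "personal data breach within a fixed time",
    "encryption": "cardholder data must be encrypted during transmission "
                  "over open public networks",
}

if __name__ == "__main__":
    referring_text = "customers may ask us to delete their personal data"
    print(best_match(referring_text, AUTHORITY_SECTIONS))
```

Word overlap alone misses paraphrases (here "delete" and "erasure" share no token), which is one reason a semantically richer representation such as a knowledge graph or learned embeddings is attractive for this problem.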