Dr. Lavanya Elluri

Assistant Professor at Texas A&M University – Central Texas

Ph.D., Information Systems, 2022

Dr. Lavanya Elluri was advised by Dr. Karuna Joshi from 2018 to 2022. Her research interests lie in Cloud Computing, Data Analytics, and Data Protection. She worked on the ALDA project, which studied automated knowledge extraction from short text documents that reference data protection regulations such as GDPR and PCI-DSS.

She is currently an Assistant Professor at Texas A&M University – Central Texas.

Publications


Lavanya successfully graduated in May 2022.


Lavanya successfully defended her thesis, A Semantically Rich Framework to Enable Real-time Knowledge Extraction and Classification from Short Length Semi-Structured Documents, in November 2021.

Committee: Dr. Karuna Pande Joshi (Chair), Dr. Tim Finin, Dr. Aryya Gangopadhyay, Dr. George Karabatis, Dr. Zhiyuan Chen

Abstract: Regulatory bodies have power or control over a domain or sphere that they monitor and administer. To ensure the smooth and secure operation of their sphere, authorities formulate policies and rules governing the domain, with which the other organizations and individuals operating in that sphere must comply. Knowledge about the authority’s policies and rules is typically maintained as a large volume of unstructured text in books, laws, regulations, academic and scientific reports, etc. Most of these text documents are not machine-processable, so it is hard to find relevant information in them quickly. Extracting and categorizing knowledge from the text of these numerous authority documents requires significant manual effort and time, and organizations often spend significant resources complying with the authority controls. Organizations that adhere to the authority policies often refer to short sections of the authority’s documents in the documents they create for internal consumption or for their clients. However, these short sections in the referring documents do not include the full context of that section in the authority document. Thus, a person relying on the referring document must manually consult the authority’s document to determine the complete context. As neither document is machine-processable, it is difficult to determine the context of the referring section in real time.

We propose a semantically rich framework to extract and classify the context of a short text in real time, to help users update their referential documents regularly based on the authority documents. An open challenge that we will address is automated text classification and context identification for short text documents. Additionally, we will populate knowledge graphs with the knowledge extracted from the authority and the referencing documents. We use techniques from Semantic Web, Natural Language Processing, Machine Learning, and Deep Learning to build this framework. Our objectives include representing knowledge in cloud compliance or legal texts to create and populate a knowledge graph based on data protection regulations.


Thesis Proposal, November 2021

Proposal: A Semantically Rich Framework to Enable Real-time Knowledge Extraction and Classification from Short Length Semi-Structured Documents

Committee: Dr. Karuna P. Joshi (Chair), Dr. Zhiyuan Chen, Dr. Aryya Gangopadhyay, Dr. George Karabatis, Dr. Tim Finin

Abstract: Organizations with authority have power or control over a domain or sphere that they monitor and administer. To ensure the smooth and secure operation of their sphere, authorities formulate policies and rules governing the domain, with which the other organizations and individuals operating in that sphere must comply. Knowledge about the authority’s policies and rules is typically maintained as a large volume of unstructured text in books, laws, regulations, academic and scientific reports, etc. Most of these text documents are not machine-processable, so it is hard to find relevant information in them quickly. Extracting and categorizing knowledge from the text of these numerous authority documents requires significant manual effort and time, and organizations often spend significant resources complying with the authority controls. Organizations that adhere to the authority policies often refer to short sections of the authority’s documents in the documents they create for internal consumption or for their clients. However, these short sections in the referring documents do not include the full context of that section in the authority document. Thus, a person relying on the referring document must manually consult the authority’s document to determine the complete context. As neither document is machine-processable, it is difficult to determine the context of the referring section in real time.

We propose a semantically rich framework to extract and classify the context of a short text in real time, to help users who regularly update their referential documents based on the authority documents. An open challenge that we will address is automated incremental text classification and context identification for short text documents. Additionally, we will correlate the rules implemented in the referencing document with the rules in the authority’s original policies to determine context similarity. We use techniques from Semantic Web, Natural Language Processing, and Deep Learning to build this framework. Our objectives include representing knowledge in cloud compliance or legal texts to create and populate a knowledge graph based on data protection regulations.