Rapid strides have been made in the syntactic analysis of unstructured text (part-of-speech tagging, dependency parsing), as well as in tasks such as concept and entity extraction and named entity recognition. However, relation extraction from unstructured text remains a challenge. Users are often expected to handcraft relation extraction rules for their domain, especially for data in the non-consumer space (e.g., industrial domains, cybersecurity).
The goal of this project is to learn relation extraction rules with the help of user feedback and interaction, in the form of positive examples and interactive chat-based dialog. A possible approach is to apply NLP and deep learning techniques to a combination of syntactic and semantic patterns in a set of user-annotated sentences and to convert those patterns into a generic extraction rule. We hope this project will help accelerate the digitization of domain knowledge by developing algorithms that improve relation extraction from unstructured data, thereby speeding up the knowledge capture process.
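To make the idea of turning annotated sentences into a generic rule more concrete, here is a minimal sketch in plain Python. It is not the project's method: it uses no deep learning, the part-of-speech tags and entity types are supplied by hand, and every name in it (`Token`, `Rule`, `generalize`, `extract`, the `EQUIPMENT` type, the example sentences) is a hypothetical introduced for illustration. It assumes all positive examples have the same number of tokens between the two entities and share the same entity types.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# All names below are hypothetical; tags and entity types are hand-assigned.

@dataclass(frozen=True)
class Token:
    text: str
    pos: str                    # coarse part-of-speech tag
    ent: Optional[str] = None   # entity type, if the token is an entity mention

@dataclass(frozen=True)
class Rule:
    subj_type: str
    obj_type: str
    pattern: Tuple[Tuple[str, str], ...]  # slots: ("word", w), ("pos", tag) or ("any", "*")

Example = Tuple[List[Token], int, int]    # (tokens, subject index, object index)

def generalize(examples: List[Example]) -> Rule:
    """Merge the token spans between subject and object into one pattern."""
    spans = [tokens[s + 1:o] for tokens, s, o in examples]
    if len({len(span) for span in spans}) != 1:
        raise ValueError("this sketch only handles spans of equal length")
    pattern = []
    for column in zip(*spans):
        words = {t.text.lower() for t in column}
        tags = {t.pos for t in column}
        if len(words) == 1:      # same word in every example: keep the word
            pattern.append(("word", words.pop()))
        elif len(tags) == 1:     # words differ but the tag agrees: keep the tag
            pattern.append(("pos", tags.pop()))
        else:                    # nothing in common: wildcard slot
            pattern.append(("any", "*"))
    tokens, s, o = examples[0]   # assumption: entity types agree across examples
    return Rule(tokens[s].ent, tokens[o].ent, tuple(pattern))

def slot_matches(slot: Tuple[str, str], token: Token) -> bool:
    kind, value = slot
    if kind == "word":
        return token.text.lower() == value
    if kind == "pos":
        return token.pos == value
    return True

def extract(rule: Rule, tokens: List[Token]) -> List[Tuple[str, str]]:
    """Return (subject, object) pairs in a tagged sentence that fit the rule."""
    n = len(rule.pattern)
    pairs = []
    for i, tok in enumerate(tokens):
        j = i + n + 1            # position where the object entity must sit
        if tok.ent != rule.subj_type or j >= len(tokens):
            continue
        if tokens[j].ent != rule.obj_type:
            continue
        if all(slot_matches(s, t) for s, t in zip(rule.pattern, tokens[i + 1:j])):
            pairs.append((tok.text, tokens[j].text))
    return pairs

def sent(*items: Tuple) -> List[Token]:
    return [Token(*item) for item in items]

# Two user-annotated positive examples (invented for illustration).
ex1 = sent(("Pump_A", "PROPN", "EQUIPMENT"), ("is", "AUX"), ("connected", "VERB"),
           ("to", "ADP"), ("Valve_B", "PROPN", "EQUIPMENT"))
ex2 = sent(("Motor_1", "PROPN", "EQUIPMENT"), ("is", "AUX"), ("attached", "VERB"),
           ("to", "ADP"), ("Shaft_2", "PROPN", "EQUIPMENT"))
rule = generalize([(ex1, 0, 4), (ex2, 0, 4)])

# An unseen sentence that the generalized rule should cover.
new = sent(("Sensor_X", "PROPN", "EQUIPMENT"), ("is", "AUX"), ("wired", "VERB"),
           ("to", "ADP"), ("Panel_Y", "PROPN", "EQUIPMENT"))
found = extract(rule, new)
```

Because the two examples agree on "is" and "to" but differ on the verb, the middle slot falls back to the part-of-speech tag, which lets the rule cover a verb it has never seen. A learned model would replace this column-by-column comparison with something more tolerant of varying span lengths and word order.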
Collaborators from GE: Dr. Varish Mulwad, Dr. Kareem Aggour
Students: Raka Dalal
This project is supported in part by GE Research.
- Agniva Banerjee, Raka Dalal, Sudip Mittal, and Karuna Pande Joshi, “Generating Digital Twin models using Knowledge Graphs for Industrial Production Lines”, Workshop on Industrial Knowledge Graphs, co-located with the 9th International ACM Web Science Conference 2017, June 2017.