
Extraction of Natural Language Requirements from Breach Reports Using Event Inference

Guo, Hui, Kafalı, Özgur, Singh, Munindar (2018) Extraction of Natural Language Requirements from Breach Reports Using Event Inference. In: 2018 5th International Workshop on Artificial Intelligence for Requirements Engineering (AIRE). pp. 22-28. IEEE ISBN 978-1-5386-8404-7. (doi:10.1109/AIRE.2018.00009) (Access to this publication is currently restricted. You may be able to access a copy if URLs are provided) (KAR id:73790)


We address the problem of extracting useful information contained in security and privacy breach reports. A breach report tells a short story describing how a breach happened and the follow-up remedial actions taken by the responsible parties. Our goal is to suggest security and privacy requirements that practitioners and end users can use to prevent and recover from such breaches, by using natural language processing to predict sentences that may follow a breach description. We prepare a curated dataset of structured short breach stories from unstructured breach reports published by the U.S. Department of Health and Human Services. We propose a prediction model for inferring held-out sentences based on Paragraph Vector, a document embedding method, and Long Short-Term Memory networks. The predicted sentences can suggest natural language requirements. We evaluate our model on the curated dataset as well as the ROCStories corpus, a collection of five-sentence commonsense stories, and find that the presented model performs significantly better than the baseline of using average word vectors.
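
To make the baseline mentioned in the abstract more concrete, here is a minimal sketch of one common way an average-word-vector approach can pick a follow-up sentence: each sentence is represented by the mean of its word vectors, and the candidate closest to the breach description by cosine similarity is chosen. The abstract does not say how the paper's baseline selects sentences, so the cosine-similarity step, the toy three-dimensional vectors, and the names `TOY_VECTORS`, `average_vector`, and `pick_follow_up` are all assumptions made for illustration. This is not the authors' implementation, and it does not cover the Paragraph Vector and LSTM model.

```python
import math

# Hypothetical toy word vectors, invented for illustration only.
# A real system would load pretrained embeddings instead.
TOY_VECTORS = {
    "laptop": [1.0, 0.0, 0.0],
    "stolen": [0.9, 0.1, 0.0],
    "encrypted": [0.8, 0.2, 0.0],
    "staff": [0.6, 0.4, 0.0],
    "picnic": [0.0, 0.0, 1.0],
    "sunny": [0.0, 0.1, 0.9],
}


def average_vector(sentence, vectors):
    """Return the mean vector of the known words in a sentence, or None."""
    tokens = [t for t in sentence.lower().split() if t in vectors]
    if not tokens:
        return None
    dim = len(vectors[tokens[0]])
    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(vectors[token]):
            total[i] += value
    return [x / len(tokens) for x in total]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_follow_up(context, candidates, vectors):
    """Pick the candidate whose average vector is closest to the context's."""
    context_vec = average_vector(context, vectors)
    best, best_score = None, -math.inf
    for candidate in candidates:
        candidate_vec = average_vector(candidate, vectors)
        if context_vec is None or candidate_vec is None:
            score = -1.0  # no known words: treat as the worst possible match
        else:
            score = cosine(context_vec, candidate_vec)
        if score > best_score:
            best, best_score = candidate, score
    return best


description = "the laptop was stolen"
options = ["staff encrypted every laptop", "a sunny picnic"]
choice = pick_follow_up(description, options, TOY_VECTORS)
```

With these toy vectors, `choice` is the remedial-action sentence, because its words lie close to those of the breach description. Averaging discards word order, which is one plausible reason a sequence model could outperform such a baseline.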

Item Type: Conference or workshop item (Proceeding)
DOI/Identification number: 10.1109/AIRE.2018.00009
Uncontrolled keywords: data privacy;natural language processing;recurrent neural nets;security of data;text analysis;security;privacy requirements;curated dataset;structured short breach stories;unstructured breach reports;prediction model;Long Short-Term Memory networks;predicted sentences;natural language requirements;five-sentence commonsense stories;event inference;privacy breach reports;breach description;natural language processing;ROCStories corpus;average word vectors;US Department of Health and Human Services;Task analysis;Predictive models;Security;Natural languages;Privacy;Recurrent neural networks;Semantics;Event inference;Story Cloze Test;Security and privacy requirements;Breach reports;Recurrent Neural Networks;Long Short-Term Memory architecture
Divisions: Faculties > Sciences > School of Computing
Depositing User: Ozgur Kafali
Date Deposited: 08 May 2019 09:08 UTC
Last Modified: 23 Jan 2020 04:16 UTC