David Lillis: Combining Machine Learning and Logical Reasoning to Improve Requirements Traceability Recovery

Combining Machine Learning and Logical Reasoning to Improve Requirements Traceability Recovery

Tong Li, Shiheng Wang, David Lillis and Zhen Yang

Applied Sciences, 10(20):7253, Jan. 2020.

Abstract

Maintaining traceability links of software systems is a crucial task for software management and development. Unfortunately, dealing with traceability links are typically taken as afterthought due to time pressure. Some studies attempt to use information retrieval-based methods to automate this task, but they only concentrate on calculating the textual similarity between various software artifacts and do not take into account the properties of such artifacts. In this paper, we propose a novel traceability link recovery approach, which comprehensively measures the similarity between use cases and source code by exploring their particular properties. To this end, we leverage and combine machine learning and logical reasoning techniques. On the one hand, our method extracts features by considering the semantics of the use cases and source code, and uses a classification algorithm to train the classifier. On the other hand, we utilize the relationships between artifacts and define a series of rules to recover traceability links. In particular, we not only leverage source code\’s structural information, but also take into account the interrelationships between use cases. We have conducted a series of experiments on multiple datasets to evaluate our approach against existing approaches, the results of which show that our approach is substantially better than other methods.