ELTE Eötvös Loránd University
ANNALES Universitatis Scientiarum Budapestinensis de Rolando Eötvös Nominatae
Sectio Computatorica

Volume 57 (2024)

https://doi.org/10.71352/ac.57.143

Graph-based duplicated code detection with RefactorErl

István Bozó, Zsófia Erdei and Melinda Tóth

Abstract. Code duplicates arise for various reasons, such as reusing code by copying existing fragments (copy-and-paste programming). Given the large amount of duplicated code in large software systems and the cost of maintaining it, detecting code clones is crucial.
In this paper, we present a graph-based algorithm that uses the semantic program graph generated by RefactorErl, a source code comprehension and refactoring tool for the Erlang programming language, to find different types of code clones in source code. The presented algorithm efficiently detected not only textually identical code fragments (Type I) but also copied and slightly modified code fragments (Type II, Type III): fragments with changed, added or removed expressions, as well as variations in identifiers, literals, types, whitespace characters, layout and comments.
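
To make the three clone types concrete, the following Python sketch classifies pairs of small Erlang-like fragments. It is not the paper's algorithm: the paper works on RefactorErl's semantic program graph, whereas this illustration substitutes a simple token comparison. The tokenizer, the keyword list, the function names `tokens` and `classify`, the sample fragments, and the 0.6 similarity threshold are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, simplified tokenizer for Erlang-like text (an assumption for
# illustration; RefactorErl analyses a semantic program graph instead).
TOKEN_SPEC = [
    ("comment", r"%[^\n]*"),
    ("string", r'"(?:\\.|[^"\\])*"'),
    ("number", r"\d+(?:\.\d+)?"),
    ("name", r"[A-Za-z_][A-Za-z0-9_@]*"),
    ("space", r"\s+"),
    ("op", r"->|=:=|=/=|==|/=|=<|>=|\+\+|--|."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{k}>{p})" for k, p in TOKEN_SPEC))

# A few Erlang keywords that are kept as-is when names are normalized.
KEYWORDS = {"case", "of", "end", "if", "when", "fun", "receive", "after",
            "try", "catch"}


def tokens(src, normalize=False):
    """Tokenize src, dropping whitespace and comments.

    With normalize=True, identifiers and literals are replaced by
    placeholders so that renamed copies yield the same token sequence.
    """
    out = []
    for m in TOKEN_RE.finditer(src):
        kind, text = m.lastgroup, m.group()
        if kind in ("comment", "space"):
            continue
        if normalize and kind in ("name", "number", "string"):
            if text not in KEYWORDS:
                text = kind.upper()
        out.append(text)
    return out


def classify(a, b, threshold=0.6):
    """Return "Type I", "Type II", "Type III", or None for a fragment pair."""
    if tokens(a) == tokens(b):
        return "Type I"    # differs only in whitespace, layout, comments
    na, nb = tokens(a, True), tokens(b, True)
    if na == nb:
        return "Type II"   # differs additionally in identifiers or literals
    if SequenceMatcher(None, na, nb).ratio() >= threshold:
        return "Type III"  # similar, but with changed/added/removed parts
    return None


# Sample fragments (invented for this illustration).
ORIGINAL = "area(W, H) ->\n    W * H.  % rectangle\n"
COPY_RELAID = "area(W,H) -> W * H."
COPY_RENAMED = "size(X, Y) ->\n    X * Y."
COPY_EXTENDED = 'size(X, Y) ->\n    io:format("~p~n", [X]),\n    X * Y.'

for other in (COPY_RELAID, COPY_RENAMED, COPY_EXTENDED):
    print(classify(ORIGINAL, other))
```

With these samples, the re-laid-out copy is reported as Type I, the renamed copy as Type II, and the copy with an added expression as Type III. The threshold is arbitrary, and very short fragments easily exceed it, which is one reason a purely token-based comparison like this one is only a rough stand-in for a graph-based analysis.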
