J. Yang, "Classification Model for Code Clones Based on Machine Learning," February 2013.
ID 273
Category Thesis
Tags Clone Code Classification Machine Learning Fica
Title (title) Classification Model for Code Clones Based on Machine Learning
Title (English)
Author (author) Jiachen Yang
Author in English (author) Jiachen Yang
Key (key) Jiachen Yang
Month (month) 2
Year (year) 2013
Publication form (howpublished)
URL
Additional information (note)
Annotation (annote)
Abstract (abstract) Code clones have gained great attention in recent research. Several code clone detection methods have been proposed to detect identical or similar code fragments in the source code of software. These code clones are introduced into software systems by various operations during development, such as copy-and-paste or machine generation of source code. Despite their common occurrence in software development, code clones are generally considered harmful, as they make software maintenance more difficult and indicate poor source code quality. When a code fragment is modified, all of its corresponding code clones must be checked to determine whether they need the same modification.

By applying code clone detectors to the source code, users such as programmers can obtain a list of all code clones of a given code fragment, which is useful when modifying the source code. However, the results from code clone detectors may contain many useless code clones, and the judgment of whether each code clone is useful varies from user to user, depending on their purposes. It is therefore difficult to obtain the desired code clones merely by adjusting the parameters of code clone detectors. It is also a painful task to analyze the entire list that the code clone detector generates.

In this research, we propose a classification model that applies a machine learning algorithm to each individual user's judgments of code clones. We evaluated the proposed model through an online survey with 33 participants to test its usability and accuracy.

The results yielded several important observations about the characteristics of code clones that users find interesting. Our classification model achieved more than 70% accuracy on average, and more than 90% accuracy for particular users and source code projects.

Electronic file of the paper fica-thesis.pdf (application/pdf) [publicly accessible]
BibTeX entry
@misc{id273,
         title = {Classification Model for Code Clones Based on Machine Learning},
        author = {Jiachen Yang},
         month = {2},
          year = {2013},
}