R. Takaichi, Y. Higo, S. Matsumoto, S. Kusumoto, T. Kurabayashi, H. Kirinuki, and H. Tanno, "Are NLP Metrics Suitable for Evaluating Generated Code?," In Proceedings of the 23rd International Conference on Product-Focused Software Process Improvement (PROFES2022), pp. 531-537, November 2022. | |
ID | 768 |
Category | International Conference |
Tags | automated metric, code generation, deep learning |
Title (title) |
Are NLP Metrics Suitable for Evaluating Generated Code? |
Title in English |
|
Author (author) |
Riku Takaichi, Yoshiki Higo, Shinsuke Matsumoto, Shinji Kusumoto, Toshiyuki Kurabayashi, Hiroyuki Kirinuki, Haruto Tanno
Author in English |
Riku Takaichi, Yoshiki Higo, Shinsuke Matsumoto, Shinji Kusumoto, Toshiyuki Kurabayashi, Hiroyuki Kirinuki, Haruto Tanno
Editor (editor) |
|
Editor in English |
|
Key (key) |
Riku Takaichi, Yoshiki Higo, Shinsuke Matsumoto, Shinji Kusumoto, Toshiyuki Kurabayashi, Hiroyuki Kirinuki, Haruto Tanno
Title of Book or Proceedings (booktitle) |
Proceedings of the 23rd International Conference on Product-Focused Software Process Improvement (PROFES2022) |
Title of Book or Proceedings in English |
|
Volume (volume) |
|
Number (number) |
|
Pages (pages) |
531-537 |
Organization (organization) |
|
Publisher (publisher) |
|
Publisher in English |
|
Address (address) |
|
Month (month) |
11 |
Year (year) |
2022 |
Acceptance rate |
|
URL |
|
Note (note) |
|
Annote (annote) |
|
Abstract |
Code generation is a technique that generates program source code without human intervention. There has been much research on automated methods for writing code, such as code generation. However, many techniques are still in their infancy and often generate syntactically incorrect code. Consequently, automated metrics from natural language processing (NLP) are occasionally employed to evaluate existing code generation techniques. At present, it is unclear which NLP metrics are more suitable than others for evaluating generated code. In this study, we clarify which NLP metrics are applicable to syntactically incorrect code and are suitable for evaluating techniques that automatically generate code. Our results show that METEOR is the best of the automated metrics compared in this study.
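To illustrate the kind of NLP metric the abstract refers to, the sketch below implements a simplified sentence-level BLEU (clipped n-gram precision with a brevity penalty, no smoothing) applied to token sequences of candidate and reference code. This is only a minimal, self-contained illustration of the metric family, not the paper's evaluation pipeline; the tokenization and the `simple_bleu` helper are assumptions for the example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of clipped
    n-gram precisions (n = 1..max_n) times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # no smoothing: any zero precision zeroes the score
    bp = 1.0 if len(candidate) > len(reference) else \
        math.exp(1 - len(reference) / max(len(candidate), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

# Whitespace-tokenized reference solution and generated candidate.
reference = "def add ( a , b ) : return a + b".split()
candidate = "def add ( a , b ) : return a + b".split()
print(simple_bleu(candidate, reference))  # identical sequences score 1.0
```

Because such metrics operate on token overlap rather than program semantics, they still produce a score for syntactically incorrect code, which is the property the study examines.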
Electronic file | r-takaic_202211_profes.pdf (application/pdf) |
BiBTeX entry |
@inproceedings{id768,
  title     = {Are {NLP} Metrics Suitable for Evaluating Generated Code?},
  author    = {Riku Takaichi and Yoshiki Higo and Shinsuke Matsumoto and Shinji Kusumoto and Toshiyuki Kurabayashi and Hiroyuki Kirinuki and Haruto Tanno},
  booktitle = {Proceedings of the 23rd International Conference on Product-Focused Software Process Improvement (PROFES2022)},
  pages     = {531--537},
  month     = {11},
  year      = {2022},
}