Abstract
Source code is rarely written in isolation. It depends significantly on the programmatic context, such as the class that the code would reside in. To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class. This task is challenging because the desired code can vary greatly depending on the functionality the class provides (e.g., a sort function may or may not be available when we are asked to “return the smallest element” in a particular member variable list). We introduce CONCODE, a new large dataset with over 100,000 examples consisting of Java classes from online code repositories, and develop a new encoder-decoder architecture that models the interaction between the method documentation and the class environment. We also present a detailed error analysis suggesting that there is significant room for future work on this task.
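The "return the smallest element" example in the abstract can be made concrete with a minimal sketch. The two classes below, `SortedBag` and `PlainBag`, are hypothetical illustrations and are not taken from the CONCODE dataset; they assume a member variable list of integers and differ only in whether the class environment already offers a `sort` member function. The same documentation string then plausibly maps to two different method bodies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class environment A: a sort helper already exists in the class.
class SortedBag {
    private final List<Integer> elements = new ArrayList<>();

    void add(int value) {
        elements.add(value);
    }

    // Existing member function that generated code could reuse.
    void sort() {
        Collections.sort(elements);
    }

    // Documentation: "return the smallest element"
    // With sort() available, one plausible body sorts and takes the first item.
    // Assumes the list is non-empty.
    int smallest() {
        sort();
        return elements.get(0);
    }
}

// Hypothetical class environment B: no sort helper is available.
class PlainBag {
    private final List<Integer> elements = new ArrayList<>();

    void add(int value) {
        elements.add(value);
    }

    // Documentation: "return the smallest element"
    // Without sort(), one plausible body scans the list for the minimum.
    // Assumes the list is non-empty.
    int smallest() {
        int min = elements.get(0);
        for (int value : elements) {
            if (value < min) {
                min = value;
            }
        }
        return min;
    }
}
```

Both bodies satisfy the same English description, which is why a model for this task would need to condition on the class members as well as on the documentation.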
Original language | English |
---|---|
Title of host publication | Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing |
Publisher | Association for Computational Linguistics |
Pages | 1643-1652 |
Number of pages | 10 |
ISBN (Electronic) | 9781948087841 |
Publication status | Published - 31 Oct 2018 |
Event | 2018 Conference on Empirical Methods in Natural Language Processing - Brussels, Belgium; Duration: 31 Oct 2018 → 4 Nov 2018 |
Conference
Conference | 2018 Conference on Empirical Methods in Natural Language Processing |
---|---|
Abbreviated title | EMNLP 2018 |
Country/Territory | Belgium |
City | Brussels |
Period | 31/10/18 → 4/11/18 |