Artificial Neural Networks for Text-to-SQL Task: State of the Art
Discover our conference paper, Artificial Neural Networks for Text-to-SQL Task: State of the Art, from the International Conference on Smart Information & Communication Technologies, part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 684).
Thanks to the Novelis Research Team for their knowledge and experience.
Databases store large amounts of data from all over the world, but to access this data, users must understand query languages such as SQL. To make this easier and open databases to a wider audience, recent research has focused on systems that understand natural language questions and automatically convert them into SQL queries. This article surveys the state of the art in the text-to-SQL task, presenting the main models and existing solutions for handling natural language. We also describe the experimental settings for each method, their limitations, and a comparison of the best available methods.
About the study
“Text-to-SQL task is one of the most important subtask of semantic parsing in natural language processing (NLP). It maps natural language sentences to corresponding SQL queries. In recent years, some state-of-the-art methods with Seq2Seq encoder-decoder architectures (Ilya Sutskever, Oriol Vinyals, Quoc V. Le 2014) are able to obtain more than 80% exact matching accuracy on some complex text-to-SQL benchmarks such as Atis (Price, 1990; Dahl and al., 1994), GeoQuery (Zelle and Mooney, 1996), Restaurants (Tang and Mooney, 2000; Popescu and al., 2003), Scholar (Iyer and al., 2017), Academic (Li and Jagadish, 2014), Yelp (Yaghmazadeh and al., 2017) and WikiSQL (Zhong and al., 2017). These models seem to have already solved most problems in this area. However, as (Finegan-Dollak et al., 2018) show, because of the problematic task definition in the traditional datasets, most of these models just learn to match semantic parsing results, rather than truly learn to understand the meanings of inputs and generalize to new programs and databases, which led to low precisions on more generic dataset as the case of Spider (YU, Tao, ZHANG, Rui, YANG, Kai, and al. 2018).”
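To make the "exact matching accuracy" mentioned in the excerpt more concrete, here is a minimal Python sketch of how such a score could be computed. It is an illustration under stated assumptions, not the evaluation code of the paper or of any benchmark: the function names, the normalization rules, and the sample questions and queries are all hypothetical, and real benchmarks may define a match differently.

```python
import re


def normalize_sql(query: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase.

    Assumption: this light normalization is only illustrative. Note that
    lowercasing also changes string literals, which a real evaluator
    would likely want to preserve.
    """
    collapsed = re.sub(r"\s+", " ", query.strip())
    return collapsed.rstrip(";").strip().lower()


def exact_match_accuracy(predictions, references):
    """Fraction of predicted queries identical to the reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize_sql(pred) == normalize_sql(ref)
        for pred, ref in zip(predictions, references)
    )
    return matches / len(references)


# Hypothetical question / reference / prediction triples, for illustration only.
examples = [
    ("How many cities are there?",
     "SELECT COUNT(*) FROM city;",
     "select count(*) from city"),
    ("List the names of all rivers.",
     "SELECT name FROM river;",
     "SELECT river_name FROM river;"),
]

references = [ref for _, ref, _ in examples]
predictions = [pred for _, _, pred in examples]
print(exact_match_accuracy(predictions, references))
```

A string-level comparison like this one counts a prediction as wrong even when it returns the same rows as the reference through a differently written query, so it only measures how closely a model reproduces the reference SQL text.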