Document worth reading: “Unsupervised Pre-training for Natural Language Generation: A Literature Review”

Recently, unsupervised pre-training has been gaining popularity in computational linguistics, thanks to its surprising success in advancing natural language understanding (NLU) and its potential to effectively exploit large-scale unlabelled corpora. However, despite this success in NLU, the power of unsupervised pre-training has only been partially tapped when it comes to natural language generation (NLG). The major obstacle stems from an idiosyncratic property of NLG: texts are usually generated conditioned on some context, which can vary with the target application. As a result, it is intractable to design a universal architecture for pre-training as in NLU scenarios. Moreover, retaining the knowledge learned during pre-training while training on the target task is also a non-trivial problem. This review summarizes recent efforts to enhance NLG systems with unsupervised pre-training, with a special focus on methods for integrating pre-trained models into downstream tasks. These methods are classified into architecture-based and strategy-based approaches, according to how they handle the above obstacle. Discussions are also provided to give further insight into the relationship between these two lines of work, some informative empirical phenomena, and some promising directions for future work.
Unsupervised Pre-training for Natural Language Generation: A Literature Review
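
To make the two families concrete, here is a minimal sketch (not taken from the paper or any specific reviewed method) of how the distinction might look in PyTorch: the architecture-based side reuses a pre-trained encoder inside a sequence-to-sequence model, while the strategy-based side adapts the fine-tuning procedure (here, by freezing the pre-trained weights) so that pre-trained knowledge is retained. All module names, shapes, checkpoint paths, and hyper-parameters below are illustrative assumptions.

```python
import torch
import torch.nn as nn

HIDDEN, VOCAB = 256, 10000

# Architecture-based idea: plug a pre-trained encoder into a fresh seq2seq model.
pretrained_encoder = nn.GRU(HIDDEN, HIDDEN, batch_first=True)  # stand-in for a pre-trained LM encoder
# pretrained_encoder.load_state_dict(torch.load("pretrained_encoder.pt"))  # hypothetical checkpoint

decoder = nn.GRU(HIDDEN, HIDDEN, batch_first=True)  # newly initialised, task-specific decoder
projection = nn.Linear(HIDDEN, VOCAB)               # maps decoder states to vocabulary logits

def generate_logits(src_embeds: torch.Tensor, tgt_embeds: torch.Tensor) -> torch.Tensor:
    """Encode the source context with the pre-trained encoder, then decode conditioned on it."""
    _, h_n = pretrained_encoder(src_embeds)   # h_n: (1, batch, HIDDEN) summary of the context
    dec_out, _ = decoder(tgt_embeds, h_n)     # initialise the decoder with the encoder state
    return projection(dec_out)                # per-token vocabulary logits

# Strategy-based idea: freeze the pre-trained weights at first so their knowledge
# is not overwritten; only the new, task-specific modules are updated.
for p in pretrained_encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    list(decoder.parameters()) + list(projection.parameters()), lr=1e-4
)

# Example forward pass with random embeddings (batch of 2, source length 7, target length 5).
logits = generate_logits(torch.randn(2, 7, HIDDEN), torch.randn(2, 5, HIDDEN))
```

The review surveys far richer variants of both ideas; this sketch only illustrates the basic trade-off it organises them around: changing the model's structure to host pre-trained components versus changing the training strategy to preserve what pre-training has learned.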