TY - JOUR
T1 - Improving paraphrase generation using supervised neural-based statistical machine translation framework
AU - Razaq, Abdur
AU - Shah, Babar
AU - Khan, Gohar
AU - Alfandi, Omar
AU - Ullah, Abrar
AU - Halim, Zahid
AU - Ur Rahman, Atta
N1 - Funding Information:
The authors wish to thank the GIK Institute for providing research facilities. This work was sponsored by the GIK Institute graduate research fund under the GA-F scheme.
Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
PY - 2023/7/17
Y1 - 2023/7/17
N2 - In phrase generation (PG), a natural-language sentence is transformed into a new one with a different syntactic structure but the same semantic meaning. Current sequence-to-sequence strategies tend to recall words and structures from the training dataset rather than learning the words' semantics. As a result, the generated statements are frequently grammatically accurate but linguistically incorrect. The neural machine translation approach struggles with rare words, domain mismatch, and unfamiliar words, but it captures context well. This work presents a novel model for creating paraphrases that uses neural-based statistical machine translation (NSMT). Our approach creates candidate paraphrases for any source input, calculates the semantic similarity between text segments of any length, and encodes paraphrases in a continuous space. To evaluate the proposed model, the Quora Question Pairs and Microsoft Common Objects in Context benchmark datasets are used. We demonstrate that the proposed technique achieves state-of-the-art performance on both datasets under both automatic and human assessment. Experimental findings across tasks and datasets show that the proposed NSMT-based PG outperforms traditional phrase-based techniques. We also show that the proposed technique can be applied automatically to generate paraphrases for a variety of languages.
AB - In phrase generation (PG), a natural-language sentence is transformed into a new one with a different syntactic structure but the same semantic meaning. Current sequence-to-sequence strategies tend to recall words and structures from the training dataset rather than learning the words' semantics. As a result, the generated statements are frequently grammatically accurate but linguistically incorrect. The neural machine translation approach struggles with rare words, domain mismatch, and unfamiliar words, but it captures context well. This work presents a novel model for creating paraphrases that uses neural-based statistical machine translation (NSMT). Our approach creates candidate paraphrases for any source input, calculates the semantic similarity between text segments of any length, and encodes paraphrases in a continuous space. To evaluate the proposed model, the Quora Question Pairs and Microsoft Common Objects in Context benchmark datasets are used. We demonstrate that the proposed technique achieves state-of-the-art performance on both datasets under both automatic and human assessment. Experimental findings across tasks and datasets show that the proposed NSMT-based PG outperforms traditional phrase-based techniques. We also show that the proposed technique can be applied automatically to generate paraphrases for a variety of languages.
KW - Neural machine translation
KW - Neural-based statistical machine translation
KW - Phrase generation
KW - Statistical machine translation
UR - http://www.scopus.com/inward/record.url?scp=85164967749&partnerID=8YFLogxK
U2 - 10.1007/s00521-023-08830-4
DO - 10.1007/s00521-023-08830-4
M3 - Article
AN - SCOPUS:85164967749
SN - 0941-0643
JO - Neural Computing and Applications
JF - Neural Computing and Applications
ER -