Graduate student: | Liu, Tzu-En (劉慈恩) |
---|---|
Thesis title: | Spoken Document Summarization Leveraging Hierarchical Semantic and Acoustic Representations (應用階層式語意暨聲學特徵表示於語音文件摘要之研究) |
Advisor: | Chen, Berlin (陳柏琳) |
Degree: | Master |
Department: | Department of Computer Science and Information Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 |
Language: | Chinese |
Number of pages: | 58 |
Keywords (Chinese): | 語音文件、節錄式摘要、類神經網路、階層式語意表示、注意力機制、聲學特徵、次詞向量、強化學習 |
Keywords (English): | Spoken Documents, Extractive Summarization, Deep Neural Networks, Hierarchical Semantic Representations, Attention Mechanism, Acoustic Features, Subword Embedding, Reinforcement Learning |
DOI URL: | http://doi.org/10.6345/NTNU201900878 |
Thesis type: | Academic thesis |
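With the rapid spread of massive amounts of information, browsing data efficiently has become an important issue. For multimedia documents, speech is one of the main semantically meaningful elements of the content and can convey an entire multimedia document quite completely. In recent years, many studies have investigated the understanding and retrieval of multimedia documents in depth, with excellent results and contributions in areas such as image summarization, audio summarization, and video summarization.
Document summarization can be broadly divided into extractive and abstractive summarization. Extractive summarization selects representative sentences from the document, according to a fixed ratio, to form the summary. Abstractive summarization first seeks to fully understand the underlying meaning of the whole document and then, based on that meaning and using different wording, produces a short description of the document as the summary. Because abstractive summarization is more difficult for automatic speech summarization, most current research follows the extractive approach.
This thesis explores the application of novel extractive summarization methods to spoken document summarization and studies in depth how to improve its performance. To this end, we propose a neural-network-based summarization model that uses a hierarchical architecture and an attention mechanism to gain a deep understanding of the main theme of a document, with reinforcement learning assisting training so that the model selects and ranks representative sentences according to that theme to form the summary. In addition, to keep speech recognition errors from affecting the summaries, we add acoustic features related to the spoken document to model training and use subword embeddings as input. Finally, we conduct a series of experiments and analyses on the Mandarin Chinese broadcast news corpus (MATBN); the experimental results support the hypotheses of this thesis and show a significant improvement in summarization performance.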
With the rapid spread of tremendous amounts of multimedia information, browsing the associated content efficiently has become an important issue. Speech is one of the primary sources of semantics in multimedia content; by listening to it, we can digest the content more completely. In recent years, many studies have investigated the understanding and retrieval of multimedia documents in depth, achieving excellent performance and making substantial contributions on a wide array of tasks, such as image captioning, audio summarization, and video captioning.
Document summarization methods can be broadly divided into two categories: extraction-based and abstraction-based methods. The former select a representative set of sentences from the document to produce a summary according to a predefined summarization ratio while preserving its important information. The latter attempt to understand the whole document and then produce a short version of it based on its main theme. Because abstractive summarization is still far from satisfactory for either text or spoken documents, most current studies focus exclusively on the development of extraction-based summarization methods.
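As a rough illustration of the extraction-based setting described above, the following minimal sketch picks the highest-scoring sentences up to a predefined summarization ratio. The function name `extract_summary`, the precomputed `scores`, and the choice to count the ratio in sentences (rather than words) are assumptions made for this example and are not taken from the thesis.

```python
import math


def extract_summary(sentences, scores, ratio=0.1):
    """Select top-scoring sentences up to a fixed summarization ratio.

    Hypothetical helper: `scores` stands in for whatever salience
    estimate a summarizer produces for each sentence.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    if not sentences:
        return []
    # Sentence budget implied by the ratio; always keep at least one.
    budget = max(1, math.floor(len(sentences) * ratio))
    # Rank sentence indices by score, highest first.
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Restore the original document order for readability.
    chosen = sorted(ranked[:budget])
    return [sentences[i] for i in chosen]


doc = ["s0", "s1", "s2", "s3", "s4"]
salience = [0.1, 0.9, 0.3, 0.8, 0.2]
summary = extract_summary(doc, salience, ratio=0.4)
print(summary)
```

With five sentences and a ratio of 0.4, the budget is two sentences, so the two most salient ones are returned in document order.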
This thesis sets out to explore novel and effective extractive methods for spoken document summarization. To this end, we propose a neural summarization approach that leverages a hierarchical modeling structure with an attention mechanism to understand a document deeply and, in turn, to select representative sentences as its summary. Meanwhile, to alleviate the negative effect of speech recognition errors, we make use of acoustic features and subword-level input representations in the proposed approach. Finally, we conduct a series of experiments on the Mandarin Broadcast News (MATBN) corpus. The experimental results confirm the utility of our approach, which improves on the performance of state-of-the-art methods.
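The idea of a hierarchical structure with attention can be sketched in a few lines: word vectors are pooled into sentence vectors, and sentence vectors are pooled into a document vector, with attention weights deciding how much each item contributes. This is only a toy sketch under stated assumptions. The dot-product attention, the fixed query vectors `word_query` and `sent_query`, and all function names are hypothetical; the thesis model uses learned neural encoders whose details are not given in this abstract.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors, query):
    """Attention-weighted average of `vectors` with respect to `query`."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def encode_document(doc, word_query, sent_query):
    """Two-level pooling: words -> sentence vectors -> document vector."""
    sent_vecs = [attention_pool(words, word_query)[0] for words in doc]
    doc_vec, sent_weights = attention_pool(sent_vecs, sent_query)
    return doc_vec, sent_weights


# Toy document: two sentences of 2-dimensional word vectors.
toy_doc = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
doc_vec, sent_weights = encode_document(toy_doc, [1.0, 0.0], [0.0, 1.0])
print(doc_vec, sent_weights)
```

In a trained model the queries and vectors would be learned parameters, and the sentence-level attention weights could feed a sentence scorer rather than being used directly.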