Title
Abstractive Auto Text Summarization using deep learning techniques
Author
Zaki, Amr Mahmoud
Preparation Committee
Researcher / Amr Mahmoud Zaki
Supervisor / Hazem Mahmoud Abbas
Supervisor / Mahmoud Ibrahim Khalil
Examiner / Mohamed Waleed Fakhr
Publication Date
2020.
Number of Pages
122 p.
Language
English
Degree
Master's
Specialization
Electrical and Electronic Engineering
Approval Date
1/1/2020
Place of Approval
Ain Shams University - Faculty of Engineering - Computer Engineering
Contents
Only 14 pages are available for public view (from 141).

Abstract

AMR MAHMOUD ZAKI. Abstractive Auto Text Summarization using deep learning techniques. (Under the direction of Prof. Dr. Hazem M. Abbas and Prof. Dr. Mahmoud I. Khalil).
Text summarization is the task of generating a short summary from a long text. Extractive methods, based on statistical techniques, were the first approaches proposed in the literature for this task. However, these models lack some of the more sophisticated abilities needed for summarization, such as paraphrasing and generalization. Newer methods based on neural approaches, called abstractive methods, were then proposed; they actively try to understand the context of the text in order to generate novel summaries. In this work we study some of these approaches.
Multiple abstractive methods have recently been proposed in the literature; they rely on a basic framework called the encoder-decoder. Several baselines have been proposed for the encoder-decoder, namely seq2seq recurrent models and Transformer models. Our research studies the problems this encoder-decoder framework suffers from. The seq2seq recurrent model was chosen as the baseline in our work, as it has been thoroughly studied in the literature with the aim of minimizing the encoder-decoder problems. This baseline is built from LSTMs in an encoder-decoder architecture with attention. However, it still suffers from several problems, and this work goes through multiple models that try to solve them: beginning with the Pointer-Generator, then a curriculum-learning approach called Scheduled Sampling, and finally recent approaches that combine reinforcement learning with seq2seq.
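The Pointer-Generator named above is commonly formulated (following See et al.'s widely used variant; the exact details of this thesis's implementation are not given in the public excerpt) as a mixture of the decoder's vocabulary distribution and an attention-based copy distribution, which also lets the model emit out-of-vocabulary source words. A minimal NumPy sketch of that mixture:

```python
import numpy as np

def pointer_generator_dist(p_vocab, attention, src_ids, p_gen, extended_size):
    """Mix the generation distribution with a copy distribution.

    p_vocab: decoder softmax over the fixed vocabulary, shape (V,)
    attention: attention weights over source tokens, shape (T,)
    src_ids: vocabulary ids of the source tokens (OOV words get ids >= V)
    p_gen: scalar in [0, 1], probability of generating vs. copying
    extended_size: V plus the number of source-side OOV words
    """
    final = np.zeros(extended_size)
    final[: len(p_vocab)] = p_gen * p_vocab           # generation part
    for t, word_id in enumerate(src_ids):             # copy part: scatter-add
        final[word_id] += (1.0 - p_gen) * attention[t]
    return final

# Toy example: vocabulary of 4 words, a 3-token source sentence,
# one token of which (id 4) is out-of-vocabulary.
p_vocab = np.array([0.1, 0.2, 0.3, 0.4])
attention = np.array([0.5, 0.3, 0.2])
final = pointer_generator_dist(p_vocab, attention, src_ids=[1, 4, 2],
                               p_gen=0.8, extended_size=5)
print(final.sum())  # still a valid distribution: sums to 1.0
```

Note that the OOV source word (id 4) receives probability mass only through the copy path, which is what lets the model reproduce rare names and numbers from the source.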
This research begins by applying the models to the English language, and then introduces them to the Arabic language. This work also introduces a novel method for working with Arabic as an agglutinative language: a pre-processing technique applied to the dataset that increases the relevance of the vocabulary, which effectively increases the efficiency of the text summarization without modifying the models. This method can be applied to any other agglutinative language.
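The public excerpt does not spell out the pre-processing technique itself. The sketch below only illustrates the general idea behind segmenting agglutinative surface forms: stripping common Arabic clitic prefixes so that inflected variants of a word collapse onto a shared stem, shrinking the effective vocabulary. The prefix list and greedy splitting rule here are assumptions for demonstration, not the thesis's actual method.

```python
# Illustrative clitic-prefix segmentation for Arabic (assumed rules, for
# demonstration only): split known prefixes off into separate tokens so
# that different surface forms of one stem share a vocabulary entry.
PREFIXES = ["وال", "بال", "ال"]  # "and the", "with the", "the"

def segment(token):
    """Split a known clitic prefix off a token, keeping each piece."""
    for p in PREFIXES:
        if token.startswith(p) and len(token) > len(p) + 1:
            stem = token[len(p):]
            if p == "ال":
                return ["ال", stem]
            return [p[0], "ال", stem]  # e.g. "وال" -> "و" + "ال"
    return [token]

# Six surface forms built from two stems ("كتاب" book, "بيت" house):
corpus = ["الكتاب", "والكتاب", "بالكتاب", "البيت", "والبيت", "بالبيت"]
segmented = [t for w in corpus for t in segment(w)]
print(len(set(corpus)), "->", len(set(segmented)))  # 6 surface forms -> 5 tokens
```

Even in this toy corpus the vocabulary shrinks, and the effect grows with corpus size, since each stem would otherwise appear in many prefixed variants.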
Believing in the ease of reproducibility of our research, and in enriching the deep-learning open-source community, our models were optimized to run on Google Colab (a free online environment offering powerful GPUs and large RAM), and the data has been hosted on Google Drive to connect seamlessly to Google Colab.
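Connecting Drive-hosted data to a Colab session is typically done with Colab's own helper, a fragment that only runs inside a Colab notebook (the mount point path below is the conventional one, not something specified by the thesis):

```python
# Environment-setup fragment: works only inside a Google Colab notebook.
from google.colab import drive

# Mounts the user's Google Drive at the conventional /content/drive path
# after an interactive authorization prompt.
drive.mount('/content/drive')
```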