Text data is a common unstructured data type, and it is readily available and easy to collect owing to the growth of social media, online encyclopaedias, and other text-based online platforms. Several studies in the literature have proposed analysing text with machine learning models for text classification tasks. Many state-of-the-art studies employ complex neural network models with an enormous number of parameters to train, which demands extensive training time and hardware resources. This study applies Latent Semantic Analysis (LSA), a Natural Language Processing technique, to text-based neural network models to investigate whether the complexity of the model (in terms of the number of trained parameters) can be reduced without compromising text classification performance. The main objective of this research is to reduce model complexity by applying LSA to two different aspects of text-based neural network models. The first aspect is the input word vectors, which are reduced to lower dimensions by applying two-dimensional (2D) LSA. The second aspect is the output of the Embedding layer, where the embeddings are converted into low-rank approximations using a three-dimensional (3D) LSA before being passed to the following neural layers; that is, the number of linearly independent columns in the input to those layers is reduced. Both aspects are common among popular text classification models. This study investigated the combined impact of these two aspects on a text-classification neural network.
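The two reductions described above can be illustrated with a truncated singular value decomposition, the operation that underlies LSA. The following is a minimal sketch only, not the study's implementation: the function names, the toy shapes, the random data, and in particular the reading of "3D LSA" as a rank-k approximation applied to each (sequence length, embedding size) slice of a batch are all assumptions made for illustration.

```python
import numpy as np


def low_rank(matrix: np.ndarray, k: int) -> np.ndarray:
    """Best rank-k approximation of a 2D matrix via truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


def lsa_reduce(doc_term: np.ndarray, k: int) -> np.ndarray:
    """2D LSA: represent each document (row) in the top-k latent dimensions."""
    u, s, _ = np.linalg.svd(doc_term, full_matrices=False)
    return u[:, :k] * s[:k]


def low_rank_embeddings(batch: np.ndarray, k: int) -> np.ndarray:
    """Assumed reading of '3D LSA': rank-k approximation of every
    (seq_len, embed_dim) slice in a (batch, seq_len, embed_dim) tensor."""
    return np.stack([low_rank(sample, k) for sample in batch])


rng = np.random.default_rng(0)

# Hypothetical toy input: 6 documents over a 20-term vocabulary.
doc_term = rng.random((6, 20))
reduced = lsa_reduce(doc_term, k=3)           # shape (6, 3)

# Hypothetical embedding-layer output: batch 4, sequence length 10, size 8.
embedded = rng.random((4, 10, 8))
approx = low_rank_embeddings(embedded, k=2)   # same shape, each slice rank <= 2
```

In this sketch the first function shrinks the input width seen by the network, while the second keeps the tensor shape but limits the number of linearly independent columns per sample; how this translates into fewer trained parameters depends on the architecture that follows.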
Published - 4 Nov 2022