The Performance of GPT-3.5 in Summarizing Scientific and News Articles

Sabkat Arshad, Muhammad Yaqoob, Tahir Mehmood

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In the age of information, we are overwhelmed with large amounts of data. The quest to know more in less time has increased the need for efficient text summarization models that condense information into precise summaries without losing essential details. Recently, GPT-3.5 has demonstrated impressive performance in text completion, generation, and question answering. However, its effectiveness in generating concise and coherent summaries of scientific articles and news reports remains under-explored. This work evaluates the performance of GPT-3.5 in summarizing scientific research articles and news data. Scientific articles were collected from the arXiv STEM dataset, whereas news articles were sampled from the CNN/DailyMail dataset. Using the GPT-3.5 OpenAI API, the pre-trained model is prompted to generate summaries of the scientific and news articles. The ROUGE score of each generated summary is then computed against its reference summary to analyse the performance of the model. Our results show that GPT-3.5 performs slightly better in summarizing scientific articles than news articles, with average ROUGE scores of 0.35 and 0.31, respectively. Moreover, in agreement with the literature, we show that ROUGE is not the best measure for evaluating text similarity, as it relies heavily on shared vocabulary rather than semantics.
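The ROUGE comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which ROUGE variant or implementation was used, so the functions below (`tokenize`, `rouge_1`, `lcs_length`, `rouge_l`) are hypothetical, simplified versions of ROUGE-1 and ROUGE-L F1 with whitespace tokenization and no stemming, and the example sentences are invented for illustration. The sketch also shows why a vocabulary-based score can miss semantic similarity: a paraphrase that shares no words with the reference scores zero.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (assumption: no stemming or stopword removal)."""
    return text.lower().split()


def rouge_1(candidate, reference):
    """Simplified ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # multiset intersection clips repeated words
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Simplified ROUGE-L F1 based on the longest common subsequence."""
    cand, ref = tokenize(candidate), tokenize(reference)
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Invented example sentences, not taken from the paper's data.
reference = "the model generates short summaries of long articles"
near_copy = "the model generates short summaries of news articles"
paraphrase = "this system produces brief abstracts for lengthy documents"

print(rouge_1(near_copy, reference))   # 0.875
print(rouge_1(paraphrase, reference))  # 0.0
print(rouge_l(near_copy, reference))   # 0.875
```

The paraphrase conveys roughly the same meaning as the reference but shares no tokens with it, so both scores are zero, which is the limitation the abstract points to.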
Original language: English
Title of host publication: Data Science and Emerging Technologies
Subtitle of host publication: Proceedings of DaSET 2023
Editors: Yap Bee Wah, Dhiya Al-Jumeily OBE, Michael W. Berry
Place of Publication: Singapore
Publisher: Springer Nature Link
Pages: 49-61
Number of pages: 13
ISBN (Electronic): 978-981-97-0293-0
ISBN (Print): 978-981-97-0292-3
DOIs
Publication status: E-pub ahead of print - 27 Apr 2024
Event: The International Conference on Data Science and Emerging Technologies DaSET 2023 - Virtual conference at UNITAR International University, Malaysia
Duration: 4 Dec 2023 to 5 Dec 2023
Conference number: 2
https://icdaset.com/daset2023/

Publication series

Name: Lecture Notes on Data Engineering and Communications Technologies
Publisher: Springer
Volume: 191
ISSN (Print): 2367-4512
ISSN (Electronic): 2367-4520

Conference

Conference: The International Conference on Data Science and Emerging Technologies DaSET 2023
Abbreviated title: DaSET 2023
Country/Territory: Malaysia
Period: 4/12/23 to 5/12/23
Internet address

Keywords

  • ChatGPT
  • Large language model
  • Natural language processing
  • Scientific papers
  • Text summarization
