Extractive Text Summarization Methods in the Spanish Language

Irvin Raul Lopez Contreras, Alejandra Mendoza Carreón, Jorge Rodas-Osollo, Martiza Concepción Varela
DOI: 10.4018/978-1-7998-4730-4.ch018

Abstract

The amount of information in the world grows rapidly every day, which can make it hard to find what is needed. Text summarization addresses this problem: it reduces the time people spend searching for information on the web and across large numbers of digital documents. This chapter compares three algorithms, all of them extractive text summarization algorithms. They were implemented with popular libraries that influence the performance of this kind of algorithm. The methods had to be configured and modified to work for the Spanish language instead of their original one. The authors evaluate the quality and performance of the algorithms using metrics found in the literature.

Introduction

Information grows quickly every day, so it is important to have tools that help people find quality information fast. Such information can be used to solve a problem or to understand how something works, among other things. It is often found in web pages, articles, and books, whose extensive text can bury what is important.

Some algorithms summarize text using established summarization techniques, and many implementations rely on Python libraries. The two basic strategies in the literature are extractive, which selects sentences from the source text, and abstractive, which generates new sentences. This chapter reviews some algorithms that focus on extractive techniques. Several areas can take advantage of these tools, such as journals, magazines, newspapers, and social networks.

The main objective of this work is to analyze, using metrics found in the literature, how text summarization algorithms perform on Spanish-language texts from the computer science field. The implemented algorithms were previously used in studies on English-language texts. For them to work for Spanish, the obstacles specific to this language must be addressed satisfactorily. This chapter reports the results of three extractive text summarization algorithms applied to Spanish texts.

The proposed study consists of the following main steps:

  1. Find some algorithms of text summarization that work on the English language (or another language).

  2. Configure the algorithms to run in the Spanish language.

  3. Evaluate the performance of the algorithms.
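
To make the second step concrete, the following is a minimal sketch of an extractive summarizer adapted to Spanish. It is not one of the three algorithms compared in the chapter: it is a simple word-frequency scorer chosen for illustration. The stopword list is a short hand-written sample, and the names `SPANISH_STOPWORDS`, `split_sentences`, `tokenize`, and `summarize` are hypothetical. A real configuration would likely use a fuller stopword list and a Spanish-aware tokenizer or stemmer from an NLP library.

```python
import re
from collections import Counter

# Illustrative sample only (an assumption); real systems use a much fuller list.
SPANISH_STOPWORDS = {
    "a", "al", "como", "con", "de", "del", "el", "en", "es", "la", "las",
    "lo", "los", "más", "no", "o", "para", "por", "que", "se", "su", "un",
    "una", "y",
}


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase, keep word characters (including á, é, ñ, ü), drop stopwords."""
    words = re.findall(r"\w+", sentence.lower())
    return [w for w in words if w not in SPANISH_STOPWORDS]


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    texto = ("El resumen automático reduce el texto. "
             "El resumen extractivo selecciona oraciones del texto. "
             "Hoy llueve mucho.")
    print(summarize(texto, n_sentences=1))
```

The language-specific parts are the stopword list and the tokenizer: an English stopword list would leave words such as "el" and "del" in the frequency counts and distort the sentence scores, which is one kind of obstacle that adapting an algorithm to Spanish has to handle.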

Currently, the literature offers several algorithms that allow users to summarize texts in the English language, but unfortunately there are few for the Spanish language. For this reason, we consider that the present work could help to explore the area of text summarization in Spanish.

The Spanish language presents grammatical features that text summarization algorithms have to take into account and that can affect their performance. This work compares the results obtained from the algorithms for this language, contrasting the operation of one algorithm with another in order to identify which approach works best for Spanish.

The importance of this work lies in identifying the considerations that must be taken into account for each algorithm to perform the task correctly, so that their performance can subsequently be compared. For the evaluation, parameters found in the literature were used. The output of each algorithm must make sense and remain faithful to the content of the text.
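
This preview does not name the evaluation metrics, so the sketch below is only an illustration of the general idea of scoring a summary against a human-written reference. It computes unigram-overlap precision, recall, and F1, in the spirit of ROUGE-1. Treating this as the measure used by the authors would be an assumption, and the function name `unigram_overlap` is hypothetical.

```python
import re
from collections import Counter


def unigram_overlap(candidate, reference):
    """Return (precision, recall, f1) of word overlap between two texts.

    Illustrative only: no stemming or stopword removal is applied.
    """
    cand = Counter(re.findall(r"\w+", candidate.lower()))
    ref = Counter(re.findall(r"\w+", reference.lower()))
    # Counter intersection keeps, per word, the smaller of the two counts.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    generado = "el resumen reduce el texto"
    referencia = "el resumen extractivo reduce el texto largo"
    print(unigram_overlap(generado, referencia))
```

A score of this kind only measures word overlap; whether the summary makes sense and keeps the direction of the source text still has to be judged separately.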

This research could contribute to the development of algorithms for extracting information from Spanish-language texts. It should also support a better choice among the algorithms used in the research, so that better results are obtained for the text to be summarized.

Background

This section presents the background of text summarization, as well as related studies that have applied these algorithms to Spanish or to other languages besides English.
