Mathematical Information Retrieval Trends and Techniques

Pankaj Dadure, Partha Pakray, Sivaji Bandyopadhyay

Source Title: Deep Natural Language Processing and AI Applications for Industry 5.0

DOI: 10.4018/978-1-7998-7728-8.ch005

Abstract

Mathematical formulas are widely used to express ideas and fundamental principles of science, technology, engineering, and mathematics. Rapidly growing research in science and engineering has produced a huge number of scientific documents that contain both textual and mathematical terms. In a scientific document, the meaning of a mathematical formula is conveyed through its context and its symbolic structure, which follows strong domain-specific conventions. In contrast to text retrieval systems, mathematical information retrieval systems rely on specialized indexing and matching approaches suited to retrieving formulae and scientific terms. This chapter discusses recent advances in formula-based search engines, various formula representation styles and indexing techniques, and the benefits of formula-based search engines in future applications such as plagiarism detection and math recommendation systems.

Introduction

Mathematics plays a significant role across science, technology, engineering, and mathematics (STEM) (Greiner-Petter A. A., 2020). A scientific text rarely appears without at least one mathematical expression or symbol. In this digital world, with growing numbers of teaching and learning materials being produced, an explosion of knowledge was inevitable. In the last decade, new techniques, concepts, and tools were created to store, maintain, and retrieve this vast array of scientific records. To ensure that users can easily access information according to their needs, the information must be organized and represented in the most efficient way.

Information retrieval (IR) is a subfield of natural language processing (NLP) that aims to retrieve the needed information from a collection of documents. A general IR system takes the user's query as input, computes its similarity to the indexed documents, and returns a ranked list of relevant documents (Buttcher, 2016). This is the common methodology behind today's retrieval systems such as Google search, PubMed, or Apple's Spotlight. Nowadays, most of the data available on the web is sequential text data. User demands also vary: some users search for image or video data based on text, some search for text based on an image, some look for effect-related documents given a cause, and some are interested in linguistically structured documents. In some cases, users are unsure what exactly they are looking for. To meet these needs, several preprocessing operations have been investigated, depending on the domain and the user's requirements (Virmani, 2019). Almost all retrieval systems are specially written programs that, as long as researchers can explain their methodology, can be built for particular types of data. Moreover, information retrieval has been explored since the early 1950s, and as a result many IR models have come into the limelight, mainly built on the Boolean model, the vector space model, and the probabilistic model.

Textual information retrieval has been investigated extensively for many years, but mathematical information retrieval (MIR) (Hu, 2013) requires distinct attention, since traditional text retrieval systems cannot retrieve mathematical expressions. MIR systems are formula-based search engines that help users search for knowledge in mathematical documents. Their primary goal is to retrieve mathematical formulae that are relevant to a queried formula. In this task, the term 'relevant' has two meanings: the first considers structural similarity to the query formula, and the second considers the conceptual meaning of the formula. Under either meaning, a system finds not only formulas that exactly match the query formula but also those that share similarities with it. For example, a retrieved result might contain only part of the query equation or might contain additional terms.
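The query, similarity, and ranking loop described above can be illustrated with a minimal vector space sketch. It is not the chapter's method: the `tokenize`, `cosine`, and `rank` helpers and the tiny corpus are hypothetical, and formulae are assumed to be given as LaTeX strings reduced to a bag of tokens, which ignores most structure.

```python
import math
import re
from collections import Counter


def tokenize(formula: str) -> list[str]:
    # Hypothetical tokenizer: LaTeX commands (\sin), single letters,
    # integers, or any other single non-space symbol.
    return re.findall(r"\\[A-Za-z]+|[A-Za-z]|\d+|[^\sA-Za-z\d]", formula)


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query: str, corpus: list[str]) -> list[tuple[float, str]]:
    # Score every indexed formula against the query, best match first.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(f))), f) for f in corpus]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


corpus = ["a^2 + b^2 = c^2", "a^2 + b^2", "E = mc^2", r"\sin x"]
ranked = rank("a^2 + b^2 = c^2", corpus)
for score, formula in ranked:
    print(f"{score:.3f}  {formula}")
```

In this toy corpus the exact match ranks first and the partial formula `a^2 + b^2` ranks second, mirroring the idea that a relevant result may contain only part of the query. A bag of tokens cannot capture the second, conceptual notion of relevance.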

Formulas found in web pages are mainly encoded in LaTeX and/or MathML format. Traditional text-based search engines ignore the structure of these encodings by treating a formula as ordinary text. This makes it hard for a search engine to retrieve the relevant documents, because the search index holds only limited structural information about the formula. Query formulation is also challenging for users who are unfamiliar with LaTeX or MathML. In addition, a recent study confirms that presenting raw math encodings in search results can adversely affect the accuracy of relevance assessment for search hits.
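A small sketch can make the loss of structure concrete. It assumes well-formed LaTeX braces; `text_terms` stands in for a generic text analyzer and `structure_terms` for a hypothetical structure-aware alternative that tags each symbol with its brace-group position. Neither is taken from a real search engine.

```python
import re


def text_terms(source: str) -> list[str]:
    # What a generic text analyzer might keep: lowercase alphanumeric runs,
    # with backslashes and braces discarded and order ignored.
    return sorted(re.findall(r"[a-z0-9]+", source.lower()))


def structure_terms(latex: str) -> list[tuple[tuple[int, ...], str]]:
    # Emit (path, symbol) pairs, where path records which brace group
    # (0 = first argument, 1 = second, ...) encloses the symbol at each depth.
    terms = []
    path = []       # indices of the brace groups currently open
    counters = [0]  # number of groups opened so far at each depth
    for tok in re.findall(r"\\[A-Za-z]+|[{}]|[^\s{}\\]", latex):
        if tok == "{":
            path.append(counters[-1])
            counters[-1] += 1
            counters.append(0)
        elif tok == "}":
            path.pop()
            counters.pop()
        else:
            terms.append((tuple(path), tok))
    return terms


a_over_b = r"\frac{a}{b}"
b_over_a = r"\frac{b}{a}"

print(text_terms(a_over_b) == text_terms(b_over_a))            # same bag of words
print(structure_terms(a_over_b) == structure_terms(b_over_a))  # different structure
print(structure_terms(a_over_b))
```

Treated as plain text, the two fractions yield identical terms, so a text index cannot tell them apart. Keeping even a coarse positional path separates numerator from denominator.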

Figure 1. Interaction of mathematical formulae and their context
