Semantic Search on Unstructured Data: Explicit Knowledge through Data Recycling

Alex Kohn, François Bry, Alexander Manta

Source Title: International Journal on Semantic Web and Information Systems (IJSWIS) 6(2)

DOI: 10.4018/jswis.2010040102

Abstract

Studies agree that searchers are often dissatisfied with the performance of current enterprise search engines. As a consequence, more scientists worldwide are actively investigating new avenues for searching to improve retrieval performance. This paper contributes YASA (Your Adaptive Search Agent), a fully implemented and thoroughly evaluated ontology-based information retrieval system for the enterprise. A salient particularity of YASA is that large parts of the ontology are automatically filled with facts by recycling and transforming existing data. YASA offers context-based personalization, faceted navigation, and semantic search capabilities. YASA has been deployed and evaluated in the pharmaceutical research department of Roche, Penzberg, and the results show that even semantically simple ontologies suffice to considerably improve search performance.

Introduction

Nowadays most data produced in business is captured electronically and stored in computer systems. Search engines are of key importance in making this “hidden” information visible to employees. Spoiled by the improvements in Web search, experts now expect similar search performance in their intranet environment. However, current state-of-the-art enterprise search engines underperform (Feldman & Sherman, 2004). As a result, search for information becomes a central problem in companies.

A particularity of enterprise search is the lack of scientific publications. In the case of commercial products, the information provided in booklets or white papers gives only a vague picture of the applied algorithms. An aggravating factor is that the effectiveness of these methods in improving information retrieval in enterprise search has barely been investigated empirically. Indeed, published methods are often restricted to synthetic evaluations. Further, scientific publications often describe methods that are optimized for the Web but not for intranet environments. Lastly, papers addressing intranet search often focus on the intranet web, ignoring the fact that file shares, e-mails, databases, applications, etc. are also part of an intranet that needs to be searched.

We conclude that search for information in intranet environments is theoretically and practically disappointing. The question that arises is: why is search for information in the enterprise such a challenge?

Many reasons can be given (Fagin et al., 2003; Hawking, 2004): heterogeneous data sources and formats, complex security permissions, fewer user observations, sparse or missing metadata, growing amounts of data, etc.

The World Wide Web is dominated by the hypertext protocol. This is in contrast to intranets, where only a small portion of the data is in a Web accessible format. This heterogeneity makes data integration a difficult task, as large portions of the intranet are not search engine friendly. Further, ranking of search results is made more difficult due to a different or missing linkage structure (Xue et al., 2003).

The complex security permissions present in companies are a mixed blessing. On the one hand, the information landscape is fragmented into many silos, i.e. any employee can only see a small subset of all data. On the other hand, ranking of search results is eased, as only a subset of all data needs to be sorted by relevance. The degree of fragmentation depends, of course, on the company’s philosophy of information sharing across departments.
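The effect of permissions on ranking can be illustrated with a minimal sketch. It is not taken from YASA: the `Document` class, the group names, the sample documents, and the term-count scoring are all hypothetical, chosen only to show that filtering by access rights happens before relevance sorting, so each user's ranking covers only the documents that user may see.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical document model: an id, free text, and the groups allowed to read it.
    doc_id: str
    text: str
    allowed_groups: frozenset


def visible_documents(docs, user_groups):
    """Keep only documents that share at least one group with the user."""
    return [d for d in docs if d.allowed_groups & user_groups]


def score(doc, query_terms):
    """Toy relevance score: number of query-term occurrences in the text."""
    words = doc.text.lower().split()
    return sum(words.count(t) for t in query_terms)


def search(docs, user_groups, query):
    """Filter by permission first, then rank the remaining subset."""
    terms = query.lower().split()
    candidates = visible_documents(docs, user_groups)
    ranked = sorted(candidates, key=lambda d: score(d, terms), reverse=True)
    return [d.doc_id for d in ranked if score(d, terms) > 0]


# Illustrative data only.
DOCS = [
    Document("assay-report", "kinase assay results for compound screening",
             frozenset({"research"})),
    Document("budget-2010", "budget plan for assay equipment",
             frozenset({"finance"})),
    Document("lab-handbook", "general lab safety handbook",
             frozenset({"research", "finance"})),
]
```

Under these assumptions, the same query returns different result lists for a user in the research group and a user in the finance group, which is the fragmentation into silos described above.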

Observing a user’s search behavior enables search engines to detect the context of a user, which ultimately leads to personalization services (Micarelli et al., 2007). Such services are already part of the leading Web search engines. Offering personalization services in the enterprise, however, is a difficult task due to the lack of feedback data: few users face a large amount of data.

Having barely any explicit metadata at hand and dealing mostly with unstructured free-text documents represents another challenge. It is therefore difficult to offer semantic search capabilities, a problem well known from the Internet.

Considering the challenges mentioned above, the problem is how to improve search for information in the enterprise. Could integration, i.e. federated search, make the information landscape accessible? Could the ranking of search results be improved by applying faceted navigation or personalized search? Could high-quality metadata be obtained by applying automatic information extraction? Could domain knowledge (e.g., organizational charts, project databases, etc.) be used to set the searcher as well as the results in context?
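As a generic illustration of faceted navigation, and not a description of how YASA implements it, the following sketch counts the values of a metadata field across a result list and narrows the list once the searcher selects a value. The field names (`department`, `filetype`) and the sample results are assumptions made for the example.

```python
from collections import Counter

# Illustrative search results with hypothetical metadata fields.
RESULTS = [
    {"title": "Assay protocol", "department": "Research", "filetype": "pdf"},
    {"title": "Project plan", "department": "Research", "filetype": "doc"},
    {"title": "Travel policy", "department": "HR", "filetype": "pdf"},
]


def facet_counts(results, facet):
    """Count how many results carry each value of the given facet."""
    return Counter(r[facet] for r in results if facet in r)


def refine(results, facet, value):
    """Keep only the results whose facet matches the selected value."""
    return [r for r in results if r.get(facet) == value]
```

The counts would be shown next to the result list as navigation options; choosing one calls `refine`, and the counts can then be recomputed on the narrowed list. The quality of such navigation depends directly on the metadata available, which is why the question of obtaining high-quality metadata matters.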

Technically, we contribute by compiling and developing several approaches for facing the listed challenges, namely role-based adaptation, guided navigation, and incorporation of domain knowledge. The approaches are implemented in YASA (Your Adaptive Search Agent). YASA is deployed in the pharmaceutical research department of Roche in Penzberg.
