Tapering Malicious Language for Identifying Fake Web Content

Shyamala Devi N., Sharmila K.
DOI: 10.4018/978-1-6684-6444-1.ch011

Abstract

Recent occurrences, the pandemic, and the global crisis entail the extensive use of web portals to unfurl information. While this has built the cognizance of the common man, the largely unnoticed proliferation of malicious content on the web has escalated copiously. Spurious data and fake information have done more harm than is actually unraveled to the public; however, taking scrupulously meticulous measures to trace their source and delve into mitigating such data has become quite a challenge. This chapter delves into a step-wise analysis of identifying the hoax through systematically programmed algorithms using natural language processing.

Introduction

The proposed methodology for identifying malicious fake web content comprises text mining from the malicious website. The tools and techniques of natural language processing and deep learning are applied in the methodology to detect the malicious web content, and the framework is represented in Figure 1 below.

Figure 1. Framework to classify the malicious fake web content

Text analysis and the process of detecting fake content require an elaborate approach due to the intricacies involved. The initial step is to obtain the data to be analyzed. This dataset holds commingled authentic and spurious content for further processing, and it necessitates a web scraping method for extracting the appropriate data from the gargantuan web content that is available. In the next phase, the scraped text, which is a single large cluster of text, is separated into individual tokens. This is implemented using lexical parsing, which scans the text clusters and transforms them into sequences of tokens. The next step, stop-word removal, discards words that carry little meaning so that the spurious content stands out. Now that the content holds adequate correlation, a certain amount of pre-processing through normalization is applied to scale the data. Normalization is carried out through stemming and lemmatization, which help in obtaining relevant content for Part-of-Speech (POS) tagging. The subsequent step is word vectorization, which constructs an environment ready for effective detection of anomalies. The final step, identifying and extracting the malicious content, is executed through BERT (Bidirectional Encoder Representations from Transformers) in order to capture the associative correlation between words and render augmented precision in detecting the web hoax that is disseminated. The relationships between the words of a text thereby evince the malicious web content through this series of processing steps simulated using Python programming, as sketched below.
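To make the pipeline concrete, the following is a minimal sketch of the preprocessing stages (lexical parsing into tokens, stop-word removal, lemmatization, POS tagging, and word vectorization), assuming the NLTK and scikit-learn libraries are installed; the two example texts are invented purely for illustration and are not from the chapter's dataset.

import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import TfidfVectorizer

# One-time downloads of the NLTK resources used below
nltk.download("punkt")
nltk.download("stopwords")
nltk.download("wordnet")
nltk.download("averaged_perceptron_tagger")

# Illustrative corpus: one spurious-looking text and one authentic-looking text
docs = [
    "Breaking: miracle cure discovered, click here before it is banned!",
    "The health ministry released its quarterly epidemiology report today.",
]

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

cleaned = []
for doc in docs:
    tokens = word_tokenize(doc.lower())                   # lexical parsing into tokens
    tokens = [t for t in tokens if t.isalpha()]           # drop punctuation and digits
    tokens = [t for t in tokens if t not in stop_words]   # stop-word removal
    tokens = [lemmatizer.lemmatize(t) for t in tokens]    # normalization
    print(nltk.pos_tag(tokens))                           # Part-of-Speech tagging
    cleaned.append(" ".join(tokens))

# Word vectorization (TF-IDF is used here as one common choice)
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(cleaned)
print(features.shape)

For the final BERT stage, a hedged sketch using the Hugging Face transformers library follows; the model name bert-base-uncased is a generic pretrained placeholder, and in practice a model fine-tuned on a labelled authentic-versus-fake corpus would be substituted.

from transformers import pipeline

# "bert-base-uncased" is used here only as a placeholder; a model
# fine-tuned for fake-content detection is assumed in practice.
classifier = pipeline("text-classification", model="bert-base-uncased")
print(classifier("Breaking: miracle cure discovered, click here before it is banned!"))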


Platform Used for Identifying the Fake Web Content

Spyder (Python 3.6)

Spyder, the Scientific Python Development Environment, is an open-source integrated development environment (IDE) that is included with Anaconda. It incorporates editing, interactive testing, debugging, and introspection features. After you have installed Anaconda, start Spyder on Windows, macOS, or Linux by running the spyder command. Spyder is also pre-installed in Anaconda Navigator, which is included with Anaconda. On the Navigator Home tab, click the Spyder icon.

Jupyter Notebook

The Jupyter Notebook is an open-source web application that you can use to create and share documents that contain live code, equations, visualizations, and text. Jupyter Notebook is maintained by the people at Project Jupyter.

Jupyter Notebooks are a spin-off from the IPython project, which used to have an IPython Notebook project itself. The name Jupyter comes from the core programming languages that it supports: Julia, Python, and R. Jupyter ships with the IPython kernel, which allows you to write your programs in Python.


Website Scraping

Web scraping is a process of collecting data from a website using an application programming interface. For the extraction process, Python code is written to query a web server and request the data from the web page, extracting the data that is needed; a short scraping sketch follows the installation steps below.

Initial steps to install the Python Beautiful Soup library for scraping websites

Beautiful Soup is a Python library for pulling data out of HTML and XML documents. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work.

  • 1.

    Building the web scraper

    • Installing Beautiful Soup

      • Beautiful Soup is a Python library used for web scraping.

      • The basic method for installing on the Linux platform:

      • $ sudo apt-get install python-bs4

      • For macOS, install pip first and then the library:

      • $ sudo easy_install pip

      • $ pip install beautifulsoup4
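Once Beautiful Soup is installed, a minimal scraping sketch might look as follows; the URL, the use of the requests package, and the choice to collect paragraph text are illustrative assumptions, not the chapter's prescribed code.

import requests
from bs4 import BeautifulSoup

url = "https://example.com/news-article"  # placeholder URL for illustration
response = requests.get(url, timeout=10)  # query the web server for the page
response.raise_for_status()               # stop early on an HTTP error

soup = BeautifulSoup(response.text, "html.parser")

# Extract the visible text of every paragraph for the later NLP stages
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
print("\n".join(paragraphs))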
