An Approach of Renewable Energy Based on Machine Learning: OCR-Based Handwritten Character Datasets

Sumanta Kuila, Angana Chakraborty, Subhankar Joardar, Manasi Jana, Shradhanwita Mukherjee

Source Title: Machine Learning and Computer Vision for Renewable Energy

DOI: 10.4018/979-8-3693-2355-7.ch006

Abstract

Renewable energy comes from natural resources that are replenished faster than they are consumed. Compared with burning fossil fuels, generating renewable energy produces far fewer emissions. Shifting away from fossil fuels, which currently account for the majority of emissions, and toward renewable energy is key to addressing the climate crisis. Renewable energy has a global impact, and recent computer vision based on machine learning creates business opportunities across all countries and regions. Countries and regions therefore need a way to exchange information across different languages. This study presents an analytical survey aimed at supporting that communication. To this end, it examines different languages and their datasets on which OCR-based character recognition can be performed.

1. Introduction

Energy derived from resources that replenish naturally within a human timescale is referred to as low-carbon, green, or renewable energy. Sunlight, wind, river flow, and geothermal heat are examples of renewable resources. Renewable energy sources are frequently deployed in tandem with greater electrification, since electricity is clean at the point of use and can move heat or objects efficiently. Renewable energy currently accounts for more than 20% of the energy supply in many countries worldwide, with some producing more than half of their electricity this way. A few nations use renewable energy to produce all of their electricity. National markets for renewable energy are expected to grow significantly in the 2020s and beyond. Deploying renewable energy and energy efficiency technology brings major benefits for the economy, climate change mitigation, and energy security. The growing popularity of renewable energy sources is driven by concerns about climate change as well as the ongoing decline in the price of certain renewable energy technologies, such as solar panels and wind turbines. Information of this kind should be communicated worldwide in the leading global languages. This study compares different languages and their existing handwritten character databases, which can support a common communication environment across languages.
Making handwritten text machine-readable is known as handwritten character recognition (HCR). Handwriting styles vary greatly among writers, which poses a significant challenge to HCR systems. The goal of an HCR system is a character representation that a computer can readily work with.
This character representation makes it possible to extract characters from handwritten documents and to digitize and translate the handwritten text into machine-readable form. In Devanagari script, the upper part of each character features a horizontal line called the Shirorekha, which means headline. English characters do not have this feature, so English can be distinguished from such scripts by this characteristic (Nawaz et al., 2018). In sequential handwriting written from left to right, the Shirorekha of one character links to the Shirorekha of the preceding or following letter of the same word. In an online character recognition system, characters are recognized while they are still being written. In an off-line character recognition system, handwritten documents are created, scanned, saved on a computer, and subsequently identified through character analysis; here, handwriting recognition detects the characters in a photograph. Character recognition is the process of identifying, detecting, and segmenting characters (letters of an alphabet) in an image. Computers are able to read handwritten characters, but humans remain better at it. Because handwritten characters vary widely and are not always well formed, machines find this task difficult. Handwritten character recognition addresses this difficulty by using an image of the characters to identify which characters are present. The goal of character recognition research is to make computers more proficient readers so that they can interact with text in a way similar to humans. These are the key issues for machine learning in handwritten character recognition (HCR).
Handwritten character recognition has often been used to test new machine learning algorithms with different training and test datasets (Dalal & Triggs, 2005; Newell & Griffin, 2011). The methodology aims to recognize each handwritten character accurately on a computer from an image input. The convolutional neural network (CNN) is a leading model for recognizing images of the English alphabet, which make up a sizable portion of the dataset used to train the model. This machine learning project covers handwritten characters of the English alphabet, from A to Z. CNNs are among the most popular artificial neural networks for image recognition and processing, and they were developed primarily to analyze pixel input. Reading text from photographs remains a difficult, unsolved computer vision challenge, and many real-world applications require relevant data to be read from images. A crucial task in handwritten character recognition (HCR) is finding and identifying the characters in an incoming digital image and converting them to an equivalent machine-editable format. Both pattern recognition and image processing benefit significantly from it. Interpreting such data presents significant hurdles, ranging from bank checks and language recognition to the conversion of handwritten documents into structured text formats. The handwritten character recognition system uses a neural-network-based soft computing technique, an area of extensive research with numerous theories and implemented algorithms. Feature extraction for character recognition is accomplished with a novel method called diagonal-based feature extraction. Preprocessing begins once the scanned image of the text document has been obtained. The paths for reading the input images and storing the output are set, and all variables and counters are initialized. Unwanted outliers and noise are removed.
The skew of the scanned image is corrected, and the image is normalized. The RGB image is then converted to grayscale and, from that, to a binary image. Using projection profiles, the pre-processed binary image is segmented further to extract lines, words, and characters. The image is first segmented with a horizontal projection profile in order to locate every text line within a bounding box. This works because text lines have a higher density of black pixels than the spaces between neighboring lines. After the lines have been identified, a vertical projection profile is used to locate words within bounding boxes. A character's geometric structure occupies a number of black pixels, each represented by a corresponding value, and the character's geometric features are extracted from them. Research into newer strategies that could improve recognition accuracy is still ongoing in the off-line handwriting recognition field, because applications such as postal address recognition, bank processing, document reading, and mail sorting need off-line handwriting recognition systems. In this research, handwritten datasets of different languages are discussed and their features are studied. Popular languages such as English, Hindi, Bengali, and Arabic have existing datasets that are used for research, academic, and commercial purposes. This study aims to deliver a comparative study of these languages and of the properties, features, merits, and demerits of their datasets.
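The binarization and projection-profile segmentation steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation. The function names, the fixed threshold of 128, the BT.601 luminance weights, and the `max_gap` rule for grouping character columns into words are all assumptions introduced for the example.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to luminance values (assumed BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def binarize(gray, threshold=128):
    """Map dark pixels (ink) to 1 and light pixels (paper) to 0.

    The fixed threshold is an assumption; an adaptive method could be swapped in.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def find_runs(profile, min_count=1):
    """Return (start, end) pairs, end exclusive, where profile >= min_count."""
    runs, start = [], None
    for i, count in enumerate(profile):
        if count >= min_count and start is None:
            start = i                      # a run of ink begins
        elif count < min_count and start is not None:
            runs.append((start, i))        # the run ends at a blank position
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Horizontal projection profile: count ink pixels per row, then find text lines."""
    horizontal = [sum(row) for row in binary]
    return find_runs(horizontal)


def segment_columns(binary, top, bottom):
    """Vertical projection profile of one text line: count ink pixels per column."""
    width = len(binary[0]) if binary else 0
    vertical = [sum(binary[y][x] for y in range(top, bottom)) for x in range(width)]
    return find_runs(vertical)


def merge_runs(runs, max_gap):
    """Group column runs separated by at most max_gap blank columns into words."""
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Tiny hand-made binary page: two text lines, the first holding two "words".
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
]
lines = segment_lines(page)
top, bottom = lines[0]
chars = segment_columns(page, top, bottom)
words = merge_runs(chars, max_gap=1)
```

Each `(start, end)` pair gives one side of a bounding box: row ranges for text lines, column ranges for characters or words. In this sketch the only difference between a character boundary and a word boundary is the width of the blank gap. That is a simplification, and cursive or touching characters would need a more careful rule.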
