Data Transcription for India's Supreme Court Documents Using Deep Learning Algorithms

Vaissnave V., P. Deepalakshmi
Copyright: © 2020 | Pages: 21
DOI: 10.4018/IJEGR.2020100102

Abstract

The Indian legal system is one of the largest judicial systems in the world and handles a huge number of legal cases, a number that is increasing rapidly. The computerized documentation of Indian law is highly voluminous and complex in form. This article proposes a model that uses deep learning techniques to split the judgment text into the issue, facts, arguments, reasoning, and decision. To evaluate the proposed model, the authors conducted experiments, which showed that the combined convolutional neural network and long short-term memory transcription technique achieves better accuracy and superior transcription performance. Comparison results indicate that the proposed algorithm gives the highest classification accuracy, 95.6%. Splitting the judgment text into the issue, facts, arguments, reasoning, and decision helps to find specific portions of the judgment within a second, making the analysis of a case more effective, efficient, and faster.

1. Introduction

Text classification is a key component of many applications in the legal analytics domain, such as document search, information extraction, and information retrieval. The judiciary of India is one of the most efficient court bodies, in which magistrates are free to reach and resolve decisions through reasoned judgment in accordance with existing law. In general, a legal judgment text contains the case law heading, information about the lawyers, arguments, facts, the decision, the date, and closely connected information. Over the past few years, computers have become a familiar part of the law court. India has a mixture of legal domains, such as civil law, common law, and religious law, as stated by Do P. K., et al. (2017). The hierarchical structure of courts in India is shown in Figure 1.

Figure 1. Hierarchy of courts

Lawyers must read these judgments to gather points for their arguments. In some cases, a judgment may run to more than 500 pages, so lawyers must spend considerable time reading all the required judgments. Usually, lawyers need information such as issues, arguments, reasoning, statutes (acts and rules), cases cited (previous cases referred to), and decision details (commitment of the case). One of the major problems is extracting the relevant information from unstructured documents, because the judgment text may be in an organized or disorganized form.

Kano, Y. et al. (2018) addressed legal document classification, translation, summarization, document retrieval, and a legal chatbot. Text transcription and document transcription serve the same purpose: they help to identify the type of case and to group documents by category. Kannapala, A. et al. (2019) summarize the judgment to convey what the case refers to. Elnaggar, A. et al. (2018), Ma, Y. et al. (2018), and Verma, A. et al. (2020) described document retrieval performed on a Japanese court dataset. John, A.K. et al. (2017) designed a legal bot that lets lawyers ask questions and receive related documents. The legal bot applies Convolutional Neural Network (CNN) and Support Vector Machine (SVM) learning algorithms using keywords. Deep learning has been widely used in the legal analytics domain, mainly in three fields: legal document classification, information retrieval, and information extraction. In our proposed work, we focus on judgment text transcription from unstructured judgment text by adopting deep learning algorithms.

Law professionals typically segment legal text into six different categories, including issue, reasoning, arguments, facts, and decision. The automation tool we propose in this article will help to extract relevant information from each judgment and hence reduce manual effort and cost. It is specially designed for Indian lawyers and law students. For this study, we consider 10,000 judgment texts taken from the Indian Supreme Court database. We created a labeled dataset of these 10,000 judgment texts and used 9,000 for training and the remaining 1,000 for testing. Manually labeled documents are used to train the model. The frequency of each word occurrence in a category serves as an input to the algorithm. For example, given 1,000 texts from the issue category, the machine tokenizes the text and checks the frequency of each word. The linguistic flexibility of the English language makes the data preprocessing easier. Our main contribution is an Indian Supreme Court dataset of legal judgment texts manually labeled by a team of legal experts. We propose a neural network-based model to process the Indian legal data and compare it with various baseline models. Our automatic categorization system will help law professionals to refer to previous case details for their future arguments. Categorizing judgment text is a very complex task.

The rest of this paper is organized as follows. Section 2 presents a literature review that examines the various approaches to text classification and word embedding techniques. Section 3 describes our proposed work, Section 4 explains the experimental procedure, and Section 5 presents the experimental environment and results. Section 6 explains the findings. Finally, Section 7 concludes and states some future work.
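The word-frequency input and the 9,000/1,000 train/test split described in this introduction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the preview does not specify a tokenizer or a splitting procedure, so the regular-expression tokenizer, the seeded shuffle, the function names, and the toy sentences below are all assumptions made for illustration.

```python
import random
import re
from collections import Counter, defaultdict

# Hypothetical toy samples (text, category); the real dataset is not reproduced here.
SAMPLES = [
    ("Whether the appeal is maintainable under the Act", "issue"),
    ("Whether the High Court erred in dismissing the petition", "issue"),
    ("The appellant purchased the property in 1995", "facts"),
    ("The respondent filed a suit for possession", "facts"),
    ("Counsel for the appellant submitted that the notice was invalid", "arguments"),
    ("Counsel for the respondent argued that the suit was time barred", "arguments"),
    ("We find that the notice complied with the statute", "reasoning"),
    ("In our view the High Court applied the correct test", "reasoning"),
    ("The appeal is dismissed", "decision"),
    ("The appeal is allowed with costs", "decision"),
]


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def category_word_frequencies(samples):
    """Count how often each word occurs within each category."""
    freqs = defaultdict(Counter)
    for text, label in samples:
        freqs[label].update(tokenize(text))
    return freqs


def train_test_split(samples, test_fraction=0.1, seed=0):
    """Shuffle with a fixed seed and hold out test_fraction of the samples for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

With a test fraction of 0.1, a corpus of 10,000 labeled texts would yield the 9,000/1,000 split mentioned above; the per-category counters are the kind of word-frequency feature that could then be passed to a classifier.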
