1. Introduction
Text classification is a key component of many applications in the legal analytics domain, such as document search, information extraction, and information retrieval. India's judiciary is regarded as one of the most efficient court systems, in which magistrates are free to decide and resolve cases on reasoned grounds in accordance with existing law. In general, a legal judgment text contains the case law heading, information about the lawyers, arguments, facts, the decision, the date, and closely connected information. Over the past few years, computers have become a familiar part of the law court. India has a mixture of legal systems, including civil law, common law, and religious law, as stated by Do, P. K., et al. (2017). The hierarchical structure of courts in India is shown in Figure 1.
Lawyers must read these judgments to find points that support their arguments. In some cases, a judgment may run to more than 500 pages, so lawyers spend a great deal of time reading all the required judgments. Usually, lawyers need information such as issues, arguments, reasoning, statutes (acts and rules), cases cited (previous cases referred to), and decision details (the outcome of the case). One of the major challenges is extracting the relevant information from unstructured documents, because the judgment text may be in an organized or disorganized form.
Kano, Y. et al. (2018) worked on legal document classification, translation, summarization, document retrieval, and a legal chatbot. Text transcription and document transcription serve the same purpose: they help to identify the type of case and to group documents by category. Kannapala, A. et al. (2019) summarize judgments so that readers can understand what a case is about. Elnaggar, A. et al. (2018), Ma, Y. et al. (2018), and Verma, A. et al. (2020) described document retrieval on a Japanese court dataset. John, A. K. et al. (2017) designed a legal bot through which lawyers can ask questions and retrieve related documents. The legal bot applies Convolutional Neural Network (CNN) and Support Vector Machine (SVM) learning algorithms using keywords. Deep learning has been widely used in the legal analytics domain, mainly in three fields: legal document classification, information retrieval, and information extraction. In our proposed work, we focus on judgment text transcription from unstructured judgment text by adopting deep learning algorithms.
Law professionals typically segment legal text into six categories, including issue, reason, argument, fact, and decision. The automation tool we propose in this article helps to extract relevant information from each judgment and hence reduces manual effort and cost. It is designed specifically for Indian lawyers and law students. For this study, we consider 10,000 judgment texts taken from the Indian Supreme Court database. We created a labeled dataset from these 10,000 judgment texts and used 9,000 for training and the remaining 1,000 for testing. Manually labeled documents are used to train the model. The frequency with which each word occurs in a category serves as an input to the algorithm. For example, given 1,000 texts from the issue category, the system tokenizes the text and counts the frequency of each word. The linguistic flexibility of the English language makes data preprocessing easier.

Our main contribution is the Indian Supreme Court dataset, a collection of legal judgment texts manually labeled by a team of legal experts. We propose a neural network-based model to process the Indian legal data and compare it with various baseline models. Our automatic categorization system will help law professionals refer to previous case details for their future arguments. Categorizing judgment text is a very complex task.

The rest of this paper is organized as follows. Section 2 presents a literature review of approaches to text classification and word embedding techniques. Section 3 describes our proposed work, Section 4 explains the experimental procedure, and Section 5 presents the experimental environment and results. Section 6 discusses the findings. Finally, Section 7 concludes and outlines future work.
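As a rough illustration of the word-frequency input described in this section, the following Python sketch tokenizes labeled texts and counts how often each word occurs per category. The function names, the regular-expression tokenizer, and the toy sentences are assumptions made for illustration only; they are not taken from the authors' implementation or dataset.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def category_word_frequencies(labeled_texts):
    """Count word occurrences separately for each category label.

    labeled_texts is an iterable of (category, text) pairs.
    """
    freqs = defaultdict(Counter)
    for category, text in labeled_texts:
        freqs[category].update(tokenize(text))
    return freqs


# Hypothetical toy examples standing in for labeled judgment segments.
samples = [
    ("issue", "Whether the appeal is maintainable"),
    ("decision", "The appeal is dismissed"),
    ("issue", "Whether the notice was valid"),
]

freqs = category_word_frequencies(samples)
print(freqs["issue"].most_common(3))
```

In this sketch each category ends up with its own word-count table, which could then be turned into feature vectors for a classifier. A real pipeline would likely add steps such as stop-word removal, which are omitted here.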