Unraveling the Inner Workings of Massive Language Models: Architecture, Training, and Linguistic Capacities


C. V. Suresh Babu, C. S. Akkash Anniyappa, Dharma Sastha B.
Copyright: © 2024 | Pages: 41
DOI: 10.4018/979-8-3693-3860-5.ch008

Abstract

This study explores the evolution of language models, emphasizing the shift from traditional statistical methods to advanced neural networks, particularly the transformer architecture. It aims to understand the impact of these advancements on natural language processing (NLP). The study examines the core concepts of language models, including neural networks, attention, and self-attention mechanisms, and evaluates their performance on various NLP tasks. The findings demonstrate significant improvements in language modeling, especially in dialogue generation and translation. Despite these advancements, the study highlights the need to address ethical issues such as bias, fairness, privacy, and security for responsible AI deployment.
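The abstract names self-attention as one of the core mechanisms behind the transformer architecture it examines. As a point of reference, the sketch below shows single-head scaled dot-product self-attention in plain NumPy; the function name, dimensions, and single-head setup are illustrative assumptions rather than details taken from the chapter.

```python
# A minimal sketch of scaled dot-product self-attention, the core operation
# underlying the transformer architecture discussed in the chapter.
# All sizes and the single-head setup are illustrative assumptions.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a sequence x of shape (seq_len, d_model)."""
    q = x @ w_q                                       # queries (seq_len, d_k)
    k = x @ w_k                                       # keys    (seq_len, d_k)
    v = x @ w_v                                       # values  (seq_len, d_v)
    scores = q @ k.T / np.sqrt(k.shape[-1])           # scaled pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over key positions
    return weights @ v                                # weighted sum of values

# Toy usage: 4 tokens, model width 8, head width 4 (arbitrary example sizes).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 4)
```

In this formulation, each token position attends to every other position in the sequence, which is the property the abstract credits with the improvements in dialogue generation and translation.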
