Chapter Preview
Background
Big Data was originally described by the 3Vs of volume, velocity, and variety (Laney, 2001), but Kaisler, Armour, Espinosa, and Money (2013) have suggested two more: value and veracity.
Table 1.
V | Description |
Data Volume | The amount of data collected and available. It is estimated that, as of 2012, over 2.5 exabytes (2.5 × 10^18 bytes) of data were created every day (Wikipedia, 2013). |
Data Velocity | The rate at which data accumulates or arrives, as well as how quickly it is purged, how frequently it changes, and how fast it becomes outdated. |
Data Variety | The types of data required for analysis: either structured, such as RDF files, databases, and Excel tables, or unstructured, such as text, audio files, and video. |
Data Value | The value derived from processing the data that contributes to decision making and problem solving. A large amount of data may be valueless if it is perishable, late, imprecise, or has some other weakness or flaw. |
Data Veracity | The accuracy, precision, and reliability of the data. A data set may have very accurate data with low precision and low reliability, depending on the collection methods and tools. |
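The distinction between accuracy and precision in the Data Veracity row can be made concrete with a small sketch. It assumes one common reading of the terms: accuracy as the closeness of the average measurement to a known reference value, and precision as the repeatability (spread) of the measurements. The function name `describe`, the reference value, and the sample readings are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev


def describe(readings, true_value):
    """Return (bias, spread) for repeated measurements of one known quantity."""
    bias = mean(readings) - true_value  # accuracy: near zero means accurate on average
    spread = stdev(readings)            # precision: small means repeatable
    return bias, spread


TRUE_VALUE = 10.0  # hypothetical reference value

# Hypothetical readings: centered on the true value but widely scattered.
accurate_but_imprecise = [9.0, 11.0, 10.5, 9.5]
# Hypothetical readings: tightly clustered but offset from the true value.
precise_but_inaccurate = [12.0, 12.1, 11.9, 12.0]

bias_a, spread_a = describe(accurate_but_imprecise, TRUE_VALUE)
bias_b, spread_b = describe(precise_but_inaccurate, TRUE_VALUE)

print(f"accurate but imprecise: bias={bias_a:.2f}, spread={spread_a:.2f}")
print(f"precise but inaccurate: bias={bias_b:.2f}, spread={spread_b:.2f}")
```

Under these assumptions, the first data set is accurate on average yet has low precision, which is the situation the table describes.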
Big Data has often been used to refer to a large volume of data of a single type, such as text, numbers, or pixels. More recently, many organizations have been creating blended data through the analysis of data sources of varied types. These data come from instruments, sensors, Internet transactions, email, clickstreams, and social media such as Twitter, YouTube, Reddit, Pinterest, and Tumblr. New data types may be derived through analysis or by joining different types of data.
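As a minimal sketch of deriving a new data type by joining different types of data, the example below blends hypothetical sensor readings with hypothetical Internet transactions. The record types, field names, and the join key `(store_id, hour)` are assumptions made for illustration and do not come from the chapter.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:      # hypothetical instrument or sensor data
    store_id: str
    hour: int
    temperature_c: float


@dataclass
class Transaction:        # hypothetical Internet transaction data
    store_id: str
    hour: int
    amount: float


def blend(readings, transactions):
    """Join the two sources on (store_id, hour) and derive a new record type."""
    temp_by_key = {(r.store_id, r.hour): r.temperature_c for r in readings}

    # Aggregate transaction amounts per (store_id, hour).
    totals = {}
    for t in transactions:
        key = (t.store_id, t.hour)
        totals[key] = totals.get(key, 0.0) + t.amount

    # Keep only keys present in both sources (an inner join).
    blended = []
    for key in sorted(totals):
        if key in temp_by_key:
            blended.append({
                "store_id": key[0],
                "hour": key[1],
                "temperature_c": temp_by_key[key],
                "sales_total": totals[key],
            })
    return blended


readings = [SensorReading("A", 9, 18.5), SensorReading("A", 10, 21.0)]
transactions = [
    Transaction("A", 9, 12.0),
    Transaction("A", 9, 8.0),
    Transaction("B", 9, 5.0),
]
blended = blend(readings, transactions)
print(blended)
```

Each blended record combines a sensor measurement with a sales total, a type of data that neither source contains on its own. An inner join is used here for simplicity; other join strategies would retain unmatched records.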
Key Terms in this Chapter
Data Variety: The number of relations and interdependencies among the data in one data set.
Big Data: The volume of data that is just beyond technology’s capability to store, manage and process efficiently.
Data Volume: The amount of data collected and, perhaps, available for use.
Data Velocity: The rate at which data is accumulated or streams into a collection area.
Data Veracity: The accuracy, precision and reliability of the data.
Data Curation: The extraction, moving, cleaning, and preparing of data for storage and processing.
Data Enrichment: The process of augmenting collected raw data or processed data with existing data or domain knowledge to enhance the analytic process.
Data Value: The value derived from processing the data with different analytics that contributes to problem solving.