Data-Triangulation Through Multiple Methods: The Case for Stealth Assessment

DOI: 10.4018/978-1-6684-2468-1.ch005

Abstract

Rating scales are a traditional method for gathering social, emotional, and behavioral data. However, they are vulnerable to validity problems associated with biased responding. A post-hoc analysis of scale data identified two groups within the sample: an anomalous response group and a non-anomalous response group. Findings indicated that those in the anomalous response group were more likely to rate themselves significantly differently on self-report rating scales measuring socially sensitive constructs. However, the two groups did not differ across the game-based behavioral stealth assessment measures (proactive aggression, reactive aggression, and prosocial skills). Findings support the need for triangulation and understanding of data through multiple methods of assessment. More directed research is needed to purposefully explore the validity of game-based assessments as part of a battery of measures.

Introduction

Since the early 1930s, psychometricians have raised concerns regarding the impact of inaccurate responding on self-report instruments (Bernreuter, 1933; Vernon, 1934). These concerns are grounded in the idea that inaccurate responses produce invalid scores because respondent answers are based on something other than item content (Paulhus, 1991). This may introduce systematic error throughout a questionnaire, resulting in validity concerns associated with data interpretation (van de Mortel, 2008; Tracey, 2016).

Inaccurate responses are often discussed in the literature as suspect responding. Suspect responding is broadly characterized as a variety of inaccurate responses due to factors such as item form (Osborne & Blanchard, 2011), social desirability (Barriga et al., 2001; Merydith et al., 2003), respondent incompetence (Shumway et al., 2004), random response choices (Walls et al., 2017), malingering (Cartwright & Donkin, 2020), and/or cultural differences (Lee et al., 2002). Moreover, suspect responding can occur either as a momentary reaction to a situation (i.e., a response set) or as consistent responses across situations (i.e., a response style). Momentary suspect responding due to a response set can be detected by comparing data across multiple raters and/or multiple settings (Bensch et al., 2019), whereas systematic suspect responding due to a response style is more difficult to detect.

In addition to multi-rater/multi-setting methods, psychometricians have long sought to mitigate both momentary and systematic suspect responding through a variety of statistical techniques and instrument design methods, including rational, factor analytic, covariate, and demand reduction techniques (Bensch et al., 2019). Despite these tools, researchers continue to struggle to mitigate suspect responding using traditional techniques (Bensch et al., 2019; Kemper & Menold, 2014). As a result, some researchers have shifted toward assessments that do not appear as assessments to the respondent; for example, test items may be embedded in a game-based environment. The assumption is that individuals who are unaware that they are being evaluated may be less likely to provide suspect responses. This has led to the innovative use of embedded assessments in game-based environments, known as stealth assessments (McCreery et al., 2019).
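To make the screening idea concrete, the sketch below computes simple per-respondent indices that are commonly used to flag possible suspect responding on Likert-type data (proportion of extreme responses, proportion of midpoint responses, longest run of identical answers, and response variability). This is an illustrative example only, not the authors' procedure or any specific published index; the function name, thresholds, and the 5-point scale are assumptions.

```python
# Illustrative sketch (not the chapter's procedure): per-respondent screening
# indices for possible suspect responding on a 5-point Likert-type scale.
from statistics import pstdev

def screening_indices(responses, scale_min=1, scale_max=5):
    """Return simple descriptive flags for one respondent's answers."""
    n = len(responses)
    midpoint = (scale_min + scale_max) / 2
    # Share of answers at the scale endpoints (possible extreme responding).
    prop_extreme = sum(r in (scale_min, scale_max) for r in responses) / n
    # Share of answers at the midpoint (possible neutral bias).
    prop_neutral = sum(r == midpoint for r in responses) / n
    # Longest run of identical answers; long runs may suggest disengaged
    # "straight-lining" rather than item-driven responding.
    longest = run = 1
    for prev, cur in zip(responses, responses[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return {
        "prop_extreme": prop_extreme,
        "prop_neutral": prop_neutral,
        "longest_run": longest,
        "sd": pstdev(responses),  # near zero -> nearly invariant responding
    }

flags = screening_indices([3, 3, 3, 3, 3, 4, 3, 3])
```

Indices like these only flag patterns for follow-up; as the chapter argues, distinguishing a genuine response style from valid answers ultimately requires triangulation across raters, settings, or methods.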

Key Terms in this Chapter

Performance-Based Assessment: The direct measurement of constructs using artifacts and activities that resemble the constructs being assessed.

Neutral Bias: Respondent tendency to choose the middle, or neutral, response across a set of items.

Validity: Collected evidence that supports the interpretation that scores of an assessment measure what they are intended to measure.

Extreme Responding: Respondent tendency to choose the endpoints of a Likert-type scale.

Random Responding: An unsystematic approach to responding where the respondent does not engage with the assessment.

Game-Based Assessment: Use of [video] games for the measurement of psychological constructs.

Evidence-Centered Design: An established framework used to align research-based constructs to game indicators via competency, evidence, and task models.

Stealth Assessment: Assessment embedded directly and invisibly into the learning or gaming environment.

Gameplay Codebook: Analytical tool specifically designed to code behavioral choices in game-based environments.
