Psychology 121, Lecture 2

by Hal S. Kopeikin, Ph.D. © 2000


Overview

Today's lecture introduces some major concepts in measurement. Ideas broached today will be developed, extended, and referred to throughout the quarter. Psychological tests can generally be divided into measures of maximal performance or typical performance. Reliability and validity are central characteristics in judging the usefulness of tests. Norms provide a context essential to the interpretation of test scores. Statistics are mathematical techniques used to describe norms and test results.

Today's Lecture: Basic Concepts in Measurement

Two Types of Tests

  1. Maximal Performance tests attempt to measure what a person can do. Instructions for these tests include "do your best." Although all tests measure current performance, these tests have different temporal foci.
  2. Typical Performance tests examine what a person is like. They usually include instructions to be truthful and objective. Tests of career interests and personality traits are examples of typical performance measures.

Reliability and validity

All good tests are reliable and valid. But some caveats are needed here. First, reliability and validity are a matter of degree: the more reliable and valid, the better the test. Second, tests are only good relative to the purpose of the test, the people being tested, and the circumstances of the test. Thus, the reliability and validity of a test vary with each of these.

Reliability

Some tests have two or more forms, so reliability across forms can be measured. Alternatively, a single form can be divided into two smaller subtests, and we check for consistency between the two half-tests. Note, however, that the subtests are shorter than the test as a whole, which generally reduces their reliability, so we need to make a statistical adjustment.
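The split-half procedure and its adjustment can be sketched in code. The notes do not name the adjustment; this sketch assumes the Spearman-Brown formula, the usual correction for test length, and it splits the items into odd and even halves, which is only one of several possible splits. The function names and the item scores are hypothetical, made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half, factor=2.0):
    """Estimate reliability of a test `factor` times as long as the
    (half-)test whose reliability is r_half.
    Assumption: Spearman-Brown is the adjustment intended in the notes."""
    return factor * r_half / (1 + (factor - 1) * r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per person.
    Returns (correlation between half-tests, adjusted full-test estimate)."""
    # Odd-numbered items form one half-test, even-numbered items the other.
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)


# Hypothetical data: 5 people, 6 items scored 1 (correct) or 0 (incorrect).
item_scores = [
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
r_half, r_full = split_half_reliability(item_scores)
print(f"half-test correlation: {r_half:.2f}")
print(f"adjusted full-test estimate: {r_full:.2f}")
```

For a positive half-test correlation the adjusted estimate is always the larger of the two values, which matches the point above: the shorter half-tests understate the reliability of the full-length test.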

Validity

    1. Content validity: The defined content domain is well sampled.
    2. Criterion-Related validity: The test is a good predictor. E.g., the SAT is supposed to have a moderately high level of criterion-related validity because it is a good predictor of achievement in college.
    3. Construct validity: Theories are only models of the world. Construct validity is relevant to assessing the adequacy of a measure intended to capture a theoretical abstraction.

Norms

Statistics Describing Scores and Populations of Scores

Scales of Measurement and the Meaning of Numbers (cf. Table 2-1, p. 32)

Measurement scales are numbering systems specifying properties which determine what the numbers signify. The properties involve concepts such as magnitude, interval size, and absolute zero. Use and interpretation of the numbers depend on these underlying properties. For example, while it makes sense to describe me as twice as tall as my kid, it probably does not to say I'm twice as worried. There are four types of numbering system, ordered from least to most informative.

Frequency Distributions (cf. Figures 2-2, 2-3, 2-4, pp. 35-37)

Frequency distributions are graphs showing the incidence of scores in a sample or population. Scores (the x axis) are divided into equal segments (the class interval). Histograms are traditionally used for discrete data, whereas frequency polygons summarize continuous data; this distinction is often somewhat arbitrary, but reflects whether the x axis indicates groups or amounts.
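As a rough illustration of tallying scores into equal class intervals, here is a minimal sketch. The function names, the interval width of 10, and the scores are hypothetical and are not taken from the textbook figures. Each interval includes its lower bound and excludes its upper bound, which is one convention among several.

```python
from collections import Counter


def frequency_distribution(scores, width, start):
    """Tally scores into equal-width class intervals [lower, upper).
    Returns a list of (lower, upper, count) tuples, including empty intervals."""
    if any(s < start for s in scores):
        raise ValueError("start must not exceed the lowest score")
    # Index of the class interval each score falls into.
    counts = Counter(int((s - start) // width) for s in scores)
    n_intervals = max(counts) + 1
    return [
        (start + i * width, start + (i + 1) * width, counts.get(i, 0))
        for i in range(n_intervals)
    ]


def text_histogram(distribution):
    """Render each class interval as a row of asterisks, one per score."""
    return "\n".join(
        f"{lower:>4}-{upper:<4} {'*' * count}"
        for lower, upper, count in distribution
    )


# Hypothetical test scores for ten people.
scores = [52, 55, 61, 67, 68, 70, 74, 75, 79, 88]
distribution = frequency_distribution(scores, width=10, start=50)
print(text_histogram(distribution))
```

Connecting the midpoints of the intervals at the heights given by the counts, instead of drawing bars, would give the frequency polygon for the same data.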