Psychology 121, Lecture 9
by Hal S. Kopeikin, Ph.D. © 2000
Interviews & Measuring Intelligence
A definition: An interview is a conversation with a purpose.
General Characteristics of Interviews
Structure
Structured interviews are standardized, much like tests. They have
pre-designated questions, sequence, even scoring. The interviewer is directive,
steering the conversations in predetermined directions. Standardization
facilitates comparison of individuals by providing a common focus and metric.
Research suggests structure can improve reliability and validity.
Unstructured interviews have a free flowing, spontaneous quality.
Typically, the interviewer is relatively nondirective, following the interviewee's
lead. Open-ended questions are the norm and "scoring" is informal. Such
interviews can be idiographic, sensitive to individual uniqueness, and flexible.
Rapport
Interviewees are typically more disclosing, honest, and responsive when
their relationship with the interviewer is warm and comfortable. Interviewers
will therefore make attempts to be overtly pleasant, respectful, interested,
and nonjudgmental. They will facilitate communication with appropriate
responses and nonverbal behavior. The major exception to this is the stress interview,
where examiners intentionally create a challenging or threatening interpersonal
context to assess the interviewee's reactions to such situations.
Interactive
Interviews are interactions between two people, affected by both. Interview
outcomes thus depend on characteristics of both participants and their
interaction. Influence is reciprocal. Interviews are adaptive, so responses
determine the direction of subsequent explorations.
Types of Verbal Responses.
Various classificatory systems have been proposed to distinguish interviewing
techniques. Some common categories include:
-
Questions: Open-ended (respondents have a lot of latitude in answering)
& Closed-Ended (respondents have little latitude in their answers).
-
Paraphrases and Reflections (restatements of content or feelings);
these are more commonly used by experienced interviewers.
-
Confrontations and Interpretations (highlight contradictions
or offer new perspectives); these responses bring something new into the
awareness of the responder. Often these are anxiety-provoking responses.
-
Acknowledgment / transitional phrases (I see, uh huh);
essentially, these are social facilitators, content-free expressions.
-
Evaluations (praise, criticism); these responses cast a value judgment.
These are often considered no-no's because they can skew the information
that is subsequently provided by the respondent.
-
Reassurance, e.g., 'Everything is okay.' These responses are also
considered no-no's because they can skew information. They may also invalidate
the interviewee's feelings and attitudes.
Common Types of Interviews
-
Employment / Selection Interviews tend to be structured, usually oriented
towards uncovering negative information. Predictive validity is often unimpressive.
-
Case History Interviews aim to construct a chronological account of a person's
family, social, educational, vocational, and/or medical development. Can
be more narrowly focused, e.g., on the development of a phobia.
-
Mental Status Exam attempts to summarize current psychiatric functioning:
Orientation (Who are you? Where are you? When is it?), Sensorium (perception),
Affect, Attention, Appearance, Cognition, and Interpersonal domains are
usually addressed.
-
Structured Clinical Interviews use a specific set and sequence of questions,
operationally define terms, and offer scoring procedures. These are interviews
conducted like tests. They are standardized, as tests are.
-
Assessment Interviews are less structured, more conversational approaches
to exploring or examining psychological functioning, interests, symptoms,
defensive and coping styles, beliefs, attitudes, etc. Usually they are
broadly focused.
Interview Reliability & Validity
-
These vary tremendously based on the interviewer's skill, type of interview,
and method of assessment.
-
Structured interviews are rather like tests, with similar strengths and
weaknesses. In fact, with computerized administration, the line between
them is blurring.
-
Unstructured interviews are harder to evaluate. Flexibility means a focus
on different topics from interview to interview, so what would reliability
mean? How about validity? When evaluated by traditional psychometric means,
unstructured interviews fare poorly--is that a problem with the interviews
or the criteria?
-
Interviews and tests offer complementary views and each may compensate
for weaknesses of the other.
-
Critics of tests frequently fail to realize that interviews are usually
more subject to bias and naive errors.
Intelligence Tests
Intelligence is a construct, i.e., a theoretical abstraction. Theories
are not Truth, merely attempts to represent or approximate truths.
Some Definitions of Intelligence
-
Binet: "the tendency to take and maintain a definite direction; the capacity
to make adaptations for the purpose of attaining a desired end; and the
power of autocriticism"
-
Spearman: "adjustment or adaptation of the individual to his total environment...ability
to learn...ability to carry on abstract thinking"
-
Das: "the ability to plan and structure one's behavior with an end in view"
-
Gardner: ability to "resolve genuine problems or difficulties as they are
encountered"
-
Sternberg: "mental activities involved in purposive adaptation to, shaping
of, and selection of real-world environments relevant to one's life"
Models of Intelligence
-
General Intelligence, Spearman's g and S1...Sn
g stands for general mental ability, a kind of general mental energy.
S1...Sn represent particular types or categories of intelligence. The idea
is that people have a basic level of general intelligence and this is expressed
in a number of specific forms.
-
Independent Factors, Thurstone (Gardner), S1...Sn
This model implies a number of relatively independent particular categories
of intelligence. Interestingly, Thurstone concluded that he was wrong and
there was a g, although he never agreed it was a determinant of specific
abilities as Spearman did.
-
Hierarchical Models, Sternberg or modern Binet (Fig. 10-8, p. 263)
-
Information Processing Models, Guilford's cube, Kaufmans' K-ABC (Fig. 12-8,
p. 317). These analyze intelligence in terms of underlying components &
processes.
Early Roots of intelligence testing
Binet's Test
-
Originally designed to objectively select children prone to failure in
school, to provide them with more suitable educational environments
-
Basic Principles
-
Age Differentiation: Older children have greater cognitive abilities,
and can be differentiated from younger children based on intellectual performance.
Children's intellectual development can be expressed in terms of "mental
age."
-
General Mental Ability: assumes global "mental energy" level, Spearman's
g, which determines overall performance on a wide variety of tasks.
-
Individual administration, adaptive testing, verbal emphasis, arduous administration
-
Created 1905, with major revisions 1908, 1916, 1937, 1960, 1986
-
General principles remain, despite many significant alterations.
-
Some key changes include
-
vastly improved norms
-
items grouped by content instead of age scales
-
elaboration of underlying model from General Intelligence to Hierarchical
-
progression from mental age, to IQ ratio, then to deviation IQ
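To make the last of these changes concrete, here is a minimal sketch of the two scoring approaches. It assumes the conventional ratio formula (mental age divided by chronological age, times 100) and treats a deviation IQ as a standard score relative to a same-age norm group. The function names and the norm-group values in the example are hypothetical, chosen only for illustration; the SD of 16 is the Stanford-Binet value given in these notes.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age over chronological age, times 100."""
    return 100.0 * mental_age / chronological_age


def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100.0, iq_sd=16.0):
    """Deviation IQ: distance from the same-age norm group's mean,
    expressed in SD units (z) and rescaled to the IQ metric."""
    z = (raw_score - norm_mean) / norm_sd
    return iq_mean + iq_sd * z


# A child of 8 performing like a typical 10-year-old.
print(ratio_iq(10, 8))            # 125.0

# Hypothetical norm group: mean raw score 50, SD 10; this child scored 60.
print(deviation_iq(60, 50, 10))   # 116.0
```

One reason for the move to deviation IQ: the ratio only behaves sensibly while mental age keeps rising with chronological age, whereas a deviation score always has the same meaning relative to one's age peers.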
Characteristics of the current Stanford-Binet
-
Hierarchical model of intelligence (Fig. 10-8, p. 263)
-
Four content areas with 15 tests (Fig. 10-9, p. 263 and Table 10-3,
p.265)
-
Verbal Reasoning, Abstract/Visual Reasoning, Quantitative Reasoning, Short-Term
Memory
-
Scores: 15 tests have mean=50, SD=8; Four Content Areas & IQ have mean=100,
SD=16
-
Reliability is generally good, especially for IQ and content areas (Table
10-4, p. 267)
-
lower for subtests and for younger kids, especially those with higher IQs
-
Validity is great or lousy, depending on use/population/criteria
-
good correlations with other similar measures
-
very good predictions of gross academic functioning
-
poorer predictions of "life functioning" (wouldn't you expect that?)
-
still related to race/ethnicity/social class (what does this mean? Is it
bias, i.e. a problem with the test; is it an unpalatable truth?)
Wechsler Intelligence Scales
-
WAIS-R designed for adults; WISC-III for children; WPPSI-R for younger
kids (4-6.5 yrs)
-
Point Scales (points for each answer are added. In contrast, Binet's test
continued until 3 out of 4 consecutive items were missed, which established
one's mental level).
-
Verbal & Performance Scales (less exclusive reliance
on verbal processes than Binet)
-
composed of 11 subscales (Table 11-1, p. 277)
-
Introduced scaled scores and deviation IQ (which the other IQ measures
now use too)
-
Subtests might reveal relative strengths, weaknesses. E.g., Wechsler was
interested in aging process and its effects on the mind.
-
Subtest patterns might have diagnostic implications (learning disabilities
are often defined as specific abilities vastly lower than the others for
that person).
-
subtests have scaled scores with mean=10, SD=3
-
Verbal, Performance, and Full-Scale IQ have mean=100, SD=15
-
Reliability & Validity are grossly comparable to Binet
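The different score metrics mentioned in these notes (Stanford-Binet tests with mean=50, SD=8 and IQ with mean=100, SD=16; Wechsler subtests with mean=10, SD=3 and IQ with mean=100, SD=15) are all rescalings of the same underlying z-score idea. The sketch below converts between them. It is only an illustration: the dictionary keys and function names are made up here, and treating equal z-scores on different tests as interchangeable is a simplifying assumption, since the tests have different items and norm groups.

```python
# (mean, SD) for each score metric as listed in these notes.
SCALES = {
    "binet_test": (50, 8),
    "binet_iq": (100, 16),
    "wechsler_subtest": (10, 3),
    "wechsler_iq": (100, 15),
}


def to_z(score, scale):
    """Express a score as SD units above or below the scale's mean."""
    mean, sd = SCALES[scale]
    return (score - mean) / sd


def convert(score, from_scale, to_scale):
    """Re-express a score on another metric at the same z-score."""
    mean, sd = SCALES[to_scale]
    return mean + sd * to_z(score, from_scale)


# Two SDs above the mean is 130 on a Wechsler IQ but 132 on the Binet metric.
print(convert(130, "wechsler_iq", "binet_iq"))        # 132.0

# A Wechsler subtest scaled score of 13 is one SD above the mean.
print(convert(13, "wechsler_subtest", "wechsler_iq"))  # 115.0
```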
Some second thoughts about our text's critique
-
Is the Hierarchical model "outdated" vis-a-vis Gardner, e.g.? We'll see.
General factors models have been promulgated before (e.g., Thurstone).
Are all these kinds of abilities meaningfully deemed intelligence? How
much independence between factors is there?
-
Should it measure below IQ=50? What does IQ mean below a percentile rank
of 1? Should it be measured with an IQ test?
-
Is it wrong to say scores<70 "reflect mental retardation?" Critics are
probably right to question diagnosis based on IQ tests alone (e.g., a deaf
person or non-native English speaker might be evaluated alternatively).
But, in your instructor's experience, those with IQs below 70 do generally
seem retarded.
-
What is the meaning of SES, racial, & cultural differences in performance?
We'll consider this next lecture, and in the next chapter. If the differences
are illusions, the tests are biased; if not, blaming the tests would be
like blaming census statistics for disparities in wealth. Is "selection
bias" a fault of tests or the rest of reality?
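For the questions above about IQ=50, IQ=70, and a percentile rank of 1, it can help to see roughly where those scores fall. The sketch below assumes IQ scores are normally distributed with mean 100 and SD 15 (the Wechsler metric); that is an idealization, and the real distribution may depart from it at the extremes. The function names are hypothetical.

```python
from statistics import NormalDist


def percentile_rank(iq, mean=100.0, sd=15.0):
    """Percent of the (assumed normal) population scoring below this IQ."""
    return 100.0 * NormalDist(mean, sd).cdf(iq)


def iq_at_percentile(pct, mean=100.0, sd=15.0):
    """IQ score that cuts off the given percentile rank."""
    return NormalDist(mean, sd).inv_cdf(pct / 100.0)


print(round(percentile_rank(70), 1))   # about 2.3: two SDs below the mean
print(round(iq_at_percentile(1), 1))   # about 65: percentile rank of 1
print(percentile_rank(50) < 0.1)       # True: well under 1 in 1,000
```

Under this assumption, a score of 50 lies so far out in the tail that very few people in any norm group would score there, which is part of why measurement at that level is questionable.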
Other Measures of Intelligence
Special IQ Measures for Children
McCarthy Scales of Children's Abilities (MSCA)
- Lacks validity data of Binet / WPPSI-R
+ tests seem more diagnostically meaningful (e.g., 4 distinct quantitative
subtests)
+ may be especially valid for Mexican-American children
Kaufman Assessment Battery for Children (K-ABC)
+ neuropsychology emphasis consistent with zeitgeist
+ distinction between Mental Processing and Achievement Scales
+ subtests are intended to be diagnostically relevant
+ Mental Processing less related to SES and Race than standard IQ
- Mental Processing less predictive of school performance ("g" ?)
+/- Less verbal, more visual
Measures for the Handicapped
Columbia Mental Maturity Scale, 3rd ed.
For physically challenged (MS, palsy), speech or hearing impaired, poorly
coordinated, 3-12 yr olds
task: point to 1 of 3 (or 5) items that doesn't belong with the others
yields one score of "general mental ability"
- high floor (82 is lowest one can score, too high for 35% of test-takers)
- very vulnerable to random error
Peabody Picture Vocabulary Test, Revised
thought to measure "receptive vocabulary"
multiple choice, nonverbal measurement of verbal intelligence :-)
Leiter International Performance Scale
general intelligence measure using nonverbal reasoning and memory
can be used with deaf or language impaired
clinically popular, but dated with dubious norms
Learning Disabilities
typically defined by intelligence / achievement discrepancy.
Some common achievement measures:
-
Illinois Test of Psycholinguistic Abilities (ITPA)
-
Based on information processing theories in cognitive psychology.
The idea here is that problems can happen at any stage along the information
processing stages and the test tries to determine just where this process
breaks down.
-
see Table 12-4, p. 326 for description of scales
-
problems: norms, reliability, & validity poorly supported in manual
-
Woodcock-Johnson Psycho-Educational Battery
-
psychometrically sounder than ITPA (norms, reliability, validity data),
otherwise
-
Wide Range Achievement Test, Revised (WRAT-R)
-
measures achievement in reading, spelling, arithmetic
-
content validity poorly demonstrated; norms poor.
Detecting Brain Damage
-
Benton Visual Retention Test
-
Bender Visual Motor Gestalt Test
-
Memory-for-Designs Test
All have good hit rates with moderate to severe impairment, but more
trouble with subtle deficits
Group Intelligence Tests
+ Inexpensive, highly developed tests
- Low scores can be difficult to interpret (motivation, confusion re:
instructions, etc)
Kuhlmann-Anderson Test, 8th ed. (described as not heavily verbal;
still strongly related to Binet. Your instructor finds that hard to imagine.)
Henmon-Nelson Test (quick 30 min., single measure of general intelligence)
Minimizing Cultural Biases
Culture Free Tests--probably impossible, would they be meaningful?
Culture Fair Tests are a more realistic goal
-
Cognitive Abilities Test
Study the book's description of this. It is an excellent example of
trying to eliminate bias.
-
measures Verbal, Quantitative, Nonverbal intelligence
-
much attention to culture/racial fairness
-
first, logically eliminated apparently biased items
-
next, statistically removed items which differentially predicted for white
& minority students
-
no change in size of black/white IQ difference!
-
Raven Progressive Matrices
-
less influenced by culture/SES/language/race than most
-
unlike Kaufman, may still be good measure of "g" (probably as good as standard
IQ tests)
-
about half as much difference between US black and white means as standard
IQ
-
still predictive of socially relevant outcomes, but less than IQ
-
do we want to minimize cultural loadings when they are important in the
"real world"?
Predictors of Academic, Vocational, Military Aptitude
College Board Scholastic Aptitude Test (SAT)
American College Test (ACT)
Graduate Record Examination Aptitude Test (GRE)
Miller Analogies Test (MAT)
Armed Services Vocational Aptitude Battery (ASVAB)
LSAT, MCAT, DATP, etc.
-
SAT is a good example of the genre
-
one of the most carefully developed, best studied tests in history
-
reliability is high, predictive validity is more modest
-
for predicting freshman GPA, r(high school grades) = .57,
R(high school grades, SAT) = .65
-
thus, prediction improves from 32% to 42% of variance in freshman GPA
-
is that good enough? Could intellectual factors do better?
-
How would they be with wider ranges of predictor scores & grades?
-
Is GPA a reasonable criterion measure? What would be a better one?
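The 32% and 42% figures above come from squaring the correlations: the proportion of variance in a criterion accounted for by a predictor (or set of predictors) is the square of the correlation. A minimal sketch of that arithmetic follows, using the .57 and .65 values given in these notes; the function name is just for illustration.

```python
def variance_explained(r):
    """Proportion of criterion variance accounted for: the squared correlation."""
    return r ** 2


r_grades = 0.57       # high school grades alone
r_grades_sat = 0.65   # high school grades plus SAT

before = 100 * variance_explained(r_grades)
after = 100 * variance_explained(r_grades_sat)

print(round(before))           # 32
print(round(after))            # 42
print(round(after - before))   # 10: percentage points gained by adding the SAT
```

Note that the gain of about 10 percentage points is the SAT's incremental contribution over grades, not its validity when used alone.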