RISK ASSESSMENT IN PROBATION CLASSIFICATION
CURRENT STATE OF THE ART
AGENDA FOR THE FUTURE
BY
FRANCIS M. TIMKO
B.A., ADAMS STATE COLLEGE, 1967
M.S., THE CITY COLLEGE, 1972
DISSERTATION
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY IN THE
DEPARTMENT OF SOCIOLOGY AT FORDHAM UNIVERSITY
NEW YORK
May, 1991
TABLE OF CONTENTS
Personal Behavior Risk Analysis
Criminal Justice And Risk Analysis
Differences From Institutional
How Probation and Parole Differ
Historical Development of Parole
The Availability Of The Variables
As Related To The Specific Orientation
METHODS OF DEVISING RISK INSTRUMENTS
Overview Of The Methods Most Used
A BRIEF HISTORY OF THE MAJOR RISK INSTRUMENTS
Statistical Form And Variance Explained
HOW THE MAJOR INSTRUMENTS COMPARE
A Visual Comparison Using Charts
The Policy Problems That Developed
Increased Or Reduced Punishment?
Peace Officer vs Social Worker
Policy decisions Are Made By Others
The Major Policy Decision Of Today
Immediate Future Policy Questions
THE CURRENT STATUS OF RISK ASSESSMENT
Criminal Record Modeling Problems
Weaknesses Of The Current Instruments
UNRESOLVED ISSUES AND PROBLEMS
Will The Stakes Of Failure Be Added?
Will Theoretical Instruments Be Developed?
SUGGESTIONS FOR FUTURE STUDIES AND POLICIES
GRAPHICS FIGURE BOXES
Figure 1 Age vs Percent Non Violators
Source: The Workings of the Indeterminate Sentencing Law and Parole System
in Illinois, 1928 Table II pg. 265.
Figure 2 NYS Probation Caseload Growth vs state prisoners since 1925
Source: The probation data was obtained from NYS DOPCA. The corrections
data was obtained from Historical Statistics of prisoners U.S Department of
Justice 1988.
Figure 3 A theoretical 50% failure rate.
Figure 4 A theoretical 33% failure rate
Figure 5 A theoretical 25% failure rate
Figure 6 A theoretical 9% failure rate
Figure 15 The Salient Factor Score 1970 Construction Sample
Source: Hoffman and Beck 1976:71.
Figure 16 The Salient Factor Score 1970 Validation Sample
Source: Hoffman and Beck 1976:71
Figure 17 The Salient Factor Score 1972 Validation.
Source: Hoffman and Beck 1976:71.
Figure 19 The Salient Factor Score Sample I of the 1981 Revision
Source: Hoffman and Beck 1983: 541.
Figure 20 The Salient Factor Score Five Year Follow-Up Study
Source: Hoffman and Beck 1985: 505.
Figure 21 National State Prisoner Population Growth from 1925 to 1986
Source: Historical Statistics on Prisoners: 5-13.
Figure 22 The Three Prongs of the Wisconsin Management Information System
TABLE BOXES
TEXT BOXES
1 Predictions and conditions which are true and false
2 Conditions and Predictions as related to True and False Positives and Negatives
3 Criterion of Failure and Weights Added.
Source: Project Report 2, pg. 2
5 Criterion by Frequency of Occurrence
Source: Clear 1988: 18.
8 The Types of Crimes - As grouped in regard to their relationship to the victim
9 The organization of the criminal history variable by the depth of analysis
10 Hypothetical history of John A and John B
On a personal level, I wish to thank my wife, Christine, and my five children, for their tolerance in this long endeavor. Special thanks are extended to my friend John Lucashuk for his constant technical help in creating a very effective word processing engine for this work.
On a scholarly level, a special thank you is extended to Professor Gerald Shattuck for mentoring this work and to Professors Doyle McCarthy and Michael Cuneo for serving on this committee. Their assistance, encouragement and direction were invaluable. I also wish to thank Professor Peter Sissons. He originally helped guide me through the early stages but could no longer serve on the committee because he left this country.
On a professional level, I wish to thank all those in the community corrections field who supplied documents and technical assistance for this work. I also wish to thank the originators of the major instruments. Christopher S. Baird was very helpful in our long interview. Professor Gottfredson was encouraging and helpful. Brian Bemus was always open and encouraging during our many conversations and meetings. The massive amounts of documents he supplied were essential to this work.
This is a study of the origins, development and status of Risk Assessment in Probation, which is used to determine the threat level of an offender. It is a branch of risk analysis that is not firmly connected to the larger general body. It is also not the same as the assessment of risk in parole, even though both belong to community corrections. The larger general field of risk assessment is involved in areas such as battle plans, complex power plants, drug approval and large technological systems. The general field began in the 1930's with the belief that it was scholarly, pure and free of bias. As the general field matured, it learned that risk assessment was free of neither bias nor the political dimension. These shortcomings have not been realized in the probation component.
In the early 1920's, parole instruments pointed the way to classification by instruments in community corrections. Parole instruments then began to gain acceptance in the 1960's. Probation instruments, once they fulfilled managerial purposes, proliferated in great numbers in the 1970's. It was then believed that probation instruments were transportable to different populations, were accurate, and need not be theoretically based. The time has come to examine the claims that the probation instruments are impartial and accurate classification instruments. It is time to judge their current performance and shortcomings. It is time to visualize the form, content and structure of the next generation of instruments.
The general field of risk is useful. Government agencies constantly make decisions that affect citizen health, safety, quality of life, and the very continuance of life. Policy decisions or budget constraints usually force these issues. The consequences of such decisions are either very direct or greatly removed from the topic.
Good decisions are made wisely. They remain impartial to special interests or acknowledge such a relationship. They must be free of personal values but represent the values of the body politic. The costs of the decision must be understood. The decision process itself must be understandable to the larger group of individuals beyond the experts that framed it. Snap executive decisions will not suffice when the problem becomes complex. Large technological systems or a complex battle plan need risk assessment.
Short remarks that the "relationship of scientific institutions to
the federal government has been crucial" to the development of the
general field of risk analysis, and that the field, like so many
similar arrangements, began in the depression days of the 1930's.
The beginnings were the Science Advisory Board and later the Office
of Scientific Research and Development. Statistical techniques were
then applied in World War II, and this led, according to Short, to
terms such as "operations research" and later "systems analysis."
Finally a massive decision industry was created with the federal
government as the major customer, and Systems Analysis developed
from that point.
Later, technological risk entered the picture as a result of nuclear power plants, the concern to protect workers, consumers and the environment. This led to the creation of four new national
If we look at Starr, Rudman and Whipple's article in the
Annual Review of Energy almost a decade ago, we see risk being
defined as "the probability per unit time of a cost burden (injury)
occurring ... and the magnitude of damage." They also discuss
differences between voluntary and involuntary risks and concepts
such as real risk, statistical risk and perceived risk. They also state
that injury can occur to the structure and culture of a society as well
as individuals. The tone of the article, however, is very scientific and
"value free." Specifically in this regard, they note that "The use of a
societal value system for risk acceptance must rank as the major
unresolved issue in this part of the decision process."
If we look toward the early part of the 1980's and examine the
book Acceptable Risk, we see an expansion of the decision
problem. The authors comment that acceptable risk is different from
other decision problems. "At least one alternative option includes a
threat to life or health among its consequences."
The authors also note, in contrast to the pure science approach,
that "choosing an approach is a political act that carries a distinct
message about who should rule and what should matter."
Later with Rescher we see a further modification. He defines
risk in the broadest terms of "the chancing of negativity - of some
loss or harm." He also expands risk to include loss of privilege,
harm to valued persons or causes, and loss of potential benefit, etc.
The point is also made that we "only have perceived risks of real
risks that we are aware of." Rescher also counters the earlier
assumptions that various negatives can be equated on some
common scale. In this regard he notes that there is really no
common scale in man days lost to equate a large number of people
being injured to a few being killed. Rescher also makes the
strongest statements regarding the subjective nature of judging
risks. He notes that the "size or magnitudes" of risks do not
exist beforehand but come about from a judgmental decision.
Specifically he notes that the "size or magnitude is not something a
negativity has, it's something it gets." For individuals, he notes, it is
largely a matter of personality type; for a group choice, however, it
assumes a political dimension. Specifically it "is a matter of the
value - system of the group as made manifest through its collective
machinery of decision - its political machinery."
In 1984 the American Sociological Association's Presidential
Address was devoted to Risk Analysis. In this text James Short
clearly states that sociologists can make contributions in the area of
Risk Analysis. Previously noted was his statement that, "Risk
Analysis recognizes judgments regarding safety ... as normative and
therefore political."
He also goes on to say that it is rarely
recognized that even the consideration of the problematic is
normative. Concerning the specific areas of contribution by
sociologists, he notes that social and cultural rationality guides most
individual decision making and that Social Rationality is "embedded
in social and cultural values."
He also notes that Douglas's call for
cultural analysis of the perception of risk
is also on the research
agenda and that, "such a theoretical shift would modify the
entrenched ideas that facts can be separated from values, and
nature from culture."
Clark also notes that Risk Assessment in the societal sense is
not that tidy.
Specifically, "Societal relevant risk is not uncertainty of
outcome, or violence of event, or toxicity of substance, or anything of
the sort. Rather, it is a perceived inability to cope satisfactorily with
the world around us."
He further states that risk management is not
specifically a science at all.
Risk management lies in the realm of trans-science, of ill-structured problems, of messes. In analyzing risk messes, the central need is to evaluate, order, and structure the incomplete and conflicting knowledge so that the management acts can be chosen with the best possible understanding of current knowledge, its limitations, and its implications. This requires an undertaking in policy analysis, rather than science.
Clark comments that witch hunting could be considered an
early form of risk assessment. Long ago the problem was that
some wheat was dying in the fields, some sheep were dying of
unknown causes, some crops were struck with unseasonable frost
and human diseases were on the rise. Witchcraft was blamed. At
first the church ignored it, but with the publication of Sprenger's
Malleus Maleficarum social policy was crystallized and the social
structure reacted. Clark notes that similar watersheds have
occurred in modern times with the publication of Kefauver's hearings,
which affected drug regulation, and Carson's Silent Spring, which
affected environmental policy.
Clark states that Malleus Maleficarum proved, for the people of
that day, that witches did exist and that the society was in peril. Any
action in the name of the common good was thus justified. Pope
Innocent VIII accepted this argument and officially authorized
even torture, so that the witches could be eliminated. Eventually five
hundred thousand individuals were incinerated in the witch hunt, and
it continued for over two hundred years!
Clark makes the point that the witch hunts were not too dissimilar from the risk hunts of today. The more the inquisitors looked, the more witches they found. If someone denied being a witch, then either torture forced the truth or they died in the process. Today, if the experimental rats can function with the most massive doses of some chemical that can be put in their food, then force feed them or use mice. Clark notes that in both cases a lack of limits has the same effect. The hunter will prove his point, by whatever means. Just as those who denied that witches were to blame were accused of being in league with the devil, so some individuals are falsely accused when they speak against the rigor of the methods of today. The witch hunts slowed dramatically when torture was finally banned, when property could no longer be taken by the church and when the charges had to be corroborated.
Witch hunting in that day was a way to make a name for one's self and move ahead, just as some risk assessment is today. Clark further notes that none of the clergy or aristocracy was accused. The hunt moved the focus of the common man's problems from being victimized by the social structures of the day to being saved by the social structures of the day. Clark also notes a parallel of today. Science has been under attack in modern society but now the scientific professionals of risk assessment will rescue us from our newfound perils.
Many writers comment that the nature of risk assessment is
that not all the variables and facts are known. Not knowing all the
factors makes it very difficult to predict the future effects of the risk
reduction procedure. Flood control is a case in point. Clark
specifically notes that, in the middle of this century,
much human effort was devoted to flood control. Finally a debate
raged among the involved parties, and Clark notes that at first the
facts were "explained away" or "denied," but finally the evidence
mounted. The risk reduction method of eliminating the threat to
human life and property by building dams had ignored the reaction
of the people. At first, when a river flooded, the loss was small
because few individuals inhabited the flood plain. When the risk of
periodic flooding was eliminated by the dam, the people moved to
the now safe flood plain. Now, however, the risk became the failure
of the saving methodology, for if the dam should fail the loss of life
and property would be catastrophic. Thus the method used to
protect humans from frequent random events (which resulted in a
small loss of life and property) would, if it failed, result in huge losses
of life and property. In essence the risk reduction method had
become an even bigger threat than the original risk.
Personal Behavior Risk Analysis
If we move from the area of Risk Analysis in general to its
impact within the criminal justice system, we also see questions
being raised. Shah's article in 1978 asked questions such as why
society treats the mentally ill as the most dangerous group when
chronic repeat offenders and drunken drivers are demonstrably
more dangerous. He also notes that even though only a trait of an
individual is designated as dangerous, the whole person becomes
viewed and labeled as being dangerous. Shah also raises the "false
positive" issue, in which many mental health professionals would
rather be safe than sorry in designating someone as dangerous. He
views the primary reason for this as the community's reaction to the
error of releasing a dangerous person, while it shows little concern
about errors leading to unnecessary confinement.
Henry Steadman did a series of articles in the early 1980's
relating the difficulties in trying to predict assaultive behavior in the
mental health system. In the first article he makes the point that
neither psychiatrists nor other clinicians can predict who will be
dangerous at any higher rate than by chance probabilities. He also
noted in his review of the previous research that the two best
variables for predicting future assaults were the individual's age and
a numerical quotient derived from an analysis of the individual's prior
legal history (LDS Scale). Steadman further cited one researcher (Koppin) as employing a
wide range of social history, psychiatric measures, and the LDS
scale to obtain statistically significant results. He further noted that
the accuracy rates were very close to the 30% base rate of the
sample.
Shah, who was mentioned previously, also raised the base
rate problem. His contention was that if 10% of a group were
expected to demonstrate certain behavior on the basis of prior
probabilities, which is the base rate, and the evidence of future
probabilities on the individual level is of poor reliability then the
predictions should remain close to the base rate of 10%. The more
they move away from the base rate, the greater the degree
of error.
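Shah's base rate argument can be illustrated with a small worked calculation. The following is only a sketch under stated assumptions, not Shah's own method: the population of 1,000, the 20% hit rate among those flagged (a stand-in for evidence "of poor reliability"), and the function name are all hypothetical. Only the 10% base rate comes from the text.

```python
def misclassified(population, base_rate_pct, flagged_pct, precision_pct):
    """Return (false_positives, false_negatives) for a flagging policy.

    All *_pct arguments are whole-number percentages, so the arithmetic
    stays in integers. precision_pct is the assumed share of flagged
    individuals who really go on to show the behavior.
    """
    actual_positives = population * base_rate_pct // 100
    flagged = population * flagged_pct // 100
    true_positives = flagged * precision_pct // 100
    if true_positives > actual_positives:
        raise ValueError("policy finds more true positives than exist")
    false_positives = flagged - true_positives
    false_negatives = actual_positives - true_positives
    return false_positives, false_negatives


if __name__ == "__main__":
    # Hypothetical group of 1,000 with Shah's 10% base rate and an
    # assumed 20% hit rate among those flagged.
    for flagged_pct in (10, 20, 30, 40):
        fp, fn = misclassified(1000, 10, flagged_pct, 20)
        print(f"flag {flagged_pct}%: {fp} false positives, "
              f"{fn} false negatives, {fp + fn} total errors")
```

Under these assumed figures, flagging 10% of the group (the base rate) produces 160 errors, while flagging 40% produces 340. Each additional person flagged adds more false positives than it removes false negatives, which is the pattern Shah describes.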
In the second article, Steadman and Morrissey again use the
data base concerning indicted and unindicted individuals. However,
they also point out and compare two divergent views in prediction.
In the public protectionist view, the base rate is assumed to be much
higher than the actual rate and there is an attempt to reduce the
false negative rate at the expense of the false positive rate. The
assumption, in this strategy, is that it is better to label people as
dangerous than to run the risk of not labeling them dangerous. In
the Civil Libertarian perspective, predictions are more in line with the
base rate and there is no attempt to reduce the false negatives at
the cost of the false positives.
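The trade-off between the two views can be sketched as a choice of cutoff on a risk score. This is an illustration only: the ten cases, their scores and outcomes, and the two cutoffs are hypothetical and are not drawn from Steadman and Morrissey's data. A 30% base rate is assumed.

```python
# Hypothetical cases: (risk score from 1 to 10, later proved violent?).
# Three of the ten are violent, so the assumed base rate is 30%.
CASES = [
    (1, False), (2, False), (3, False), (4, True), (5, False),
    (6, False), (7, False), (8, True), (9, True), (10, False),
]


def error_counts(cases, cutoff):
    """Return (false_positives, false_negatives) when every case scoring
    at or above the cutoff is labeled dangerous."""
    false_pos = sum(1 for score, violent in cases
                    if score >= cutoff and not violent)
    false_neg = sum(1 for score, violent in cases
                    if score < cutoff and violent)
    return false_pos, false_neg


if __name__ == "__main__":
    # Civil libertarian stance: label about as many as the base rate (3 of 10).
    print("cutoff 8:", error_counts(CASES, 8))
    # Public protectionist stance: label far more than the base rate (7 of 10).
    print("cutoff 4:", error_counts(CASES, 4))
```

With these made-up cases, the higher cutoff yields one false positive and one false negative. The lower cutoff removes the false negative but produces four false positives, which is the exchange of one error type for the other that the two perspectives disagree about.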
In this study, essentially two variables in the indicted sample
were shown to explain 13.5% of the variance: a first psychiatric
hospitalization at a younger age, and two prior arrests for violent
crimes. They also noted that there must be a policy issue of
balancing one type of consistent error (false positive or negative)
against the other. To raise accuracy to 70-80% demands that a large
number of false negatives be accepted as a matter of course.
In Steadman's next article he suggested that assaultive
behavior is setting specific. He also suggests that no single
equation can adequately predict both hospital and community
assaultiveness together. While the derived equation correctly
classified the indicted felony group 63% of the time in the hospital
setting and 82% in the community, problems occurred in the total
application. The prediction for indicted felons represented a 32%
improvement over chance regarding community assaultiveness; however,
when applied to the unindicted group, the in-community prediction
was actually 21% worse than chance. When applied to
the involuntary civil patients, the equation correctly identified 72% for
the hospital setting and 80% for the community setting but the error
rates were very large. For those in the hospital setting, for example,
the false positive rate was 90%.
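It may seem contradictory that an equation can be correct 72% of the time while nine out of ten of its positive predictions are wrong. A short sketch shows how both can hold when the predicted behavior is rare. The cell counts below are hypothetical, chosen only to reproduce the two reported percentages; Steadman's actual counts are not given here. The sketch also assumes that "false positive rate" means the share of positive predictions that proved wrong, which differs from the conventional statistical definition (false positives divided by all actual negatives).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    true_pos: int
    false_pos: int
    false_neg: int
    true_neg: int

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def accuracy(self) -> float:
        """Share of all cases classified correctly."""
        return (self.true_pos + self.true_neg) / self.total

    def wrong_share_of_positive_predictions(self) -> float:
        """Share of 'will be assaultive' predictions that proved false."""
        return self.false_pos / (self.true_pos + self.false_pos)

    def conventional_false_positive_rate(self) -> float:
        """False positives as a share of all truly non-assaultive cases."""
        return self.false_pos / (self.false_pos + self.true_neg)


# Hypothetical counts for 1,000 patients, of whom only 40 are assaultive.
HOSPITAL = ConfusionMatrix(true_pos=30, false_pos=270, false_neg=10, true_neg=690)

if __name__ == "__main__":
    print(f"accuracy: {HOSPITAL.accuracy():.0%}")
    print(f"wrong positive predictions: "
          f"{HOSPITAL.wrong_share_of_positive_predictions():.0%}")
    print(f"conventional false positive rate: "
          f"{HOSPITAL.conventional_false_positive_rate():.1%}")
```

In this illustration most of the accuracy comes from the large number of correctly identified non-assaultive patients, so overall accuracy says little about how trustworthy a prediction of assaultiveness is.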
These results led Steadman to state that,
although statistical prediction was superior to clinical prediction,
variables such as social demographics, criminal history and
mental hospitalization are "of little practical value in accurately
assessing future violent behavior."
In Steadman's 1982 article he notes that we have "inabilities to
make accurate predictions of future violent behavior" and that
initiatives should be centered in the area of "new conceptualization
of relevant variables." He then calls for the realization that behavior
is the result of both personality characteristics and characteristics of
the environment. Steadman thus calls for a focus on the situational
context of the behavior, for such information is currently lacking in
the conventional data base of the person.
It is worth noting what some psychiatrists were doing after
Steadman's articles were published. In an article in
Hospital and Community Psychiatry in 1983, the authors try to
address when psychiatrists are liable for their decisions and some
important factors to be considered. Most relevant to this review is
the statement, by the psychiatrists, that "Risk evaluation is a
sociopolitical process that involves an individual's ethical and social
value judgments." It is also important to note that their decision
table includes environmental factors among its high risk factors. The
table in their article serves more as a set of red flag concepts, for
possible release, than as a risk assessment instrument. Specifically noted
was that mathematical probabilities could not be assigned to the items
or groups of items because they were not known.
Criminal Justice And Risk Analysis
Following the preceding review of selected Risk Assessment literature, it may be helpful to now look at the research in the criminal justice area. This work has focused on the identification of repeat offenders. Previous studies have shown that juvenile court involvement is indicative of future criminal court involvement and that within the pool of all offenders, a small number commits a large number of crimes. The difficulty is to acceptably identify those individuals.
McKay notes in a study of juvenile court records that the
number of Juvenile Court involvements is indicative of future criminal
court involvements. In this study, a one third sample of boys with at
least one juvenile court record in 1920 Chicago was used. Of those
processed in the juvenile court with one petition, 52% were later
arrested as adults. Of those processed with two petitions, 67.4%
were later arrested as adults. Of those processed with three
petitions, 73.9% were later arrested as adults.
The work by Wolfgang, Figlio and Sellin has been widely
quoted over the years. In that study a cohort of 9,945 boys who
were born in Philadelphia in 1945 and who lived there between the
ages of 10 and 18 were tracked for offenses. Empey
notes that 35 percent had at least one police contact but 65% had
none. The boys that had committed only one offense were
responsible for 16% of the total offenses. The boys that had
five or more police contacts (N=627), however, were responsible for
over half of all the police contacts, even though they comprised only
six percent of the group.
Silberman
notes the same numbers but also adds that, "This
dichotomy between a relatively small number of chronic criminals
and a far larger number of occasional offenders appears to be
characteristic of adults as well as juveniles."
He notes that the
FBI's computer file for the years 1970-74 contained information on
208,000 offenders arrested during that period. These offenders
were responsible for 830,992 arrests during their careers. Of this
group 35% were only arrested once and they accounted for less
than 9% of the total arrests. "The same proportion had been
arrested four times or more; averaging 8.2 arrests each, these hard-core criminals accounted for nearly 75 percent of all the arrests."
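The figures Silberman quotes can be checked with simple arithmetic. The sketch below uses only the numbers given above; it assumes that "the same proportion" means 35% of the 208,000 offenders, and the constant and function names are illustrative.

```python
OFFENDERS = 208_000           # offenders in the FBI file, 1970-74
TOTAL_ARRESTS = 830_992       # career arrests for those offenders
GROUP_SHARE_PCT = 35          # share arrested once; assumed also for 4+ arrests
CHRONIC_MEAN_ARRESTS = 8.2    # average arrests in the four-or-more group


def arrest_shares():
    """Return (one_time_share, chronic_share) of all recorded arrests."""
    group_size = OFFENDERS * GROUP_SHARE_PCT // 100   # 72,800 offenders
    one_time_arrests = group_size                     # one arrest each
    chronic_arrests = round(group_size * CHRONIC_MEAN_ARRESTS)
    return one_time_arrests / TOTAL_ARRESTS, chronic_arrests / TOTAL_ARRESTS


if __name__ == "__main__":
    one_time, chronic = arrest_shares()
    print(f"arrested once: {one_time:.1%} of all arrests")
    print(f"arrested four or more times: {chronic:.1%} of all arrests")
```

On these inputs the one-time group accounts for about 8.8% of arrests, consistent with "less than 9%," and the four-or-more group for about 72%, in the neighborhood of the quoted "nearly 75 percent." The gap plausibly reflects rounding in the published 35% and 8.2 figures.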
Petersilia and others conducted a study
of 49 individuals who
were serving time for armed robbery in a medium security prison in
California. The study method involved the use of not only official
arrest records but also information on the crimes committed from a
self reported questionnaire. This method was used because, "fewer
than 20 percent of all major crimes result in arrest."
While the
authors noted that it was a small sample and totally confined to
armed robbers, the sample was responsible for a rather large
number of crimes over their criminal careers. Considering only 9
major types of crimes
the individuals were responsible for 10,505
offenses.
They also noted that the number of offenses committed
by these individuals declined with age.
During the juvenile period
they averaged 3.2 serious crimes per month but only 1.5 per month
in the young adult period. In the adult period this further declined to
0.6 crimes per month.
In 1984 Chaiken and Chaiken described a technique in which
offenders were classified according to the types of crimes they
committed. Their focus was the accurate identification of the
"omnifelons," the violent predators in our society. In this
article the problem of false positives arose again, and the authors
noted that "there is no simple straightforward way to identify
robber-assaulter-dealers from the data in their official records - as
those data are currently collected."
If we move on to the Rand study of 1985, concerning felon
probationers, we again see the promising technique of grouping
offenders by offense type. In this attempt at prediction, by grouping
offenders (drug, property, personal, all others), a tremendous wealth
of individual information was used in conjunction with the best
statistical models. This technique yielded a prediction rate of 71%
accuracy and it was also noted that the false positive rate was,
"substantially lower than those reported in most previous
research..."
The authors also noted, however, that "although the
analyses revealed strong associations between those factors and
recidivism, our predictions using them were not very accurate." The
listed by-chance rate for reconvictions for a violent crime was 68%.
The false positive and false negative rates were again
characteristically high. The false positive rate for
reconvictions of a violent crime was 68%, with a false negative rate
of 16%.
Operational and policy decisions are also apparent in our community supervision systems today. There are many probationers and parolees convicted of many different crimes and only finite resources to monitor them. Choices regarding the amount of time or emphasis to be allotted to each offender must be made in some rational observable manner. The broad title for this process within community supervision has come to be known as classification.
Classification results from a need for order and can result from administrative needs or criticisms from outside the system. The administrative needs obviously reflect a desire to standardize the completion of work performed so output can be measured and quality control checks can be implemented. Criticisms from outside the unit of government include the call for quality control, measurement of work output, and especially the remedy of inconsistent decisions on the part of a given unit of government. This last complaint, of inconsistent decisions, usually revolves around consistency with race as a central factor. Formal classification systems usually remove discretion that can be racially biased. Classification is thus thought to be the result of administrative needs and minority concerns that have arisen over time. Classification procedures can also be adopted, however, in a "trendy" manner. As such they are not adopted to correct a problem but merely to create the impression that the system is keeping pace with or conforming to the rest of the world.
Within the criminal justice system, classification occurs at all levels. It occurs when the pre-trial service determines a good ROR (release on recognizance) prospect. It occurs when correction determines suitable lodgings for a pre-trial detainee and when probation makes a pre-sentence recommendation of no penalties imposed, probation or incarceration. It also occurs when the judge sentences the offenders to various sentences. If incarceration occurs, correction classifies the offender regarding suitable lodgings and programs. Eventually, if a long period of incarceration occurs, the offender must be classified regarding the suitability of parole. If probation is the sentence, then the probation system classifies the offender regarding various programs, contact levels and expectations of probation failure or success. Classification thus is a prime component within sentencing and corrections, presumably to bring about more consistent, fairer justice.
Classification helps the system accomplish its mission and systems usually do not implement it unless that is in jeopardy. Classification can thus be viewed primarily as a survival mechanism. Within community corrections, departments frequently adopt it when the numbers of cases per officer become unbearably high and the probation officer and probationer contact levels become very low. At that point, it could be questioned whether the system is accomplishing anything. It can also be viewed as a trend. Systems have adopted it because it is a sign of a large progressive department. Classification is also a way to formalize the operational policy of a department into a set of procedures. It thus is a representation of policy and mission in guideline form. Classification's main benefit comes within the area of management. It allows the formation of uniform workloads; uniform procedures; and it eases case assignment to name but a few. The most shining attribute however is as a weapon to defend the department's budget by justifying the department's existence or expansion. It does this by demonstrating good management and by forecasting increased community dysfunction if the current level of funding is not maintained. It is thus also a massive proactive budget tool. Prior to the formalized classification systems, classification occurred at the individual officer's discretion or was based upon the offender's time in the system. Classification could thus occur without the need for the formalized systems outlined above but it would not meet the management and department survival needs outlined above.
Classification has been used, for a long period of time, to place inmates within ranges of proper institutional confinement. A range of behavioral control was specified with the classification category. Indeed it could also be said that by classifying an offender to be removed from the community a reduction in future community crimes was achieved. They were not in the community to commit them. Institutions have become overcrowded however and institutional space is at a premium. A new development has been to use the institutional classification process in a selective manner, to achieve the result of reducing crime in the community.
Greenwood has been at the center of this attempt to use
incapacitation, which results from institutional confinement,
selectively to reduce the crime rate.
He notes that three basic
methods have been used to sentence individuals to incarceration. In
deterrence theory, a reduction in the crime rate was assumed to
follow from increasing the probability of apprehension or
the severity of the sanction. In the "just deserts" model, an offender
is sentenced in a manner proportionate to the severity of the offense;
only the offense matters and not the individual characteristics of
the offender. In the incapacitation model, offenders were sentenced
with the belief that by removing them from the community, a reduction
in future crimes committed by them would be achieved.
Greenwood notes that the attempts to identify offenders' needs
and to structure a rehabilitation program to give them the skills
needed or the psychic disposition needed to adequately function in
society, have not met with success. Such attempts have been,
"consistently discredited by critical evaluations that have found
rehabilitation to be an elusive goal."
He also notes that traditionally
two basic methods have been used to determine high risk offenders.
A subjective approach has been used to determine psychic
characteristics; past behavior; and life skill needs; or an actuarial
approach has been used. Greenwood notes that the actuarial
method, based on statistical methods, has consistently been judged
superior in prediction.
Selective incapacitation, Greenwood notes, then could be
viewed as a specific method within the incapacitation model. It
"attempts to use objective actuarial evidence to improve the ability of
the current system to identify and confine offenders who represent
the most serious risk to the community."
This method however can
only be effective if a small number of offenders within the larger pool
of offenders commit a disproportionate share of crimes. If the
average offender's crime rate were consistent throughout the
spectrum, then selective incapacitation would have no greater
impact than the previous model of plain incapacitation. While
Greenwood does not go into the specifics of previous research, he does
mention that there "are even fewer who continue to commit crimes
over an extended period of time (Wolfgang, Figlio, and Sellin,
1972)."
He notes that it is this group which is the focus of criminal
career research. Greenwood's study also showed that only a small
portion of the offenders were high rate offenders. Specifically he
notes:
The distribution of individual offense rates was heavily skewed toward the high end...For instance, among all offenders reporting the commission of robberies, 50 percent committed fewer than 5 per year. But 10 percent committed more than 87 per year. Among active burglars, 50 percent committed fewer than 6 per year, while 10 percent committed more than 230 per year.
The attempt then was to identify the high rate offenders by the
actuarial method. No specific theoretical basis, however, was
specified for the selection of the appropriate variables. He merely
mentions that previous studies have identified items that correlate
well and that traditionally these items have been used for sentencing.
Items were then selected for their predictive ability, and some
items were then removed because of their controversial nature.
The resulting seven-item scale, with equal weighting, was:
1. Incarcerated more than half of the two-year period
preceding the most recent arrest.
2. A prior conviction for the crime type that is being
predicted.
3. Juvenile conviction prior to age 16.
4. Commitment to a state or federal juvenile facility.
5. Heroin or barbiturate use in the two-year period
preceding the current arrest.
6. Heroin or barbiturate use as a juvenile.
7. Employed less than half of the two-year period
preceding the current arrest.
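Because the scale is additive with equal weighting, scoring an offender amounts to counting the items that apply. The following is a minimal sketch of that arithmetic only. The class name, the field names and the example record are hypothetical labels introduced here for illustration; they do not come from Greenwood's work, and no cut-off scores are assumed.

```python
from dataclasses import dataclass, fields


@dataclass
class ScaleItems:
    """The seven items listed above, each recorded as present (True) or absent (False).

    Field names are illustrative labels, not Greenwood's own terminology.
    """
    incarcerated_over_half_of_prior_two_years: bool
    prior_conviction_for_predicted_crime_type: bool
    juvenile_conviction_before_age_16: bool
    committed_to_juvenile_facility: bool
    heroin_or_barbiturate_use_prior_two_years: bool
    heroin_or_barbiturate_use_as_juvenile: bool
    employed_under_half_of_prior_two_years: bool


def scale_score(items: ScaleItems) -> int:
    """Equal weighting: every item that applies adds one point (range 0 to 7)."""
    return sum(1 for f in fields(items) if getattr(items, f.name))


# Hypothetical offender record, for illustration only.
example = ScaleItems(
    incarcerated_over_half_of_prior_two_years=True,
    prior_conviction_for_predicted_crime_type=False,
    juvenile_conviction_before_age_16=True,
    committed_to_juvenile_facility=False,
    heroin_or_barbiturate_use_prior_two_years=False,
    heroin_or_barbiturate_use_as_juvenile=False,
    employed_under_half_of_prior_two_years=True,
)
example_score = scale_score(example)
```

Under equal weighting no item counts for more than any other, so two offenders with entirely different backgrounds can receive the same score.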
Greenwood notes that the scale is superior to methods which focus
on the criminal record of the offender because, just as in any
occupation, some offenders are successful and some are not; some are
highly proficient and some are less so. The probability of
apprehension is also low. He specifically notes that in "California, the
probability of arrest and conviction (q) computed from official data for
either robbery or burglary is .03--three chances out of 100."
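The weakness of the official record as a proxy for offense rate can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that every offense carries the same independent probability q of arrest and conviction; that independence assumption and the function names are introduced here and are not part of Greenwood's analysis.

```python
def prob_at_least_one_conviction(n_offenses: int, q: float = 0.03) -> float:
    """Chance that n offenses yield at least one arrest and conviction.

    Assumes each offense is an independent trial with the same probability q,
    which is a simplifying assumption made for this sketch.
    """
    return 1.0 - (1.0 - q) ** n_offenses


def expected_offenses_per_conviction(q: float = 0.03) -> float:
    """Average number of offenses per conviction under the same assumption."""
    return 1.0 / q


# An offender committing five robberies in a year (the median reported in the
# passage quoted earlier) versus one committing 87.
low_rate = prob_at_least_one_conviction(5)
high_rate = prob_at_least_one_conviction(87)
```

Under these assumptions an offender at the reported median is more likely than not to finish the year with no new conviction, which is one way of seeing why a record of convictions alone understates differences in offense rates.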
Greenwood notes that the primary argument against selective
incapacitation is moral and ethical, and arises from the false
positive issue. He thus acknowledges that some offenders who were
not high rate offenders would be selectively incapacitated. No
mention occurs, however, of the actual amount of such
improper sentencing. In a later work by Greenwood and Turner,
however, it is noted that, "Nearly 30 percent of the overall variance is
explained by these items." If 30% is explained, the implication
is that 70% of the variance is unexplained. The authors thus note:
It is clear that substantial differences in sentence lengths for the chronic offenders studied here cannot currently be justified on selective incapacitation grounds alone, because there are no reliable methods for either measuring or predicting future offense rates. Furthermore, the development of reliable offense-rate prediction models is hindered by (1) the methodological problems encountered in attempting to obtain accurate information about individual offense rates directly, through interviews or observations, and (2) the apparent weak correlation between individual offense rates and the most frequent used substitute measure for them, individual rates of arrest.
Differences From Institutional
Within community corrections, the impetus to develop classification systems had different starting points and was a reaction to different needs. In the correctional setting, in theory, the behavior of the inmate can be monitored and controlled at all times. The inmate is confined in a limited space, and all his needs and privileges are met, given out and controlled by the system.
In the community, the behavior of the offender cannot be monitored and controlled at all times, because a representative of the system cannot be with him at all times. The individual is also fending for himself in the community: his basic needs, privileges and rewards are not given by the system and thus cannot be directly controlled by it. Community corrections can therefore never equal the control of the institutional corrections system.
The more choices allowed the offender beyond the system's control, the greater the chance for responses of which the system will not approve. The potential for criminal activity in solitary confinement is minimal. For an offender living alone in the community, the potential for criminal behavior is greater.
How Probation and Parole Differ
Parole selection instruments are used to make an "in/out" decision within the system. They determine whether someone should remain incarcerated within the correctional system, under the utmost scrutiny, or whether they can be allowed to enter the community. The community corrections component called parole offers only limited scrutiny and only limited control over the offender's behavior. The choice is thus between two major systems of control, whereas probation instruments only differentiate between ranges of control within the community. Parole decisions thus make a choice that is more extreme in nature.
The parole decision is also open to more criticism. Probation is a judicial decision and parole is an administrative decision. In the case of probation, all criminal court judges can make the "in/out" decision. They determine whether the individual is to be incarcerated or released to the community on probation. Any criticism is thus diffused among the judiciary. The parole decision is made entirely by a Parole Board, which must bear alone the burden of criticism if a poor choice is made. Risk of failure is thus more of a burden and more important in the parole decision. It is therefore more important in the parole instruments.
The parole decision, however, is also driven by pressures different from those on the probation system. By releasing offenders from the institutional setting, parole can act as a valve to limit the institutional offender population. This gives the institutional corrections component a control over its inmate population that is independent of the judicial branch of government. Probation systems, however, cannot control their population levels without the judiciary. They can only limit the degree of supervision, or limit the number of offenders by going back to the judiciary. The probation method of limiting the system's load is to seek early discharges from probation, or to seek the termination of probation by returning the offender to the judiciary with a request for revocation. In general these are extreme uses of normal methods, and they are ineffective in limiting the volume coming from the judiciary.
The ability of corrections to control its offender population, however, carries with it more responsibility and potential criticism. It is also driven by different pressures. The institutional component is not as elastic as community corrections. The degradation of system performance in community corrections, from one officer supervising one offender to one officer supervising a thousand offenders, is not abrupt but ranges along a continuum. It also depends upon the type of offenders being supervised. In the institutional setting, however, problems start when the system cannot supply the basic needs of even one inmate. Even individuals who would pose a small threat in the community become difficult when the system cannot feed them, supply a place to sleep, or provide a space for exercise.
Historical Development of Parole
It is important to know that parole and probation had different
starting points. O'Leary notes that the origins of parole can be
traced to Australia and Ireland in the early 1800s. Captain
Alexander Maconochie, in 1840, established a unique penal system
on Norfolk Island, off the coast of Australia. Prisoners were not
confined for a set amount of time but only until they had completed
certain tasks. A prisoner could earn a maximum of ten marks per
day depending upon his performance. From his total marks,
deductions were made for food and various other services provided.
In theory, if the inmate selected the lowest types of food, performed
satisfactorily and drew no other supplies, he could net up to seven
marks per day. At various total levels, "his conditions of confinement
were made less severe"
and then he was conditionally released.
After his release, if his behavior was a problem, the
sentence was reimposed. When Maconochie returned to England in
1844 he wrote on penal reform and was active in the movement. Sir
Walter Crofton, who had been a disciple of Maconochie, in 1846
introduced the mark system into the Irish prisons, with a further
modification. Now the released prisoner was also subject to
conditions of behavior after his release and had to report
periodically to the police.
Most texts agree that the beginning of parole in the United
States occurred in New York State with the Elmira Reformatory in
1869. At that time the indeterminate sentence was shortened by
good behavior and then the individual was released to a, "6 month
parole term during which the parolee had to report regularly to a
volunteer guardian or sponsor."
In 1884 the state of Ohio
introduced the same principles into its state prisons.
It is of interest to note that New York State, which is credited
with the first implementation of parole in America, waited until 1889
to partially introduce the concept into the adult system. At that time
the law was revised to allow the sentencing court to impose a
minimum and maximum term of imprisonment within the prescribed
term. It has been noted however, that, "between 1889 and 1901 the
courts imposed indeterminate sentences in only 115 of 13,000
cases."
Probably as a reaction to the low utilization, the laws in
1901 were again revised to allow parole for first time offenders who
were serving a sentence of five years or less and who had
completed at least one third of their sentence.
Control of who was
released then passed from the Judiciary to the Parole System. In
1907 the laws were again revised to make, "indeterminate
sentences mandatory for persons convicted of felonies for the first
time."
In 1936, after the parole system was moved to the Executive
Department, the laws were again revised to make, "indeterminate
sentences mandatory for all crimes punishable by imprisonment for
terms less than life."
The question of who developed the first parole system has not
been agreed upon. Even in 1926, The Honorable Hinton G.
Clabaugh, who was the Chairman of the Parole Board of Illinois,
stated that, "Illinois is said to be one of the first states, if not the first
to enact a parole law."
The problem arises from the date when
parole was introduced into the adult system. Illinois had introduced
parole for misdemeanors in 1895 and for all felonies, with the
exception of treason and murder, in 1897.
Parole was concerned with classifying inmates regarding their failure potential if released early from prison. This form of classification not only reduces concerns regarding the equity of the selection but also reduces the uncertainty regarding criminal behavior while on parole. Parole classification instruments frequently determine whether the offender should remain incarcerated or can be released on parole. Probation instruments did not exist at this time. When they developed, probation instruments were concerned only with determining levels of supervision once community corrections had been chosen.
Lejins notes that the start of parole prediction occurred with
Professor Sam Bass Warner in 1923. In this initial attempt he
related characteristics available from reformatory records to
success and failure on parole among individuals released from a
reformatory. Only a limited relationship was noted. Shortly
thereafter Professor Hornell Hart suggested that an improved
method would show a relationship by combining several items into a
total score. Lejins thus credits Hart as the, "originator of the parole
prediction idea."
The first large scale study relating individual factors of the
offender and parole outcome however was accomplished by
Professor Ernest W. Burgess, a sociologist. The focus of this
internal study was actually to review the Illinois Indeterminate
sentencing law. Arrangements had been made for an unbiased
evaluation, with the University of Chicago, the University of Illinois
and Northwestern University, "to undertake an examination, analysis
and report on the entire record of the Parole Board for not less than
six year's past."
Three eminent men were selected for this
purpose and they formed the Committee on the Study of the
Workings of the Indeterminate Sentence Law and of Parole in the
State of Illinois. The research included in the report, which involved
three thousand men from the three major institutions, resulted in the
development of base expectancy rates for parole violation and the
implementation of an instrument into actual use. Burgess is thus
credited with the initial practical application of the concept.
The method consisted of examining the parole population,
which had a known violation rate. Then the violation rate for specific
sub-populations was computed and if it was lower or higher than the
total population violation rate, it was deemed relevant. Initially
twenty-two factors were proposed and studied to determine their
impact on parole violations. These factors were:
(1) The nature of the offense
(2) Number of codefendants
(3) Father's nationality
(4) Parental status (including broken homes)
(5) Marital status
(6) Type of criminal (first time offender etc.)
(7) Social type (hobo, gangster, etc.)
(8) County from which committed
(9) Size of his community
(10) Type of neighborhood
(11) Resident or transient when arrested
(12) Court recommended or denied leniency
(13) Sentence by plea bargain
(14) Nature and length of sentence
(15) Time served before parole (in months)
(16) Previous criminal record
(17) Previous work record
(18) Institution punishment record
(19) Age at parole
(20) Mental age
(21) Personality type
(22) Psychiatric prognosis
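The comparison described above, in which the violation rate of each sub-population is set against the rate for the whole parole population, can be sketched in a few lines. This is a minimal illustration of that comparison and not a reconstruction of Burgess's actual tabulations; the record layout, the function names and the six sample records are hypothetical.

```python
from collections import defaultdict


def violation_rate(records):
    """Share of parolees in `records` who violated parole."""
    return sum(1 for r in records if r["violated"]) / len(records)


def subgroup_rates(records, factor):
    """Violation rate for each category of one factor (e.g. marital status)."""
    groups = defaultdict(list)
    for r in records:
        groups[r[factor]].append(r)
    return {category: violation_rate(rs) for category, rs in groups.items()}


def departures_from_base_rate(records, factor):
    """Difference between each subgroup's rate and the overall rate.

    Following the text, any subgroup whose rate is higher or lower than the
    overall rate is treated as relevant; no significance test is applied.
    """
    base = violation_rate(records)
    return {
        category: rate - base
        for category, rate in subgroup_rates(records, factor).items()
        if rate != base
    }


# Hypothetical records, for illustration only.
sample = [
    {"marital_status": "single", "violated": True},
    {"marital_status": "single", "violated": True},
    {"marital_status": "single", "violated": False},
    {"marital_status": "married", "violated": False},
    {"marital_status": "married", "violated": False},
    {"marital_status": "married", "violated": False},
]
departures = departures_from_base_rate(sample, "marital_status")
```

A positive departure marks a category that violates more often than the population as a whole, and a negative departure marks one that violates less often.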
Burgess actually found a number of factors that correlated with an individual's success or failure on parole. These factors, which indicated a correlation, as opposed to the twenty-two factors originally studied, included:
(1) The general type of offense
(2) Parental and marital status (i.e., broken home)
(3) Type of criminal (first offender etc.)
(4) Social type of the offender (hobo, gangster etc.)
(5) Community factors (transient, farmland, etc.)
(6) Court recommended or denied leniency
(7) Time served before parole (in months)
(8) Previous criminal record (probation only, school,
penitentiary only, jail only etc.)
(9) Previous work record (none, casual labor,
irregular or regular work)
(10) Institution punishment record (solitary
confinement or demerits or demotions)
(11) Age at parole
(12) Intelligence rating
It should be noted that his tables did not indicate a linear relationship for many of the above factors. For example, intelligence as a factor in parole success showed a good deal of variation depending on the institution.
                 Pontiac   Joliet   Menard
Very Inferior       75.7     78.7     75.0
Inferior            85.3     76.6     72.9
Low Average         77.6     68.6     76.8
Average             82.9     68.0     76.5
High Average        80.2     75.9     60.0
Superior            73.2     83.3     65.2
Very Superior       90.5     76.2     60.0
Table I Intelligence as related to Percent Non Violators
Source: The Workings of the Indeterminate Sentencing Law and Parole System in Illinois, 1928 Table II pg. 265.
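The absence of a linear relationship in Table I can be checked mechanically: a linear trend must at least be monotonic, so it is enough to see whether the percentages rise or fall steadily across the seven intelligence ratings. The sketch below uses the figures from Table I; the function names are introduced here for illustration.

```python
# Percent non violators by intelligence rating, lowest to highest, from Table I.
NON_VIOLATOR_PCT = {
    "Pontiac": [75.7, 85.3, 77.6, 82.9, 80.2, 73.2, 90.5],
    "Joliet": [78.7, 76.6, 68.6, 68.0, 75.9, 83.3, 76.2],
    "Menard": [75.0, 72.9, 76.8, 76.5, 60.0, 65.2, 60.0],
}


def is_monotonic(values):
    """True if the series never decreases or never increases."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


def direction_changes(values):
    """Number of times the series switches between rising and falling."""
    steps = [b - a for a, b in zip(values, values[1:]) if b != a]
    return sum(1 for s, t in zip(steps, steps[1:]) if (s > 0) != (t > 0))


monotonic_by_institution = {
    name: is_monotonic(series) for name, series in NON_VIOLATOR_PCT.items()
}
changes_by_institution = {
    name: direction_changes(series) for name, series in NON_VIOLATOR_PCT.items()
}
```

None of the three institutions shows a steady rise or fall, and the pattern of reversals differs from one institution to the next.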

Figure 1 Age vs Percent Non Violators
Source: The Workings of the Indeterminate Sentencing Law and Parole System in Illinois, 1928 Table II pg. 265.
It should also be noted that age also did not possess a linear
relationship. It has a curvilinear relationship. This relationship can
be seen in the accompanying chart, that has been assembled from
his data. The highest percentage of non violators occurs in the less
than 21 year old group. The rate then decreases to the lowest point
of the 30 to 39 year old group but then rises again. Lejins notes that
the report by Sheldon and Eleanor Glueck, in 1930, set the stage for a
new change in thought. The new thought was the inclusion of
favorable background factors that were weighted on the basis of
their relationship to success and failure on parole. This, of course,
is precisely the method used today.
Later Ohlin brought forth the idea, after others, that the
experience tables should be adjusted continually. He suggested
that they should even be adjusted yearly, based upon the current
experience of the current parolees. Glaser notes that there was an
accelerated rate of acceptance for statistical tables during the
1970's and 1980's. He
cites two main reasons for the effect. The tables were now
projected as advisory information and the end users were involved in
the design. He notes that the beginning of this change occurred
circa 1960, with two researchers for the California Department of
Corrections.
Later the National Council on Crime and Delinquency brought
together virtually all the parole boards to examine their predictions
with hypothetical cases and statistical tables. Glaser further notes
that an outgrowth of this process was the Uniform Parole Reports
Program.
The major turning point, according to Glaser, was the
development of the federal parole guidelines. After their
development many other scales were developed in many different
areas.
It is important to note that probation developed differently than
parole. Old texts noting the historical roots of probation usually cite
probation as flowing from the volunteer efforts of John Augustus. It
is of interest to note that in 1928 the three eminent men who
completed the voluminous study, entitled The Workings Of The
Indeterminate-Sentence Law and the Parole System In Illinois, make
no mention of the legend of John Augustus. They note:
In using the terms probation and parole, it should be remembered that in the legal sense these words are not synonymous, although they are usually thought of as one and the same. Probation is conditional release (by the court) after conviction with no time served and is, in fact, a substitute for commitment to a jail or penal institution. Probation developed from the English practice of deferring judgment and sentence. Originally probation was limited to misdemeanors and it certainly was not expected that courts would apply it in thousands of cases of armed-robbery, burglaries, rape, manslaughter, and crimes of violence.
The legend of John Augustus is normally recounted as:
One day, in 1849, a cobbler, John Augustus, appeared before a Boston judge and asked that a young man charged with drunkenness be placed in his care, promising the court that he would be responsible for the young man's conduct. The request was granted and probation was, for the first time, officially recognized by the courts.
More recent works have ignored the legend and simply stated
that the "practice of probation appears to have arisen as a technical
means of avoiding the full impact of conviction."
The problem
occurred because the law and the prescribed penalties were quite
rigid and no mechanism existed to modify the sentence after it had
been imposed.
"They suspended proceedings between conviction and sentence,
creating an opportunity for the parties to explore other remedies to
prevent injustice being done by fixing the sentence."
Later legal
interpretations allowed this suspension to exceed the court term.
Still later the convicted person was released on his own good
behavior and individuals, "in the community served as a sort of
surety for his law abidance."
After a period of growth from a few
exceptional cases, legislators were asked to provide a firmer legal
foundation and, "to provide for officials to monitor the released
convicts in their community status."
Probation has now grown into
the most common sentencing option of the courts and as the prison
systems have exceeded their capacity, probation has been called
upon to absorb more severe offenders. The changing nature of the
probation population has then fostered a re-examination of what
probation supervision is and where those resources should be
concentrated.
On Friday October 11, 1985 Christopher J. Mega, then
Chairman of the New York State Senate's Crime and Correction
Committee, was to be the Keynote Speaker at the 16th Annual John
Jay Institute in New York City. At the last moment, however, State
Senator Mega could not deliver the address, entitled "Prison As An
Alternative To Probation," and his General Counsel delivered it
instead. The speech
demonstrated how the probation system, in New York State,
reversed roles with the prison system and how the probation system
became overcrowded. The significant text of the speech is as
follows:
I strongly believe that the force behind the current receptivity to expansion of community corrections and new variations on probation is prison overcrowding which has become a national problem. Let's indulge in some nostalgic trivia. In those grand old days of 1960, the police made 41,000 adult felony arrests in the whole state which resulted in 2,500 state prison sentences. The ratio of probation sentences was one to two.
By 1970, felony arrests had zoomed to 115,000 almost a tripling, leading to 3,000 prison sentences and 2,250 probation sentences. Probation sentences still lagged behind the prison sentences and were still manageable in their numbers. By the middle of the decade of the seventies, Governor Rockefeller's Mandatory Sentencing Laws began to affect the numbers. By 1975, the numbers were 124,000 felony arrests statewide, 8,600 prison sentences and 10,421 probation sentences. Probation now led the list as the favorite disposition of the sentencing judge.
In 1978, the state mandated prison sentences for violent felony offenders and limited plea bargaining to close the traditional loopholes. The impact of this can be clearly seen in 1984 when 150,000 felony arrests resulted in 15,384 prison sentences (almost a doubling) and 16,200 probation sentences. Probation stayed ahead of prison as the most frequent disposition. So within one generation, felony arrests went up 400%, state prison sentences went up 600% and probation sentences went up 1,000%.
Probation classification instruments developed quite recently. They thus had the benefit of the previous research done in the parole field. Probation classification occurs after the "in/out" decision has been made, so probation can afford the luxury of not having to be exceptionally accurate. Probation had in fact already been informally classifying offenders without significant criticism. Probation, however, had extreme administrative needs that were not being met by the informal system, and it was under pressure to demonstrate its effectiveness. The underlying assumptions and starting points for probation and parole classification systems, within community corrections, were thus very different, and the systems met different needs.
On October 21, 1977 the Comptroller General of the United
States submitted a report to Congress concerning the need for
probation and parole to be better managed. The focus of this report
was that field contacts and rehabilitation efforts needed
improvement, along with more efficient management of the district
offices.
Probation thus had different needs, pressures and a different
time frame. These forces did not act upon probation until quite
recently. Petersilia and Turner note that "classification instruments
began to influence probation field services in the mid-1970s ..."
The major instrument to influence the probation field came from
Wisconsin. Citing a State of Wisconsin document, they note that
"the probation departments were in serious need of an appropriate
and systematic way to allocate their limited staff resources."
An LEAA publication notes, concerning the Wisconsin Bureau of
Probation and Parole effort, that:
Instigated by a legislative mandate calling for better methods of determining staff requirements and effectively utilizing all staff, the project was launched with special funding from the Wisconsin Council on Criminal Justice.
This is consistent with the popular belief in the probation field, that the Wisconsin state agency was in serious danger of budget cuts unless it could be demonstrated to the state legislature that something scientific occurs on a probation caseload.
Petersilia and Turner also note that those probation officers, in
Wisconsin, were oriented towards rehabilitation and they were
therefore uncomfortable with a device that forced very frequent
contact levels with probationers likely to fail and low contact levels
with probationers who appeared to be more hopeful. A needs
instrument was thus incorporated into the classification, to allow
higher contact levels with probationers who while not posing a
threat, did require more frequent contact for the purposes of social
casework. It should be noted that this needs instrument was never
empirically validated in Wisconsin nor in other states, such as New
York, which adopted it. Glaser simply notes that:
The Wisconsin system also uses an initial needs assessment form and, every six months, a needs reassessment form, which are derived not from statistics on past experience but from a consensus of agents on the relative importance of various types of assistance that their clients require.
The authors finally note that most probation departments, "now use
a combination of recidivism-prediction and needs-assessment
scores to assign levels of community supervision."
These classification systems were almost all patterned after the Wisconsin system, after the intervention of the National Institute of Corrections. This cloning and mutational replication were primarily the result of the NIC Model Probation System. The Wisconsin system was a total information package of which the risk assessment was only a portion. The advantage of the Wisconsin System was the totality of the management package and this formed the impetus for the rapid spread.
Classification initially was dominated by the clinical method. In
this case an experienced practitioner made an educated judgment
concerning the likelihood of the behavior occurring or not occurring.
This personal expertise was then challenged by the actuarial method,
and the superiority of this new method over most individual
diagnosis, for groups, has been consistently demonstrated.
Gottfredson (1967) explains this as a difference between "wide band"
and "high fidelity" approaches. The wideband approaches include
procedures such as interviews, projective testing, written evaluations
by clinical or custodial institution staff, or a social history report. Wideband
procedures, he notes, have been, "found to be unsatisfactory, by
any usual standards of reliability and validity, for prediction of
specified behavior."
Noting interviews, he states, "repeatedly,
comparisons have shown statistical prediction devices to be more
valid."
Steadman (1982) in summarizing the literature, "ranging
from academic performance to job turnover"
notes that, "statistical
prediction consistently has been more accurate than clinical
predictions"
even for violent behavior of the mentally ill. Monahan
(1981) simply states, "in virtually all of the studies that have tried to
compare clinicians and actuarial tables in predicting the same
events, the tables have proved the more accurate."
For violent
behavior however, he concedes, the results have been mixed.
Of course, in comparisons of actuarial prediction with professional judgment, the assumption is always that the prediction is made at some early point. This ignores the fact that, as time progresses, the professional supervision agent amasses more and more experiential data upon which to base his or her decision.
The actuarial method, however, seeks to draw distinctions between groups of offenders by utilizing selected items from individual offenders' backgrounds. The first major difficulty with the actuarial method is that it seeks to separate the population into two groups by incorporating any variable that will enlarge the difference between them. While it points out extremes, it does not enlarge our knowledge of the factors that impinge upon the offender to cause movement from one major group to the other. Variables are selected that highlight the differences, rather than demonstrate why those differences occur. For example, a major variable that has been used is the offender's prior record. While this item is very powerful in distinguishing between those groups of individuals who are likely or not likely to again engage in criminal conduct, it does not illuminate the factors that cause this to be so. The medical analogy to this item would be a chest x-ray to determine whether tuberculosis is present. While the procedure does effectively detect the condition, it does not shed light on how the condition was brought about. In the area of community corrections, finely tuning prior record to determine those groups likely to recidivate shows that criminals will be criminals, not how they came to be that way. The focus of the actuarial method is thus to draw distinctions, not to expand knowledge.
The actuarial method, since it seeks power in distinguishing between groups, will use any variable that fits the purpose. Thought is thus not given to establishing a theme that corresponds to a theory of criminology. This open-ended approach to variable selection leaves the technique open to the introduction of bias, because there is no stated framework for variable selection.
While the actuarial method could be used upon subgroups of offenders, it usually is not. The focus has been to determine those likely to recidivate among the whole offender population. Crime is thus viewed, in the instruments, as a general disease that causes all manifestations of the negative behavior. If the focus were to explain criminal behavior, then the emphasis would be on more limited theories concerning offender subgroups, with the hope of linking the individual theories at a later point. The actuarial method as now used assumes that those who specialize in robbery are quite similar to those who specialize in petit larceny. Obviously, this is not the case. This emphasis on evaluating the whole offender population has value as a general screening instrument for administrative purposes, but limited value as a diagnostic aid for the individual officer or for society.
The literature in the area of predicting human behavior has emphasized the actuarial approach over the individual clinical approach. Within the community corrections literature, no mention has been made of a need to incorporate current criminological theory into the design of the instrument. Theory has thus taken a back seat to the identification of groups that will probably succeed or not succeed on probation or parole. The current general risk assessment literature has, however, shown that risk assessment is more an art than a science. It has also demonstrated that the various instruments in use within probation explain only a small portion of the variance (R² = .10-.25) and that they apparently do not transport well to other populations. It has also demonstrated that even the total variables incorporated in the offender's file are not very accurate in determining the individual's failure potential. The most recent literature has also noted that although one impetus to create the instruments was to ease racial discrimination, the instruments may actually heighten racial differences. This then poses the question of whether certain variables should be retained for the sake of maximum prediction or discarded because they are racially biased.
There are obvious questions. If the variable selection process is somewhat open ended, are there any uniform processes occurring or is it random? If no established theories of crime generation are used, are class biases of crime generation used? It is understood that the choice of variables is highly dependent upon the data available in the offenders' files. The question then is whether the system, although not acknowledging a theory of crime generation, may be operating on a class bias theory of crime generation.
To make this inquiry manageable the focus will concern a limited area within this sea of uncertainty. One purpose of classification is to determine offender risk to the community and most classification instruments incorporate a risk scale. The focus will thus be offender risk and the items associated with it.
The Availability Of The Variables
In The Probation Systems
The variables used to create the risk instrument items have a certain pattern. In the community corrections systems they tend to be clustered in three main areas: the individual's prior record, past system performance and program needs. Probation systems record the arrest records of the offenders processed through the system. They thus tend to emphasize those records when recording information or talking about an individual currently within the system. Case records are also concerned with the past performance of the individual in various parts of the system. The probation system is composed of both Peace Officer and Social Work adherents. While no one officer is either a true Peace Officer or Social Worker, it is helpful to view them in terms of the theoretical types. The Peace Officer theoretical type is concerned with the protection of the community. This is accomplished by monitoring the offender. The Social Worker theoretical type is concerned with successfully modifying the offender so they are no longer in conflict with societal norms of conduct. Variables of these two types are thus included in the file records.
The social work variables are associated with a concern for program referral and are reflections of middle class values. The abuse of substances is thus recorded with a history of previous program performance both success and failure. Employment history is also noted and the need for future training or referrals. Mental Health diagnosis is also noted and recommendations for future and past referrals to mental health agencies.
As Related To The Specific Orientation
How the variables are selected, for future study or for inclusion or retention in an agency's data base or case records, relates to the area of interest and what the agency wants to protect or change. If the focus is individual centered, then the study will focus on items such as individual test scores, personality types or measurable mental attributes. If the focus is psychiatric classification, then various diagnostic schemes would be the primary organization of the case record.
If the focus is social work then the emphasis will be programs that can change individuals to a more productive middle class lifestyle. Information collected might include: an employment history with work skills noted, substance abuse problems and individual counselling efforts. The focus of this information will be to determine the extent of the individual's deficit, program referrals, potential referrals and previous performance.
A new field of interest related to the processing or prediction of individuals contained within the criminal justice system might collect even more divergent information. If the focus were biological, such as how chromosomes affect criminal behavior, then the information collected would most assuredly contain medical information relating to chromosomes, blood type, medical history, etc. With a focus on physical characteristics, entirely different information would be collected. To predict criminal behavior by the bumps on one's head, some mapping scheme for the head would be needed, along with the relationship of those features to previous or current criminal justice involvement.
Group centered approaches include such information as the individual's clique, party or congregation, etc. Sociological orientations would include relative information on the individual's social place within the system and broad based demographic information. Such an approach would also be theory based and indicate how the external force of his society or group brought him into the criminal justice system.
The probation system has elements of all the above but it also has its own focus. The probation system has information that only it controls. It then naturally tends to highlight that information. Such exclusive information includes the individual's arrest record and related processing information through the system. It also includes information concerning past successes or failures in the parts of the system. The probation system is, however, program oriented in a certain social work tradition. It thus notes an individual's program need regarding how that individual may be changed or modified to be incorporated into our social system. Such programs could include job training to render them employable, or substance abuse programs to terminate their use of chemical agents.
The leaders of the probation systems are concerned about its image. The systems will thus monitor and track, or totally ignore, those aspects that might pose an embarrassment. An example of this would be violations of probation. To the peace officer adherents a violation of probation is a success. It demonstrates the system's ability to monitor probationers and report back to the court when a probationer is not performing according to the court's mandates. To the social work adherents a violation of probation is a failure because they have failed to change the behavior that the court has specified. In systems dominated by the social work view, violations of probation are not extensively studied. In systems dominated by the peace officer view, violations of probation would be monitored as an indicator of success. Violations and their subsequent revocations of probation are thus touted as a program highlight in Peace Officer dominated systems and not emphasized in Social Work dominated systems.
Monahan has noted a number of variables that are related to violent behavior:
The research studies on the statistical prediction of violent behavior have yielded a wide variety of results, ranging from substantially less accurate to substantially more accurate than the studies of clinical prediction, depending on what criterion of violence was used. The factors most closely related to the occurrence of violent behavior appear to be past violence, age, sex, race, socioeconomic status, and opiate or alcohol abuse. Estimated IQ, residential mobility, and marital status also are related to violent behavior. Mental illness, however, does not appear to be related to violence in the absence of a history of violent behavior.
our results indicate that the type of individual most likely to return to prison (and most likely to have a small value of time until recidivism) is a young, black male with a large number of previous incarcerations, who is a drug addict and/or alcoholic, and whose previous incarceration was lengthy and for a crime against property.
Monahan has noted, "If there is one finding that overshadows all others in the area of prediction, it is that the probability of future crime increases with each criminal act." This fact has indeed overshadowed all others in the area of criminal justice processing of cases. It is the main item that both defense attorneys and prosecutors examine, and it is also the primary item viewed by the judiciary. The modeling of this variable within the area of research, however, has not mimicked the real world of the criminal justice system.
The criminal record of an individual denotes only the instances where he was arrested and possibly processed through the system. It does not reflect the times he was successful at eluding the system. It does however provide an indication of the onset of the illegal behavior and the extent of the behavior.
Even as limited as are the current research schemes concerning
this variable, some interesting analysis is still possible. Criminal
history items demonstrate the power of using the past performance
of the individual, to predict future behavior. As Wolfgang notes,
...if a person is arrested four times, the probability that it will happen a fifth is 80 percent. If a person is arrested 10 times, the probability of an eleventh arrest is 90 percent and the probability that the offense will be a serious or 'index' offense (although not necessarily a violent one) is 42 percent.
This correlation is apparent to members of the criminal justice system. It tends to explain the processing of cases, within the system, concerning case outcomes.
The Vera Institute of Justice noted this processing bias in the past:
Defendants with heavier criminal histories were more likely to be convicted and, if convicted, more likely to receive heavier sentences than those with lighter or clean records. Seventy-seven percent of convicted defendants with no prior record avoided jail or prison; only 16% of convicted defendants who had previously been sentenced to prison were as fortunate.
Without the experience data, noted previously (Wolfgang), it is sometimes believed that those with more severe records are somehow being treated unfairly. In actuality, the information is only demonstrating a selection process that indicates increased emphasis for those cases that are deemed to have a higher threat level in the future.
There is a problem in the use of legal history information for prediction. While most studies concede that it is an important variable, this writer believes it is not being properly used. The current systems merely count offenses or they record a limited period, such as time since last offense. Thus much information is lost. When a District Attorney or Judge examines a criminal history, they are seeing relationships that are not being accounted for in the present day research methods. The slopes of the offense pattern, the timing of offenses between each other, and characteristic patterns of offenses, yield a good deal of information. They are however ignored in most research schemes. The problem, in this area, appears to be the devising of an Artificial Intelligence (AI) algorithm to mimic the decision process. Too frequently the information is structured for an already conceived research tool such as SPSS.
If we were to examine an extensive record, the initial question that comes to mind is whether negative behavior is increasing or decreasing, or whether the behavior has been consistent during the time interval. Current modeling of this item has not been effective. Simple counts of the number of misdemeanor and felony arrests do not show whether the severities of the offenses are increasing or decreasing. There is a vast difference between an individual who started his criminal career with three misdemeanor arrests followed by two felony arrests and an individual who has two felony arrests followed by three misdemeanor arrests. If we also speculate that the individual who started with the misdemeanor arrests had a criminal record that was compressed into a two year period, and the individual who started with the felony arrests had a record that spanned ten years, then we can see a remarkable difference between these two individuals. If we further speculate that the individual who started with the two felonies had a two year lapse of activity after the felonies, followed by the three misdemeanor arrests and then no activity for a number of years, the difference is even more remarkable. Currently such timing and slope information has not been incorporated into the analysis of criminal records by researchers. Such information is, however, apparent to prosecutors, judges and other members of the criminal justice system. Granted, the application of such knowledge is another matter.
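The contrast drawn above can be made concrete with a small sketch. The Python below is illustrative only: the severity coding (misdemeanor = 1, felony = 2), the function name `record_profile`, and the example arrest dates are all assumptions introduced here, not part of any existing instrument. It shows that two records with identical counts can be told apart once the direction of severity and the spacing of arrests are computed.

```python
from datetime import date

# Assumed severity coding, for illustration only.
SEVERITY = {"misdemeanor": 1, "felony": 2}

def record_profile(arrests):
    """Summarize a list of (arrest_date, offense_class) pairs.

    Returns the simple counts plus three items a plain count loses:
    the least-squares slope of severity against arrest order
    (positive = escalating), the span of the record in years,
    and the longest gap between consecutive arrests in years.
    """
    arrests = sorted(arrests)
    sev = [SEVERITY[offense] for _, offense in arrests]
    n = len(sev)
    mean_x = (n - 1) / 2
    mean_y = sum(sev) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (s - mean_y) for i, s in enumerate(sev))
    dates = [d for d, _ in arrests]
    gaps = [(b - a).days / 365.25 for a, b in zip(dates, dates[1:])]
    return {
        "misdemeanors": sev.count(1),
        "felonies": sev.count(2),
        "severity_slope": sxy / sxx if sxx else 0.0,
        "span_years": (dates[-1] - dates[0]).days / 365.25,
        "longest_gap_years": max(gaps) if gaps else 0.0,
    }

# Hypothetical record 1: three misdemeanors, then two felonies, within two years.
escalating = [
    (date(1988, 1, 10), "misdemeanor"), (date(1988, 6, 1), "misdemeanor"),
    (date(1988, 11, 15), "misdemeanor"), (date(1989, 5, 1), "felony"),
    (date(1989, 12, 20), "felony"),
]
# Hypothetical record 2: two felonies, a two year lapse, then three
# misdemeanors, spread over roughly ten years.
declining = [
    (date(1979, 3, 1), "felony"), (date(1979, 9, 1), "felony"),
    (date(1981, 10, 1), "misdemeanor"), (date(1984, 5, 1), "misdemeanor"),
    (date(1989, 2, 1), "misdemeanor"),
]

escalating_profile = record_profile(escalating)
declining_profile = record_profile(declining)
```

Both hypothetical records return three misdemeanors and two felonies, yet the severity slope is positive for the first and negative for the second, and the spans differ by a factor of about five. A linear slope is only one of many possible summaries; it is used here because it is the simplest way to express "increasing or decreasing" as a number.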
Most serious crimes are predominately a male affair, and while female offending has risen over time, males still dominate the activity. In 1987, while there were 16,714 arrests for murder, only 2,085 involved females. Thus females accounted for only 12.5 percent of the offenses. For robbery there were 123,306 arrests and only 9,964, or 8.1 percent of the total, involved females. Even when only property crimes are considered, females account for only 24.4 percent of the total. Only in the areas of forgery and counterfeiting (34.4%) and fraud (43.5%) do the female rates even start to approach those of their male counterparts. Criminal activity in general is thus still primarily a male activity.
Race is another variable that is relatively easy to verify at all stages within the criminal justice system. Race is, however, a variable that is fraught with controversy and dilemma. Nationally, in 1987 blacks were the largest racial group involved in crime. Concerning violent crime they represented 52.5% of the total of 399,133 offenses and 63.2% of the property crimes. The logical assumption from such high ratios is that if you knew the individual were black it would add something to the prediction of future offending. Various arguments about this relationship have been presented. One side has argued that the system discriminates against blacks and more arrests and convictions thus result. The opposing side has argued that there is simply more serious crime amongst blacks. Silberman states the problem very well.
In the end, there is no escaping the question of race and crime. To say this is to risk, almost to guarantee, giving offense; it is impossible to talk honestly about the role of race in American life without offending and angering both whites and blacks - and Hispanic browns and native American reds as well. The truth is too terrible, on all sides; and we are all too accustomed to the soothing euphemisms and inflammatory rhetoric with which the subject is cloaked.
Most researchers exclude it for moral and ethical reasons, while others, such as Schmidt and Witte, argue that it must be included in the initial research. They argue that this variable can be excluded later with better results. They present the view that it is necessary in the initial research to determine the explanatory power of the variable and to determine if any other variables are highly correlated with it. If this is not done, then race may be officially excluded from the model but replaced by another variable that measures race in an indirect fashion. The measurement of race is thus hidden from the public view but still used. This substituting of a variable that correlates highly with race should be more offensive than actually incorporating it in the instrument. It is usually, however, not readily apparent. If race were included in the initial research, then when the finished model is built it can also be determined whether any significant explanatory power was lost by the exclusion of race related variables.
Most studies attempt to use this variable, in various degrees.
Indeed the list of studies that offer correlations between income and
recidivism is lengthy. Some logical anomalies in the applicability of
this variable however do exist. As Silberman notes
As a group, New York's Puerto Ricans are poorer than its blacks. The median family income among Puerto Ricans is 20 percent below the black median, and the proportion of families officially classified as poor is half again as high. Puerto Rican New Yorkers have less education than blacks, and a larger proportion hold menial jobs....If violence were a simple function of poverty and social class, therefore, one would expect as much violent crime among Puerto Rican and other Hispanic residents of New York as among black residents. In fact, the rates are strikingly different. According to an analysis of police statistics by David Burnham of The New York Times, 63 percent of the people arrested for violent crimes in the period 1970-72 were black, and only 15.3 percent were Hispanic.
The above example demonstrates that simple measures are not adequate to the task. While income, education levels and employment do provide indications, the picture is incomplete. Better measures are needed in this area.
Alcohol and substance abuse are popular variables for inclusion in models. They represent the middle class view that if only these individuals did not abuse these terrible substances all would be different. Not all scenarios, however, list alcohol and substance abuse as the cause of crime. Many census tracts in some cities contain a very high proportion of individuals who abuse these chemicals, but obviously not all these individuals are involved in criminal behavior.
If a variable reaches across forty to fifty percent of the cases and the failure rate is, say, ten to twenty-five percent, it ceases to be predictive. For now almost half of the cases have it but less than one fourth fail. It therefore ceases to discriminate. In some probation systems alcohol abuse approaches the sixty percent level; it therefore cannot be a predictor. It is, however, a popular item because many individuals see the "demon rum" as a source of crime.
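The arithmetic behind this point can be sketched as follows. The Python below is a hedged illustration, not a method drawn from the literature reviewed here: the function name `best_case_failure_rate` and the sample figures are assumptions. It computes the highest failure rate that could possibly be observed among cases carrying an item, a ceiling reached only if every failing case carries it.

```python
def best_case_failure_rate(prevalence, base_rate):
    """Upper bound on the failure rate among cases that carry an item.

    prevalence: share of all cases with the item (a proportion, 0-1).
    base_rate:  share of all cases that fail (a proportion, 0-1).
    The bound is reached only in the most favorable situation, where
    every failing case carries the item.
    """
    if not 0 < prevalence <= 1 or not 0 <= base_rate <= 1:
        raise ValueError("prevalence and base_rate must be proportions")
    return min(1.0, base_rate / prevalence)

# Item carried by half of the cases, one quarter of all cases fail.
half_and_quarter = best_case_failure_rate(0.5, 0.25)

# Item carried by sixty percent of the cases, ten percent of all cases fail.
sixty_and_ten = best_case_failure_rate(0.6, 0.10)
```

Under these assumed figures, an item carried by sixty percent of cases in a population where ten percent fail can flag a group in which, at the very best, about one case in six fails. The bound says nothing about whether such an item discriminates at all in practice; it only limits how sharply it can.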
Barbara Critchlow examined the powers of "John Barleycorn" in an extensive review of the literature and questions some of the popular beliefs and, more importantly for this work, the basic belief that alcohol and crime are causally linked. She notes that logically the range of behavior attributed to alcohol cannot exist:
The effects of alcohol on human behavior are extraordinary. Under its influence, strong men cry, the weak become brave, enemies become friends and friends enemies... history and popular culture show us examples of inebriates doing everything from chopping up their children to treating an impoverished Charlie Chaplin to dinner.
Typically such ranges of behavior vary by culture and by time periods within cultures. The attributed effects are thus culturally determined. Even within our own culture "John Barleycorn" has been transformed from a substance from God that was used very heavily by our founding fathers; to an evil substance that should not be used at all, during the temperance movement. She notes that part of this transformation was attributed to an altered view of the causes of antisocial behavior.
While it is not within the scope of this thesis to examine this concept in depth, it should be noted that excessive drinking excuses negative conduct without permanently destroying the individual's moral standing in the community. It may thus be this factor that explains its widely cited use during the commission of offenses and its poor predictive powers, related to use and abuse, in the risk tools.
Between The Defendant And The Victim
A variable that is seldom used in prediction models but is
acknowledged to affect case outcome processing and by implication
recidivism, is the relationship between the defendant and the victim.
As the Vera Institute of Justice notes
Much of what we found was startling. In half of all the felony arrests for crimes against the person, the victim had a prior relationship with the defendant. Prior relationships were frequent in cases of homicide and assault, where they were expected, as well as in cases of robbery, where they were not. Even in property crimes, prior relationships figured in over a third of the cases. This unanticipated level of prior relationships proved significant to the outcome of cases.
In the total victim felonies studied, 47% of the victims had a prior relationship with the defendant.
This led to high rates of dismissals and charge reductions. [Specifically they noted] At the root of much of the crime brought to court is anger - simple or complicated anger between two or more people who know each other. Expression of anger results in the commission of technical felonies, yet defense attorneys, judges and prosecutors recognize that in many cases conviction and prison sentences are inappropriate responses. High rates of dismissals or charge reduction appear to be a reflection of the system's effort to carry out the intent of the law -as judges and other participants perceive it - though not necessarily the letter of the law.
The age of the offender is another relatively easy variable to collect but difficult to model. In general it has been known that, "As violence feeds on the energy of youth, so age mellows even the most habitual offender." This yields a U shaped relationship. It is also known that, "not only one's current age, but the age at which one first comes in contact with the police, appears to relate strongly to criminal behavior." It has also been known that various crimes are age specific, in that certain crimes tend to be committed by different age groups. For example, in 1975 and 1976 youths under the age of eighteen accounted for almost half of the arrests for burglary. It has also been known that the age specific character of the type of offenses has been somewhat stable over time. English statistics indicate that burglars and robbers of the fifteenth and sixteenth centuries were approximately the age of the contemporary burglar and robber.
The major problem in modeling this variable has been that the age relationship is not symmetrical over the entire crime spectrum and that the relationship is not a linear one. It is thus difficult to model if all offenses are included and the non linear aspects are not controlled for.
A major deficit of prior research has been the modeling of this variable. Experience by this writer has demonstrated that the relationship of age to the sanction imposed is not a linear function but a curvilinear one. Research by Wheeler, Weisburd and Bode (1982) has also illuminated this point. They state, "The relationship of age to the sanction is very curvilinear..." Their assumption that it is modeled better by a combination of age and age squared has yet to be fully demonstrated.
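Written out, the age plus age squared specification mentioned above takes the following general form. The symbols are notation assumed here for illustration ($S$ for the sanction, $\beta_0, \beta_1, \beta_2$ for coefficients to be estimated, $\varepsilon$ for error) and are not taken from the study cited.

```latex
S = \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{age}^{2} + \varepsilon,
\qquad
\mathrm{age}^{*} = -\frac{\beta_1}{2\beta_2} \quad (\beta_2 \neq 0)
```

The curve turns at $\mathrm{age}^{*}$: it peaks there when $\beta_2 < 0$ and bottoms out there when $\beta_2 > 0$. A strictly linear model is the special case $\beta_2 = 0$, so whether the squared term earns its place is an empirical question about whether the estimate of $\beta_2$ differs reliably from zero.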
The variables used in the development of the major instruments have been remarkably stable. Unless change is introduced into the system, they will continue to be used into the foreseeable future. In the next chapter the methods used to devise the major instruments will be examined. In that chapter it is noted that no one method has demonstrated its superiority over any other method. An unchanging complacency has thus evolved in the development of the instruments. This has led to a satisfaction with the status quo, a confidence in the current state of the art.
METHODS OF DEVISING RISK INSTRUMENTS
Monahan, agreeing with Meehl, lists four types of prediction related to a clinical judgment: clinical data combined clinically, clinical data combined statistically, statistical data combined clinically and statistical data combined statistically.
With clinical data combined clinically persons are assigned to diagnostic categories, on the basis of the clinical judgment and on the basis of that assignment certain expectations of their behavior can be made. In clinical data combined statistically the probabilities of negative behavior are assigned to that specific diagnostic category.
Statistical data combined clinically is operative in the case of predictions made on the basis of psychological test scores. What we are concerned with is the type of prediction that occurs when statistical data is combined statistically to yield a prediction. The simplest form of this type occurs with an insurance company actuarial table.
In the actuarial table, items are grouped into categories that yield predictive powers. The most renowned of this type is the actuarial table listing life expectancy and current age. In this scenario for each age listed on the chart, the insurance company has computed the usual remaining years of life for each age group. Burgess, as noted in the review of the literature, also used this method to assign the probabilities of parole success, based upon certain characteristics. What is of interest in this chapter is the design of an instrument that takes into account many prescribed variables and can place the individual in many categories of success or failure. For practical reasons in the risk assessment instrument method, cutoff points are arranged to limit the number of potential categories.
Two major types of variables are used in the design of the instruments and two major categories must be established before an instrument can be devised. The goal of most classification schemes is to identify groups of individuals so services, procedures, or more extensive classifications can be made. The goal of most predictive schemes differs from normal classification in that it attempts to identify groups that will in the future possess some specific characteristics. The idea is thus to determine classifications for some future point in time. It is thus necessary to first determine the purpose for which the instrument will be used. Many times a multiplicity of reasons will be found but clear theoretical groupings should be established at the outset.
Two types of individuals must then be theoretically established: those that will succeed and those that will fail. The real question, however, is failure or success at what? It is thus essential to develop clear indications of failure and success. There are individuals who violate some specific technical rules and whose community supervision is revoked. There are also other individuals who are arrested and convicted of new crimes, and those that commit heinous crimes as well. If failures were defined as all of the above, then how does one distinguish between the individual who will fail to remain employed and the individual who will murder someone during a robbery?
There are however advantages and disadvantages for grouping categories of failure and for very limited definitions of failure. For the more heinous crimes the percentage of those individuals who commit them is very small. It is thus very difficult to establish accurate indicators because the sample of cases is very low. By combining categories of failure the failure pool is increased but the indication of failure is blurred. A paradox thus results. As the focus of the specific negative acts is narrowed the act specific accuracy is increased but overall instrument accuracy is diminished because the failure pool has decreased. As the range of negative acts is increased the specific value of dangerousness is decreased, because the number of definitions of failure has increased, but the overall instrument accuracy has increased because the failure pool has increased. This paradox must be resolved and a reasonable balance established before a meaningful result can be achieved.
It is thus best to establish clear ranges of categories, of failure and success, before proceeding. Such a system might include those with the highest occurrence within the population to those that are the least frequent.
The establishment of the definition of failure for each category of failure is referred to as the criterion. Once the failure criterion is established, it is then possible to establish the variables that are related to that category. In theory any variable might be selected to distinguish between various categories of success and failure.
               | PREDICTION YES | PREDICTION NO |
CONDITION YES  | TRUE           | FALSE         |
CONDITION NO   | FALSE          | TRUE          |
1 Predictions and conditions which are true and false
As Clear notes, most instruments contain fewer than ten variables, and it is suggested that at least 50 cases be used for each variable. A sample size of 500 cases is thus adequate for most studies. Some disagreement in the field, however, revolves around the verification of the statistical results. Some recommend that the initial sample be divided into a construction sample and a validation sample, while others urge that a new sample be drawn. Both samples should be greater than 500 cases, however.
For the instrument to be useful it must provide information beyond the base rate and be able to discriminate fairly well. The base rate refers to the level of dysfunction normally occurring in the population for that category. The ability to discriminate fairly well refers to the false positive and negative issue.
In any study of projected criminal behavior some individuals will be correctly identified and some will not be. This difference has been defined as the false negative and positive issue. The accompanying figure illustrates this issue. Some individuals will be predicted as having a certain condition at some point in time. Later if that condition is found it can be said that the prediction was true. Some individuals will be predicted as not having a certain condition at some point in time. Later if that condition is not found it can be said that the prediction was true.
In any study, however, some individuals will be incorrectly categorized. Some will be identified as not possessing the negative characteristics but will have them. These are labeled false negatives. Some will be identified as possessing the negative characteristics but will not have them. These are labeled false positives.
                          | PREDICTION YES CONDITION  | PREDICTION NO CONDITION   |
CONDITION DOES EXIST      | TRUE POSITIVE PREDICTION  | FALSE NEGATIVE PREDICTION |
CONDITION DOES NOT EXIST  | FALSE POSITIVE PREDICTION | TRUE NEGATIVE PREDICTION  |
In the accompanying chart this relationship is expressed. A condition which was predicted and which was found is noted as a true positive. A condition which was predicted but which was not found is noted as a false positive. A condition which was not predicted and which was not found is noted as a true negative. A condition which was not predicted but which was found is noted as a false negative.
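The four cells described above can be tallied mechanically. The following Python is a minimal sketch under stated assumptions: the function names `classify` and `tally` and the five example cases are hypothetical, introduced only to show how each case falls into exactly one cell once the prediction is compared with the later outcome.

```python
from collections import Counter

CELLS = ("true positive", "false positive", "false negative", "true negative")

def classify(predicted, found):
    """Label one case by comparing the prediction with the later outcome."""
    if predicted and found:
        return "true positive"
    if predicted and not found:
        return "false positive"
    if not predicted and found:
        return "false negative"
    return "true negative"

def tally(cases):
    """Count the four outcomes over (predicted, found) pairs."""
    counts = Counter(classify(predicted, found) for predicted, found in cases)
    # Report every cell, including those with no cases.
    return {cell: counts.get(cell, 0) for cell in CELLS}

# Hypothetical cases: (condition predicted?, condition later found?)
example_cases = [
    (True, True), (True, False), (False, False), (False, True), (False, False),
]
example_counts = tally(example_cases)
```

Reporting all four cells, including empty ones, is a deliberate choice here: an instrument that never produces a false positive because it never predicts the condition at all shows up plainly as a row of zeros.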
This relationship can be seen in an illustrative example of the development of two hypothetical instruments called the Absurd One and the Absurd Two. In this example the problem is the potential release of 100 individuals with a history of X antisocial behavior and a predicted base rate (rate of normal occurrence) for X behavior, within this group, of 10%. For this group then, the developers devise a question such as "is he alive." All the members of the group then having been found to be positive on that item are also found to be in danger of committing X behavior upon their release. After they are released it is then determined that 10 have committed X behavior, while in the community. The developers of the Absurd One then pat themselves on the back for correctly identifying 100% of the individuals in danger of committing X; however, they had to concede that with the current state of the art a 90% false positive rate had to be accepted with the 100% success rate in prediction.
In the next example, the same developers, with the same problem, want to identify who will not commit X upon release. The Absurd Two is then developed and the cutting question "is he breathing now" is asked. As the result of this test all 100 are found not to be in danger of committing X upon their release. After the release, when it is found that 10 have committed X in the community, the developers again pat themselves on the back. The new instrument called the Absurd Two has just demonstrated that it correctly identified 90% of those not in danger of committing X, with a false negative rate of only 10%. Most important was the announcement that this incredible instrument had a false positive rate of 0%. A truly remarkable instrument, as indicated by the resulting numbers.
As the examples of the Absurd One and the Absurd Two illustrate, the beauty of an instrument, and of its ratio of false positives to false negatives, is in the eye of the beholder and depends on how the developers want the results viewed.
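The arithmetic behind the Absurd One and the Absurd Two can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and each rate is computed as a share of the whole group of 100, which is how the developers in the example report their figures. That is an assumption, and it differs from rates conditioned on the true outcome.

```python
from collections import Counter

CELLS = ("true_positive", "false_positive", "false_negative", "true_negative")

def confusion_counts(predicted, actual):
    """Count the four cells of the prediction chart."""
    cells = Counter()
    for p, a in zip(predicted, actual):
        if p and a:
            cells["true_positive"] += 1
        elif p and not a:
            cells["false_positive"] += 1
        elif not p and a:
            cells["false_negative"] += 1
        else:
            cells["true_negative"] += 1
    return cells

def share_of_group(cells):
    """Express each cell as a share of the whole group (assumed reporting basis)."""
    total = sum(cells.values())
    return {name: cells[name] / total for name in CELLS}

# 100 releasees with a 10% base rate of X behavior.
actual = [True] * 10 + [False] * 90

# Absurd One: "is he alive" flags everyone as in danger of X.
absurd_one = confusion_counts([True] * 100, actual)
# Absurd Two: "is he breathing now" flags no one as in danger of X.
absurd_two = confusion_counts([False] * 100, actual)

for name, cells in (("Absurd One", absurd_one), ("Absurd Two", absurd_two)):
    print(name, share_of_group(cells))
```

Both instruments ignore the individual entirely, yet each can be described flatteringly by choosing which cell of the chart to report.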
As Monahan notes, a cutting score is nothing more than a particular point on some objective or subjective scale. For example, a thermostat setting of 70 degrees is simply the point below which the heater runs and above which it does not. It is thus used as a selection point for some action.
In most applications the cutting score is used to determine what action will be taken based upon the prediction score. This action could either be increased supervision levels or decreased supervision levels. It could also be the release from custody or the retention of physical restrictive custody.
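The role of a cutting score can be sketched as follows. The scores, outcomes, and action labels are hypothetical, and the sketch assumes that a higher score means a higher predicted risk (some instruments score in the opposite direction). It shows that moving the cutting score trades false positives against false negatives.

```python
def action_for(score, cutting_score):
    """Select an action: at or above the cutting score means closer supervision."""
    return "increased supervision" if score >= cutting_score else "decreased supervision"

def errors_at_cut(scores, failed, cutting_score):
    """Return (false positives, false negatives) produced by a given cutting score."""
    false_pos = sum(1 for s, f in zip(scores, failed) if s >= cutting_score and not f)
    false_neg = sum(1 for s, f in zip(scores, failed) if s < cutting_score and f)
    return false_pos, false_neg

# Hypothetical risk scores and observed outcomes (True = failed under supervision).
scores = [1, 2, 3, 4, 5, 6, 7, 8]
failed = [False, False, True, False, False, True, True, True]

for cut in (3, 6):
    print(cut, errors_at_cut(scores, failed, cut))
```

With these made-up data a low cutting score catches every failure at the price of two false positives, while a high cutting score removes the false positives but misses one failure.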
In any method of development, the level of correlation on the validation sample will be lower than the level of correlation on the construction (developmental) sample.
Overview Of The Methods Most Used
The majority of the reviews of the current methods available have been done by the Gottfredsons. Stephen and Don Gottfredson, in a review of the major methods (Burgess, Multiple Regression, Association Analysis, Predictive Attribute Analysis and Multidimensional Contingency Analysis), determined that:
These results suggest no apparent distinct advantage to any single method. Predictive validity given all these methods of instrument development were at best modest, although prediction was better than would result from simple use of the base rate alone, regardless of the method of construction employed.
The authors thus suggest that decisions concerning which method to use be based upon factors other than the specific statistical method.
In the simplest approach all items have the same theoretical weight; this approach is best represented by the work of Burgess (1928). In this scheme a cutting score is used to determine the predictors. The method is simple and straightforward.
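A unit-weight scheme of the kind associated with Burgess can be sketched as follows. The item names are hypothetical placeholders and are not items from Burgess's scale; the sketch shows only that every item contributes the same single point to the total.

```python
# Hypothetical favorable items; each one is worth exactly one point.
ITEMS = ["no_prior_record", "steady_employment", "married", "older_at_first_offense"]

def unit_weight_score(case, items=ITEMS):
    """Add one point for every item on which the case is rated favorable."""
    return sum(1 for item in items if case.get(item, False))

case = {"no_prior_record": True, "steady_employment": True, "married": False}
print(unit_weight_score(case))
```

Because no item counts for more than any other, the score can be tallied by hand, which is much of the method's appeal.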
The best known method is Multiple Regression Analysis. This method seeks to indicate a relationship by fitting the linear relationship that minimizes the sum of the squared errors.
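The least-squares idea can be sketched for the one-predictor case; multiple regression extends the same criterion to several predictors at once. The data and variable names below are hypothetical and serve only to show that the fitted line has a smaller sum of squared errors than an alternative line.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: number of prior convictions and outcome (1 = violated).
prior_convictions = [0, 1, 2, 3, 4]
violated = [0, 0, 1, 0, 1]

slope, intercept = least_squares_line(prior_convictions, violated)
fitted_error = sum_squared_errors(prior_convictions, violated, slope, intercept)
other_error = sum_squared_errors(prior_convictions, violated, 0.3, 0.0)
print(slope, intercept, fitted_error, other_error)
```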
Other models attempt to overcome some of the difficulties with the model assumption that the subgroups of the population are homogeneous, and to account for interaction terms. Most noted among these techniques are Predictive Attribute Analysis and Association Analysis.
The following methods have not been used in any major instrument but offer promise for the future.
Survival time models are also known as failure time models. They seek to examine the time interval until an event occurs. Such analysis has been done in product failure research in the electronics industry and, in the medical area, to analyze the survival time after treatment. The best example of this type of model is that of Schmidt and Witte. These authors cite the National Academy of Sciences' Panel on Research on Criminal Careers as stating:
failure-rate models have been applied rarely to criminal justice prediction problems...and so the extra statistical power they can provide in separately predicting frequency rates for active offenders and dropout rates are not yet widely understood and appreciated.
Survival models have merit because they supply additional information. In the current methods the only prediction is whether individuals will exhibit the negative behavior. In the survival models the additional information is when they will exhibit the negative behavior. This additional information can be crucial to the development of specialized programs to avert the negative behavior.
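The kind of "when" information a survival model supplies can be sketched with a simple nonparametric Kaplan-Meier estimate. This is an illustrative substitute, not the parametric models fitted by Schmidt and Witte, and the follow-up data are hypothetical. A case with `observed` set to False was still failure-free when follow-up ended.

```python
def kaplan_meier(durations, observed):
    """Return [(time, share still failure-free)] at each observed failure time."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Failures at time t, and all cases leaving follow-up at time t.
        failures = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical months of follow-up and whether a failure was observed.
months = [3, 6, 6, 12, 24, 24]
observed = [True, True, False, True, False, False]

curve = kaplan_meier(months, observed)
for month, share in curve:
    print(month, round(share, 3))
```

The curve shows not only how many fail but how early the failures cluster, which is the information a program planner would need to time an intervention.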
Artificial Intelligence (AI) is a broad term for the method used to construct the system. The aim is to define how people think and reason. The final product built from the tools of AI is usually referred to as an expert system. Commercial applications of this kind have been available only since the early 1980s.
Artificial Intelligence is a subfield of computer science that seeks to develop computers which function as human intelligence does and yield an expert system. Two prime components are needed for the system. A knowledge base contains the facts and rules of the system. An inference engine then selects and executes the rules. The result is an expert system that is capable of considering a large volume of knowledge and then recommending a course of action.
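The two components named above can be sketched in miniature. The facts and rules are hypothetical and do not come from any existing instrument; the inference engine is a simple forward-chaining loop that keeps firing rules until nothing new can be concluded.

```python
# Knowledge base: each rule pairs a set of required facts with a conclusion.
# These rules are invented for illustration only.
RULES = [
    ({"prior_violation", "unemployed"}, "elevated_risk"),
    ({"elevated_risk"}, "recommend_closer_supervision"),
    ({"stable_residence", "employed"}, "recommend_reduced_supervision"),
]

def infer(facts, rules=RULES):
    """Inference engine: fire every rule whose conditions are met until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                changed = True
    return known

conclusions = infer({"prior_violation", "unemployed"})
print(sorted(conclusions))
```

Keeping the rules separate from the engine is the design point: an officer's knowledge can be added or revised as data without rewriting the program that applies it.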
Artificial Intelligence is a tool that has yet to be used in the development of a risk instrument. It is based on the assumption that human reasoning can be mimicked, that human reasoning can be defined in rules and that these rules can be structured in a computer. All that is needed for its development, in the risk instrument field, is the belief that established statistical predictive results can be merged with an expert officer's knowledge to yield an expert system. Such a system should, in theory, be superior to the instruments now in use. For as Gottfredson notes, "prediction devices, developed by any method, can do no more than summarize experience."
A number of basic policy decisions must be made before an instrument can be devised for an agency. These decisions reach to the very core of the potential instrument and must be decided before any of the initial work begins.
Primary to the development of the instrument is the purpose for which it will be used. Will the instrument be used to make predictions for groups of individuals or specific individuals? This decision is very important because it addresses how the instrument will be used. In the group sense it could be used to determine the functioning of various programs and procedures. In the individual sense it could be used to determine release from custody status or the degree of supervision required.
Another central point is whether the instrument will be used to select for greater or lesser restriction, control or punishment. This issue also speaks to the very core of the instrument because there is a vast difference between the two views. One can assume that the system is overcrowded and that the best candidates are to be selected for early release. One could also say that the problem is that the core of individuals who commit serious crimes must be further restricted. The decision of which course to follow will determine all further decisions and the potential liabilities, evaluation research and potential success or failure of the method.
The last question that must be addressed at the very outset is whether the instrument will contain negative or positive items. As the term implies, some instruments contain positive items, which imply that by possessing those characteristics one is made a better risk. Other instruments, based on negative items, imply that the possession of some specific characteristics makes one a poorer risk. In either case some justification will have to be given and it is best to establish the course of action at this initial point.
A host of practical decisions must be made also before even a statistical method is chosen. These decisions should be made with the potential research team at the outset and they should have been fully formulated by the agency administrators at the outset. These practical decisions will determine the final form of the instrument. Some decisions cannot be determined by the research team because the information is only available from the agency administrators or because the information will have the consequence of delineating how the agency views itself or how it wants itself to be viewed. It thus is essential for the agency to determine those choices rather than abdicate them.
The initial starting point is the policy decisions made for the instrument. These decisions include whether the instrument will be used for: groups or individuals, greater or lesser punishment and whether it will contain negative or positive items. These decisions are important for a number of reasons. These decisions will set the data collection framework for the future and it is essential that the agency be aware of the potential consequences of the decisions. Some of the policy decisions will have to be defended and some will not have to be.
The central practical decision is what is the focus of the proposed instrument? Why is it to be developed? Usually the publicly stated reasons will be that it will enhance the functioning of the agency or a specific program. Most agencies will express the official or public reason for the instrument's development but it is very important for the researchers to know the reasons that are not publicly stated.
The central task must then be to view the instrument in the context of the agency's central mission statement. It must then be determined how the instrument relates to the way the agency performs that mission.
Will the instrument be developed according to a theoretical structure that is related to the mission statement of the agency or will the agency allow the development to be non theoretical? This is an important decision. If the agency is operating on some basic theoretical framework then the items that are collected in the data base and case folders already are organized with that structure. It thus is a simple matter to organize the instrument items along questions that everyone accepts and has deemed important. The primary questions can relate to the primary theory and secondary questions can relate to secondary theories.
If the agency is not however operating on any stated theory then the question is whether the instrument will espouse a theoretical position. If this is the case, then the items selected must correspond to some theoretical structure.
If, however, the agency is operating in the blind, no theoretical structure has been established and the agency does not wish any theoretical links to be established, then the task is much more difficult for the researchers and the usefulness of the results will be more limited in nature. Correlations that are less than optimum but validate or invalidate some central theoretical belief, upon which the agency's mission is based, can have a greater impact than the instrument alone. The agency might, however, be operating on a popular belief structure that has never been tested and which it does not wish exposed. In this case a non theoretical stance will be mandatory. In such a system any item that has been collected or can be collected as data is available for inclusion in the instrument.
In this chapter the methods of devising Risk Instruments were reviewed. As noted, no one method or technique has gained an advantage over any other as the method of choice by developers. A few promising techniques are looming on the horizon but they have not yet been used to develop a next generation major instrument. The next chapter explores how some of these techniques were used to develop the major instruments.
A BRIEF HISTORY OF THE MAJOR RISK INSTRUMENTS
The risk prediction field in corrections currently has two instruments that have set the standard from which all the others evolved. The two instruments are the Federal Salient Factor Score and the Wisconsin Risk Instrument. These instruments have indeed been the basis for many other instruments and their impact will continue into the next generation of instruments that will be developed. With that in mind, no description of the current state of the art can be complete without them, and no agenda for the future can be formulated without knowing why and how they developed.
Long after Burgess developed his scale and before Ohlin began his work, Hakeem summarized the current state of the art for the mass corrections audience. He noted:
...substantial beginnings have been made in the theoretical aspects of the subject, but the practical application of prediction devices by parole and prison authorities and experts has been neglected seriously.
Hakeem noted that the Committee on Standards and Procedures in Parole Selection and Release of the National Parole Conference (of 1939) encouraged the construction of predictive tables and that some practical applications were in use. Specifically, he noted that the State of Illinois utilized the services of a "sociologist-actuary" to compute each individual's statistical chances on parole for the parole board.
He then cited two minor reasons and one major reason why further practical progress had not been made. The first was that the field was, after all, not familiar with statistical prediction techniques. The second concerned the impression that the field was always conservative. The major reason cited, however, was:
Probably the principal reason prediction devices have hardly been applied in practical situations is that there has been, after all, no startling and dramatic demonstration of their predictive capacity.
A number of objections also existed at that time. Foremost was an objection to the very concept of statistical prediction. This objection would persist until the development of the instruments and beyond. He notes this objection as:
...statistical instruments which attempt to predict human behavior are incompatible with the principle of individualization which has been regarded as very fundamental in social casework. Indeed, the assumption is made that the only possible basis for the appraisal of a case and for prognosis is through the individualistic approach and that the individualistic approach has no relationship to statistical computations.
Hakeem, quoting Lane, also notes that the usefulness of such devices is unclear, unless they can predict the optimum time for release.
Even in 1962, a decade before the start of the development of the Salient Factor Score, attention was on parole prediction methods. The July 1962 volume of Crime and Delinquency was totally devoted to the topic of Parole Prediction Tables. At that time, according to a survey conducted by Evjen, the only jurisdictions that stated that they were using or developing prediction methods were Illinois, Ohio, the California Youth Authority and Colorado. Illinois had been using them for almost thirty years and it was "the only state" in which a routine system of parole prediction had been established.
Ohio had been using a system based on the MMPI (a
psychological test) but there was no indication that it was routine.
The California Youth Authority and the Department of Corrections
had begun developing the "base expectancy scores" but they were
not yet completed. Colorado was also developing prediction
statistics. Of the group, only the base expectancy scores would have
a major impact on future development.
Evjen's survey also drew a number of comments. Of interest is the belief that the proposed instruments, since they predict failure without taking into account the reforming experience of the correctional system, would be seen in a very unfavorable light by correctional system administrators. The essential logical conclusion, if they work, is that the correctional experience, whether institutional or community corrections, is of little or no effect. This usually unstated objection was, however, noted in passing by Evjen:
A career prison administrator writes: 'Are we to assume that what happens to a man in the institution and the type of supervision he receives on parole has nothing to do with whether he succeeds or fails?' Any intelligent selection procedure, he emphasizes, takes these factors into account.
Another of the very interesting comments listed by Evjen comes from Paul W. Tappan, a professor at the New York University School of Law and formerly the Chairman of the U.S. Parole Board, who lists a number of objections. Most notable are the objections related to the projected "no effect" by the correctional system, the lack of an indication of the seriousness of any parole failure and the administrative need not just for prediction but for administrative usefulness, such as the optimum time for institutional release.
In his first paragraph, Tappan notes how helpful the instruments could be but expresses reservations. The operational philosophy of the instruments is that they separate success and failure into two major groups. He was concerned less with the cases which would obviously succeed or fail than with the "gray area" cases in between. He then writes:
I am critical of the criteria employed in most of the parole studies because they relate so little to the quality of the offender's experience during the correctional treatment, both in prison and under supervision. Since most prisoners must return to the community and ideally should do so under parole supervision, it is clearly desirable that prediction instruments should be developed to determine the optimum time of release rather than merely the risk of failure without regard to the time of release. It is also true, I believe, that the prediction studies have given insufficient attention to the question of the seriousness of the offenses that the prisoner has committed in the past and may commit in the future. Surely it is important to know not only the likelihood of his getting into trouble again, but how serious his infractions may be if he is paroled.
Harry C. Dupree, the Chairman of the Army and Air Force
Clemency and Parole Board at that time, made some interesting
comments reflecting the general resistance to the adoption of a
parole prediction instrument by parole boards. Given his exposure to
members of other boards his resistance should be noted. Specifically
he states
I do not believe the field of parole is ready at this time to undertake the use of prediction procedures. There are thirty or more parole boards on which members serve part time. Of the remaining twenty-five parole boards with which I am acquainted, it appears there are not many members who have accepted the claims of the specialists who devised the prediction tables that they could be used advantageously. These members, in my opinion, do not accept the somewhat automatic prediction in place of their appraisal as to the parolability of inmates.
He then goes on to state that most parole officials are more concerned with obtaining more staff, whether they be officers, supervisors or clerical help rather than committing resources to develop prediction services.
There thus existed at that time a number of specific prejudices against the use of the potential instruments and a general resistance to the usurping of the boards' discretion and powers. It appears that these prejudices were not easily attributed to a lack of familiarity with the subject matter, nor to a pervasive cultural lag in the field. If adopted to identify those cases that had a higher likelihood of failure, the instruments would possibly do that, but at the cost of usurping the boards' discretion and powers.
The argument that statistical instruments are incompatible with the principle of individualization, which has been regarded as very fundamental in social casework, was after all an essential core item. The boards had been envisioned with the belief that the release decision was an individual affair and that the release decision required the board members to examine every individual case for unique details. The concept that this decision could be replaced by group statistics would have been foreign and oppositional to the very concept of a board examining individual cases.
Two measures were necessary to reduce the resistance of the boards. Parole boards have, after all, three essential questions that they must deliberate. The first question is, should the inmate be released? The second question is, when should the inmate be released? The third and final question is, what is the possibility that the inmate will commit a crime that will make the first decision a bad one? The hope that the forthcoming instruments could determine the optimum time of release, and weigh the seriousness of the offenses that the prisoner has committed in the past and may commit in the future, was apparently the only way the boards were going to allow this burdensome reduction in their discretion and powers.
At this point however, nothing would yet happen because there was no demonstration of their helpfulness and no compelling reason to implement them.
The Salient Factor Score is the premier instrument in parole. It is the instrument used for the federal parole system and it was indeed the major start of risk instruments. The instrument, however, was not a planned instrument in the sense that it was envisioned and then designed. Gottfredson noted that it was part of a larger study with the U.S. Parole Board. The general goal was to improve decision making by that board by providing it with a general policy and to help decision making on individual cases.
The larger guidelines effort, of which the Salient Factor Score is one part, has been viewed favorably:
The primary model for development of parole guidelines
during the past decade has been the Federal Parole
Guidelines. Developed over a three year period beginning in
1972, the guidelines evolved into a decision matrix instrument
used to assign an estimate of the time an inmate could be
expected to serve before release....There is good reason that
the federal guidelines system has served as a model for
parole boards. It is a relatively simple, straightforward system
based on sound research.
The guidelines are a two dimensional matrix which evaluates the severity of the offense which brought the offender into the system and the parole prognosis, which is the Salient Factor Score. By locating the offense severity, which ranges from low to greatest in six steps, and the parole prognosis, which ranges from poor to very good in four steps, the expected length of incarceration can be read from the matrix.
Glaser notes that there was an accelerated rate of acceptance of statistical tables during the 1970s and 1980s. He cites two main reasons for the effect: the tables were now presented as advisory information that frequently had other uses, and the end users were involved in the design. He notes that the beginning of this change occurred circa 1960 with two researchers for the California Department of Corrections. At that time Don Gottfredson and Kelly Ballard developed tables for the parole board which showed which prisoners would be paroled if consistent past practices were applied.
Later the National Council on Crime and Delinquency, with the
aid of federal grants, brought together virtually all the parole boards
to examine their predictions with hypothetical cases and statistical
tables. Glaser further notes that an outgrowth of this process was the
Uniform Parole Reports Program, "which helped members of different
parole boards reach consensus on procedures to compile uniform
statistics."
The major turning point, according to Glaser, was the development of the Federal Parole Guidelines. After it, many other scales were developed in many different areas. Glaser notes that the prompting of the members of the U.S. Parole Board to work with Don Gottfredson and statistician Leslie Wilkins was precipitated by an incident.
This action was prompted by a federal court decision that ordered the board to articulate its policies for granting parole (Childs v. United States Board of Parole, 371 F. Supp. 1246 [D.D.C. 1973], modified, 511 F. 2d 1270 [D.C. Cir. 1974]).
One notable effect of our system of government is seen in the separation of governmental functions. It is thus very difficult for entropy to occur. When one system becomes complacent, often another branch will compel it to action. In the example below, the executive branch is compelled to change because of a reinterpretation by the judicial branch.
The case that started this change was Morrissey v. Brewer. The case dealt with whether due process applied to the parole system and specifically to parole revocations. The Supreme Court agreed that it did because a loss of freedom could result. The requirements of due process were not the same as at a trial, but a factual hearing was deemed appropriate. Later this concept was extended to revocations of probation.
If we examine the action brought by Wallace Russell Childs Jr., et al., against the United States Board of Parole, we find the spark of a very compelling reason to implement a risk instrument and, specifically, a formal policy. On May 27, 1970 a federal prisoner brought a "pro se" action claiming that his due process rights had been violated. The basis for this claim was that the board did not state the reasons for denying parole. Later that complaint was amended by his court appointed counsel to include all those in a similar situation; thus all those who would become eligible for parole were added. It thus became a class action suit.
The initial decision was rendered on October 1, 1973 and then modified, upon the motion of the plaintiffs with no opposition, on October 17, 1973. Essentially due process protections were extended to the parole application process. The rationale was the same as in Morrissey v. Brewer because the stakes were the same. Further incarceration (in this case a continuance) or conditional freedom could result. This meant that the inmate had certain rights to fairness because a loss of freedom could result. In the initial decision this fairness included that the prisoner receive a written statement of the reasons based upon the "salient facts or factors in each case," if the application was not granted. The prisoners also had a right to know what information was before the board and to have the opportunity to refute it. The board was also to explain the criteria that it used to reach a decision. The intent was to have "reasoned decisions rather than arbitrary or capricious ones; and reasonable assurance of reasonably reliable factual bases for decisions."
Later an appeal was argued on June 5, 1974 and decided on December 19, 1974. The U.S. Parole Board had sought injunctive relief and had argued against the jurisdiction of that court, the class action status, whether the case still applied to Childs, etc. In the original order the board had 60 days to convey to the court the criteria used in passing upon parole applications. The board did not argue against that provision. It had submitted to the court the new Federal Parole Guidelines within the 60 days specified. Essentially the higher court decided that the lower court did have jurisdiction and affirmed the order of October 1, 1973.
Gottfredson related his recollection of how it evolved in the
following way.
He had a large data set of 100,000 parolees from
various states prior to his involvement with the Federal Parole Board.
Some individuals at Lockheed Information Laboratories had
developed a system called "dialogue" to search for verbal information
in abstracts. Gottfredson asked them if they could modify the system
so searches could involve data. They said, "yes" and then made the
small modification. The data set was then installed on the Lockheed
computer and this allowed access to the dialogue system.
Gottfredson then attended a meeting in Chicago, and in one session the presenters had a terminal and a phone line hookup with the computer at Lockheed. Individuals in the audience were asked to propose hypotheses regarding parole violations and different categories of offenders. The distribution of those offenders would then be called up on the terminal to test the hypothesis; for example, all the car thieves of a certain age and their success and failure numbers. That session turned out to be much fun for the presenters and the audience. It also generated excitement regarding the possibilities. Two of the individuals who became very excited over the possibilities were from the U.S. Parole Commission.
The spark of excitement, that was kindled in the members of the Federal Parole Commission, was the dream of computer technology to assist them in making fair decisions and to obtain better information upon which to make those decisions. They then talked to individuals at the National Institute for Law Enforcement and Criminal Justice to do just that. In view of Mr. Gottfredson's expertise in the area and their knowledge of him, he was invited to submit a proposal.
The research that followed was then part of a study to improve decision-making by that board. The board, at that time, it should be noted, was also under criticism for not having a general policy. A small questionnaire was then developed for use by the board in its decision making. An analysis of that data showed that three items explained what the decision would be. At the first appearance before the board, the items were the risk of parole violation and the severity of the offense. Institutional adjustment became important at later hearings. The results of that research showed that a policy did exist and that the structure corresponded to law. The law stated that the board must take into account the severity of the offense and institutional adjustment. Risk was important to determine the possibilities of parole violation and failure, so the researchers then went on to develop the risk score.
It is worth noting that Walter Dunbar, who was a member of the United States Parole Commission, kept asking "can't we get up on the computer the salient factor score of the case, to help us make the decision." The name stuck, and the parole board thus actually named the instrument.
The nine-item Salient Factor Score instrument was adopted for use by the United States Board of Parole in October of 1973. The instrument was part of a much larger project entitled "The Utilization of Experience in Parole Decision Making." This effort went forth under the combined research umbrella of the National Council on Crime and Delinquency and the United States Board of Parole. The funding came from a grant from the National Institute of Law Enforcement and Criminal Justice, Law Enforcement Assistance Administration. As noted above, the board was at that time under some legislative, public and judicial pressure to formulate its policy. The adoption of the guidelines formulated by that effort solved its problems in that area.
The Salient Factor Score is actually a risk measure of parole failure. Previously it was determined that at an initial hearing the primary concern was the seriousness of the offense for which the individual was incarcerated and the likelihood of parole failure. The seriousness of the offense was known but the likelihood of failure was not really known quantitatively. A risk measure was thus developed, in order to answer that question.
Various criterion measures have been used in the instrument.
Initially in the 1970 construction sample, the definition of a favorable
outcome included the lack of certain characteristics during a two year
period.
A favorable outcome was defined as: no new conviction
resulting in a sentence of sixty days or more, no return to prison for a
technical violation and no outstanding absconder warrant. Later the
criterion was modified, to be stricter for the validation sample of 1972.
This measure included: no new commitment of sixty days or more, no
absconder warrant outstanding, no return to prison for a
parole/mandatory release violation, no death during the commission
of a criminal act. It should also be noted that death during a criminal
act was listed for the 1970 sample as well, at this time. Thus in the
validation sample of 1972, the primary change was a commitment
for sixty days or more, rather than a conviction resulting in
commitment. The last revision to the instrument is the SFS of 1981. In this
major revision the favorable outcome was defined as:
1) No new criminal offense resulting in a commitment
of sixty days or more.
2) No return to prison for a parole or community
treatment center violation.
3) No parole violation warrant outstanding.
4) Not killed while committing a criminal offense.
In all the measures, a favorable outcome has thus been defined as a lack of significant criminal activity and of supervision problems while on parole.
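The 1981 definition can be read as a simple rule: an outcome is favorable only when none of the four listed events occurred during follow-up. The sketch below illustrates that reading; the record layout and field names are hypothetical and do not come from the Parole Commission's own materials.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    """Hypothetical follow-up record for one released prisoner."""
    new_commitment_60_days_or_more: bool   # item 1
    returned_for_violation: bool           # item 2
    violation_warrant_outstanding: bool    # item 3
    killed_committing_offense: bool        # item 4


def favorable_outcome(record: FollowUp) -> bool:
    """Return True only when none of the four unfavorable events occurred."""
    return not (
        record.new_commitment_60_days_or_more
        or record.returned_for_violation
        or record.violation_warrant_outstanding
        or record.killed_committing_offense
    )


clean = FollowUp(False, False, False, False)
absconder = FollowUp(False, False, True, False)
print(favorable_outcome(clean), favorable_outcome(absconder))
```

A single unfavorable event is enough to classify the case as a failure under this reading.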
The method of drawing samples has been consistent in all revisions and construction samples and consists of using recurring numbers in the prison identification numbers. In the 1970 Construction Sample 902 cases were drawn from a 25% sample during the first six months of 1970. It was composed of persons released from federal prisons by parole, mandatory release or expiration of term. The 1970 Validation Sample consisted of 919 cases, which was an additional 25% sample drawn in the same way. An additional validation sample consisting of 662 cases (20% sample) was also drawn from the second six months of 1970.
The 1981 construction sample consisted of 3,955 cases of federal prisoners released to the community during 1970, 1971 and 1972. The 1981 Validation Sample consisted of 2,289 cases of federal prisoners released to the community during 1978. The instrument thus has been designed with robust samples and has been periodically modified.
The nine-item Salient Factor Score (SFS73) was adopted for
use by the United States Board of Parole in October of 1973. The
seven-item SFS (SFS77) was adopted in April 1977. The six-item
SFS (SFS81) was adopted on August 31, 1981. Various additions
and deletions of items have occurred along the way. Some items
have been changed, some eliminated and one item added.
The major changes to items have centered on the use of age and prior record. A field for 4 or more prior convictions or adjudications was added in the 1977 version. For age a number of minor changes have occurred. The initial version, of 1973, merely differentiated between those below 18 years of age and those 18 or above at their first commitment. The April 1977 instrument further expanded this differentiation to include: 26 or older, 18-25 and 17 or younger. The August 1981 instrument dealt more with the age at the current behavior and changed the break points to: 26 or older, 20-25 and 19 or younger. An exception was also noted: if the individual had 5 or more prior commitments of more than 30 days, the item was scored as 0.
A number of items have been dropped from the versions along the way. The initial version had an item concerning the release plan, and it differentiated whether or not the plan was to live in a family structure with a spouse and/or children. This item was dropped in the 1977 version. The initial version also differentiated between those who had completed the 12th grade or had a GED and those who had not. This item was dropped in the 1977 version. Employment was retained in the 1977 version but dropped in the 1981 version. The original version had an item which differentiated between those convicted of auto theft and those who were not. In 1977 this item was modified to include forgery or larceny involving checks. In 1981 the auto/check item was completely dropped from the new scale.
The only item that has been added to the scale has concerned a measure of criminal history that reflects a recent lull in activity. In the 1981 scale an item was added that differentiated those that possessed a commitment free period, for essentially the last 3 years.
The original scale, of 1973, was composed of nine items. Along the way, a few items were changed, four items were dropped and one was added. This yielded the current six item scale used in the 1981 version of today.
Statistical Form And Variance Explained
The instrument is constructed as a Burgess-type scale. No
apologies were made for this veteran technique. Normally this would
be a simple additive scale of dichotomous items. The
instrument uses the simple additive method, but some of its items are
multifaceted. Hoffman notes, in reference to the more sophisticated
techniques, that they generally produce a higher correlation on the
construction sample but that more shrinkage occurs when they are
applied to the validation sample. He further noted that it is the
correlation on the validation sample that is most important.
Apparently the level of prediction is also not greatly improved by the
more sophisticated techniques. The variance explained has been
stated as either a point biserial correlation or a Mean Cost Rating
(MCR). The initial version reported both measures; the last revision
uses only the MCR.
The Point Biserial Correlation is different from the normally
quoted Pearson's r, in that the maximum correlation is not 1.00. The
maximum value varies with the proportion of the success and failure
cases. For the initial version the point biserial correlation was listed
at approximately .280, with a maximum possible correlation of .75.
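A small worked example may make the ceiling on the point biserial correlation concrete. The sketch below uses the standard textbook formula (difference in group means, divided by the overall standard deviation, times the square root of p times q); the four scores and outcomes are invented for illustration and are not drawn from the Salient Factor Score samples.

```python
import math


def point_biserial(scores, outcomes):
    """Point biserial correlation between numeric scores and 0/1 outcomes.

    Uses the population standard deviation of all scores, so the result
    equals Pearson's r computed with the outcome coded as 0 and 1.
    """
    n = len(scores)
    ones = [s for s, o in zip(scores, outcomes) if o == 1]
    zeros = [s for s, o in zip(scores, outcomes) if o == 0]
    p = len(ones) / n          # proportion of successes
    q = 1.0 - p                # proportion of failures
    mean_all = sum(scores) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_one = sum(ones) / len(ones)
    mean_zero = sum(zeros) / len(zeros)
    return (mean_one - mean_zero) / sd * math.sqrt(p * q)


# Invented data: the two outcome groups are perfectly separated by score.
r = point_biserial([1, 2, 3, 4], [0, 0, 1, 1])
print(round(r, 3))
```

Even though the scores separate the two groups perfectly in this toy case, the coefficient comes out below 1.00, which is the property the text describes.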
The Mean Cost Rating is a measure that has become popular for comparing the various instruments. It basically is an indication of the predictive efficiency over chance. The MCR for the initial version was .36 on the construction sample and .32 on the combined validation sample. The SFS81 produced an MCR of .38 on the construction sample and .41 on the validation sample.
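As a rough illustration of "predictive efficiency over chance," the sketch below computes an MCR-style index by comparing every failure case with every success case. It assumes the commonly cited equivalence between the MCR and Somers' d (twice the area under the ROC curve, minus one), and it assumes that a higher score means higher risk; the data are invented and the function name is hypothetical.

```python
def mean_cost_rating(scores, failed):
    """Pairwise MCR-style index: (concordant - discordant) / all pairs.

    A pair is concordant when the failure case has the higher risk score.
    Tied pairs count in the denominator only. 0 means chance-level
    prediction and 1 means perfect separation.
    """
    fail_scores = [s for s, f in zip(scores, failed) if f]
    success_scores = [s for s, f in zip(scores, failed) if not f]
    concordant = 0
    discordant = 0
    for fs in fail_scores:
        for ss in success_scores:
            if fs > ss:
                concordant += 1
            elif fs < ss:
                discordant += 1
    return (concordant - discordant) / (len(fail_scores) * len(success_scores))


# Invented example: failures hold scores 2 and 4, successes hold 1 and 3.
print(mean_cost_rating([1, 2, 3, 4], [False, True, False, True]))
```

On this scale the reported values of .32 to .41 indicate prediction that is clearly better than chance but far from perfect separation.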
Don Gottfredson, who developed the scale, had an initial background in psychology before moving into corrections. He actually worked as a classification officer before moving into the research division. In the 1960's Don Gottfredson and Kelly Ballard developed tables for the California Department of Corrections which showed which prisoners would be paroled if consistent past practices were applied. He has been a member of the Institute For The Study of Crime and Delinquency and the Research Center for Crime and Delinquency. After the development of the SFS he moved into the academic area and is now with Rutgers University.
Probation classification instruments developed quite recently, in the mid 1970's. They thus had the benefit of the previous research done in the parole field. Probation, however, has a different perspective from parole. Probation classification occurs after the decision to place the offender in or out of correctional custody has been made. Probation thus can afford the luxury of not having to be exceptionally accurate. The parole decision essentially chooses between institutional and community corrections. Probation classification decisions differentiate between various supervision levels of community corrections. Probation had already been informally classifying offenders without significant criticism. Prior to the risk instruments, supervision level assignments were made by the time on probation, the officer's discretion or a combination of both.
Probation systems were rapidly expanding in the 1970's and so were the correctional populations. Along with this growth came management problems that previously never existed in community corrections. The probation systems of that time had extreme administrative needs that were not being met by the informal systems they were using, and they were under criticism to demonstrate their effectiveness. Probation at this time was selling the point that it was more cost effective than institutional corrections. Probation however was not prepared to deal with the lowest threat levels of offenders that previously had been received by the institutional correctional component. Parole on the other hand had been receiving the vast majority of the institutional spectrum. The underlying assumptions and starting points for probation and parole classification systems, within community corrections, were thus very different and they met different needs.
On October 21, 1977 the Comptroller General of the United
States submitted a report to Congress concerning the need for
probation and parole to be better managed. The focus of this report
was that field contacts and rehabilitation efforts needed improvement
along with more efficient management of the district offices.
Probation thus had different needs, pressures and a different
time frame. These forces did not act upon probation until quite
recently. Petersilia and Turner note that "classification instruments
began to influence probation field services in the mid-1970s ..." The
major instrument to influence the probation field came from
Wisconsin. Citing a State of Wisconsin document, they note that
"the probation departments were in serious need of an appropriate
and systematic way to allocate their limited staff resources." A LEAA
publication notes, concerning the Wisconsin Bureau of Probation and
Parole effort, that:
Instigated by a legislative mandate calling for better methods of determining staff requirements and effectively utilizing all staff, the project was launched with special funding from the Wisconsin Council on Criminal Justice.
This is consistent with the popular belief in the probation field, that the Wisconsin state agency was in serious danger of budget cuts unless it could be demonstrated to the state legislature that something scientific occurs on a probation caseload.
The classification systems that emerged were almost all
patterned after the Wisconsin system, following the intervention of the
National Institute of Corrections. This cloning and these mutational
replications were primarily a result of the NIC Model Probation
System. The Wisconsin system was a total information package of
which the risk assessment was only a portion. The advantage of the
Wisconsin System was the totality of the management package, and
this formed the impetus for the rapid spread. Wright and his
colleagues note:
The Wisconsin system has attracted national attention because it combines a number of elements into an overall model for administration of community supervision: risk-needs assessment, management information system, programmed supervision classification (Client Management Classification-Arling and Lerner, 1980), and workload accounting of caseloads. Recently, the National Institute of Corrections has recognized this approach as a 'model system' (Baird, 1981), and has undertaken a large project to implement this approach on a nationwide basis.
As Chris Baird noted regarding the Client Management
System:
It was all encompassing and it really became an operational system that touched every part of the agency... [speaking of the California BET and how it was researched based and not operationally based]... there was never anything to tie it to the rest of the organization. This Risk and Need Assessment [Wisconsin System] sort of became the basis for budgeting, it became the basis for staff allocation, it became the basis for the information system and because of that - that it touched every part of the organization - I think it just struck nerves with administrators. I remember the first time we went out with those work load reports. People looked at those - as if that was something completely foreign, just a great advancement - and it was a very, very simple concept.
Today, components of the system have been implemented in at least 50 jurisdictions throughout the United States and Canada - agencies as diverse as New York City Probation, the Texas Adult Probation Commission, and the Wyoming Department of Probation and Parole.
The National Institute of Corrections has also received requests
for information concerning the Wisconsin Scale from such diverse
locations as Israel, the Philippines, New Zealand, Scandinavia and
England. The appeal of the system thus exceeded one state, one
federal agency, many agencies and one country.
Part of the reason for the tremendous popularity and interest in the system was certainly managerial but also economic. At that point in time the probation departments were growing and receiving new resources of staff and material. The increases however did not keep pace with the increases in probationers that were occurring. The probation administrators then sought a magical solution to the escalating caseload sizes. They also searched for a management system that would allow them to go beyond the total number of probationers on each officer's caseload: a method that would allow them to see through the numbers, and a method that would force officers to discharge (seek a termination of probation for) those cases that no longer needed probation. They also sought a method that would ensure that many of the probationers would receive the social casework services that had been the traditional role of probation in the past. Many agencies were in a crisis and could not justify their former probationer to probation officer ratios. The managerial system they had before thus began to experience difficulty.
In the 1960's five to eight line officers might be supervised by one Supervising Probation Officer, and the average caseload might have been 50 cases, mainly minor misdemeanors. The front line supervisor might thus have monitored the probation unit's performance for 250 to 400 cases. Administrative review of all of the officers and cases within the unit was thus possible. As caseloads approached 80, however, the unit contained 400 to 640 cases, and then with 100 cases per officer, 500 to 800 cases, in the early phases. First line management, which was the Supervising Probation Officer, thus lost control of the process and any hope of review. Management in the pre-growth phase never really concerned itself with this day to day review. The systems were small and case intimacy was shared by the line officer and his supervisor. Suddenly officers did not know about their cases, and the first line managers, the supervisors, had no hope of knowing what was happening in a line officer's caseload. All the managers did, and could do, was to beg for more officers and supervisors.
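The unit totals quoted above follow directly from multiplying the number of officers per supervisor by the caseload per officer. A minimal sketch of that arithmetic, with a hypothetical function name, is:

```python
def unit_case_range(caseload_per_officer, min_officers=5, max_officers=8):
    """Return the (low, high) number of cases one supervisor oversees."""
    return (min_officers * caseload_per_officer,
            max_officers * caseload_per_officer)


for caseload in (50, 80, 100):
    low, high = unit_case_range(caseload)
    print(f"{caseload} cases per officer: {low} to {high} cases per unit")
```

Doubling the caseload per officer doubles the number of cases a single supervisor must review, with no change in the number of supervisors.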
The new influx of officers was also different. In the 1950's and 1960's officers were attuned to social casework. The probation population was also different. Felons, in the 1950's, mainly received time in a state prison rather than a local penitentiary. It was also unusual for a felon to receive probation.
In the 1970's probation was expanding, the probation population was changing and so were the officers. The social casework officers of the 1950's and 1960's however were now rising through the managerial ranks as the probation systems expanded under the relentless torrent of new probationers. They had matured in a social casework setting where good officers knew their cases, practiced social casework and were not afraid to visit probationers at home. Then they were social workers; no probation officer would ever want to carry a badge and a gun. Their former probation system was vanishing but their memories and dreams of the former systems had not. Gone were the minor misdemeanor offenders they supervised before, vanished were the safe streets and only some remembered the small caseloads. It was incremental, it was a slow change, it was a quiet revolution, that affected the system but not the management of the system.
Wisconsin was successful at keeping low probation officer to probationer ratios. Most other jurisdictions were not successful in that regard. Part of the success was the new system and part of it was the unique aspects of Wisconsin. All the current managers knew however, or rather all they cared to know, was that Wisconsin was successful; they had a management system they could use and it even forced social casework on these new different officers. Grant money was also available to implement the system. Probation managers did not stop to question a system with so many positive attributes. They did not hesitate to try their own implementation.
The Wisconsin System did not initially set out to solve front line
management problems, nor did it need to force money from the state
legislature. An analyst from the Wisconsin State Legislature had seen a
workload study, concerning probation, that had come out of
California. In California they had attempted to put some time frames
to various levels of supervision, and this was thought to be a good
idea. When the Bureau of Probation and Parole, for Wisconsin,
came in and asked for more positions, the legislature froze those positions until
a workload analysis was done. Apparently the money and positions
would come, but only when it was done the legislative way. Funding
was not a problem. All they wanted was some justification for the
officer to client ratios that existed at that time.
It should be noted that Wisconsin differed in some ways
from the way probation was developing in the more urban
states. Social casework was firmly entrenched in a system that dealt
with criminal behavior that was noticeably less than the average in
America. The officers were not called officers but agents, with a civil
service title of Social Worker I-III. The average age of the agents, in
the test regions, was almost forty and they had been
employed by the agency for a little over a decade. The majority of
the agents also had been formally trained in the discipline of social
casework.
Wisconsin is also a rural state. Bohnstedt notes that, in 1975,
whites accounted for 95% of the population and blacks 3%. The
"...Wisconsin crime rate is considerably lower than the national
average." Prior to the study, all that was mandated in contact levels
was one face to face contact per month, with the possibility that the
supervisor of the unit of agents might impose more if the officer did
not.
Christopher Baird, who was one of the developers of the scale, had a master's degree in economics. He was a research analyst and then an Associate Coordinator Of Planning and Evaluation for the Illinois Department of Corrections Research Division. He left that post to become the Research Director of the Case Classification/Staff Deployment Project of the agency. Brian Bemus joined the team as a Research Analyst just after completing a B.S. in sociology. In total there were a director, a research director, three analysts, a secretary and a statistical clerk. Eight agents and a supervisor were also assigned and they devoted fifty percent of their time to the project. Apparently much of the success of the project, outside the creative efforts of the development team, was due to the foresight to have actual agents assigned to the project. Much of the input into the system development and the reactions to it were facilitated by having the agents assigned to the team.
At that time the federal government was funding block grants,
and massive amounts of grant money were available if approved by
a local authority. An application was made and approved, but
progress toward a workload goal was not focused until the arrival of
Chris Baird. By that time pressure was being exerted to produce results.
Not only was the pressure economic in nature, from the legislative
mandate to produce the justification before the money was released,
but now there was also some desire to have a system to allocate agents
and staff according to a formula. Administrative pressure was also being
applied so that administrative problems could be solved. At this time the
concept of risk was injected into the research. The Salient Factor
Score was now in use and risk had become the way people made
decisions.
In a substantial way the team that assembled the management system was different from the others that assembled instruments. The others were somehow insulated from the field and guarded in their touting of their systems. The Wisconsin members were immersed in the field. Upon review of their work, one has to be impressed by the volume of material, the quality of the presentation and the quality of the individuals who assembled it. Certainly much of the rapid spread of the system has to be attributed to the personnel who formulated it.
The initial case classification project operated over a four year
period. It began in 1975 "with special funding from the Wisconsin
Council on Criminal Justice." In regard to funding, Bohnstedt notes:
LEAA funds started in fiscal year 1975-1976 in the amount of $110,400 on a 90-10% match basis. During fiscal year 1976-1977, funding was then increased to $135,361 through the same agency. Finally, for fiscal year 1977-1978, the final year of the grant, funding was raised to $141,160. Grant funds were expended almost exclusively for personnel. For fiscal year 1978-1979, the Governor has signed approval for interim funding...the budget request is for $150,000.
Only one version of the Wisconsin Risk Instrument has been generated and only one criterion of failure has been used. The criterion measure consisted of: Rules Violations, Arrests, Misdemeanor Convictions, Absconsions, Felony convictions and convictions for Assaultive Offenses.
FACTOR                        WEIGHT PER OCCURRENCE
___________________________________________________
Rules Violations . . . . . . . . . . . . 1
Arrests  . . . . . . . . . . . . . . . . 1
Misdemeanor Convictions  . . . . . . . . 3
Absconsions  . . . . . . . . . . . . . . 5
Felony Convictions . . . . . . . . . . . 7
Assaultive Convictions . . . . . . . . . 9
3 Criterion of Failure and Weights Added. Source: Project Report 2, pg. 2
These items were then weighted in the following fashion as noted in the accompanying chart. Each weighted occurrence was then added to a base score of 1. Scores were cut off at 30 in order to prevent a few very high scores from skewing the data. As noted, rules violations and arrests received the lowest weights, with misdemeanor convictions and absconsions occupying the middle range of scores. Felony convictions and assaultive offenses received the highest weights.
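The criterion score described above can be sketched as a weighted sum with a base of 1 and a ceiling of 30. The weights are those given in the chart; the dictionary keys, the function name and the assumption that each recorded occurrence is simply counted once under its own category are illustrative only.

```python
# Weights per occurrence, as listed in the chart.
WEIGHTS = {
    "rules_violation": 1,
    "arrest": 1,
    "misdemeanor_conviction": 3,
    "absconsion": 5,
    "felony_conviction": 7,
    "assaultive_conviction": 9,
}
BASE_SCORE = 1   # every case starts at 1
SCORE_CAP = 30   # scores were cut off at 30


def criterion_score(occurrences):
    """Weighted failure score for one case.

    `occurrences` maps an event name (hypothetical keys above) to the
    number of times it was recorded during supervision.
    """
    total = BASE_SCORE + sum(WEIGHTS[event] * count
                             for event, count in occurrences.items())
    return min(total, SCORE_CAP)


print(criterion_score({}))                                          # no events
print(criterion_score({"arrest": 2, "misdemeanor_conviction": 1}))  # 1 + 2 + 3
print(criterion_score({"assaultive_conviction": 4}))                # capped
```

The cap means that a case with four assaultive convictions and a case with five receive the same score, which is the intended protection against a few extreme cases dominating the regression.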
The sample used to construct the instrument consisted of 250
closed or revoked cases drawn from the Madison area, where the
case classification and staff deployment project was located. No
validation sample was drawn or used in the construction of the
instrument.
The method of development was multiple regression analysis, which yielded weighted items. The variance explained is listed as .58, but it must be cautioned that this figure, which would place it in one of the highest ranges of predictive instruments, is based on the small construction sample. No R² can be reported for a validation sample because, owing to time constraints, none was used.
The Wisconsin instrument is divided into two major sections that measure two very different areas. These sections correspond to criminal history items and stability items. If we examine the stability items first we see indications of: residence stability, employment stability, no substance abuse and being receptive to assistance.
These items are scored in the following way:

Number of Address Changes       0   None
in Last 12 Months               2   One
(Prior to incarceration)        3   Two or more

Percentage of Time Employed     0   60% or more
in Last 12 Months               1   40% - 59%
(Prior to incarceration)        2   Under 40%
                                0   Not applicable

Alcohol Usage/Problems          0   No apparent problems
                                2   Moderate problems
                                4   Serious problems

Other Drug Usage/Problems       0   No apparent problems
                                1   Moderate problems
                                2   Serious problems

Attitude                        0   Motivated to change;
                                    receptive to assistance
                                3   Dependent or unwilling
                                    to accept responsibility
                                5   Rationalizes behavior;
                                    negative, not motivated
                                    to change
The criminal history items correspond to indications of when the criminal behavior started (age at first conviction) and indications of the severity of the criminal record. Such items as the number of prior periods of probation/parole and the number of prior felony convictions were included. The section also includes indications of the special classes of offenses that were of concern: convictions or adjudications for burglary, theft, auto theft or robbery, and for worthless checks or forgery. A special indicator was also added to the instrument for political reasons. Any assaultive offense with a weapon was included not because it was a predictor but because in the scoring it would ensure that such probationers would at least be classified as intensive cases for the first six months.
These items are scored in the following way:

Age at First Conviction         0   24 or older
(Or Juvenile Adjudication)      2   20 - 23
                                4   19 or younger

Number of Prior Periods of      0   None
Probation/Parole Supervision    4   One or more
(Adult or Juvenile)

Number of Prior Felony          0   None
Convictions (or Juvenile)       4   One or more

Convictions or Juvenile         2   Burglary, theft,
Adjudications for                   auto theft, robbery
(Do not exceed a total          3   Worthless checks or forgery
of 5. Include current
offense)

Conviction or Adjudication      15  Yes
for Assaultive Offense.         0   No
An offense which involves
the use of a weapon, physical
force or the threat of force
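Taken together, the two scoring tables above amount to a lookup-and-add procedure. The sketch below is one way to express it; the point values are those shown in the tables, while the category labels, the function name and the way the two offense groups are combined under the "do not exceed a total of 5" rule are assumptions made for illustration.

```python
# Point values from the two tables; keys are hypothetical labels.
STABILITY = {
    "address_changes": {"none": 0, "one": 2, "two_or_more": 3},
    "time_employed": {"60_or_more": 0, "40_to_59": 1, "under_40": 2,
                      "not_applicable": 0},
    "alcohol": {"none": 0, "moderate": 2, "serious": 4},
    "other_drugs": {"none": 0, "moderate": 1, "serious": 2},
    "attitude": {"motivated": 0, "dependent": 3, "negative": 5},
}
CRIMINAL_HISTORY = {
    "age_first_conviction": {"24_or_older": 0, "20_to_23": 2,
                             "19_or_younger": 4},
    "prior_supervision": {"none": 0, "one_or_more": 4},
    "prior_felonies": {"none": 0, "one_or_more": 4},
    "assaultive_offense": {"no": 0, "yes": 15},
}


def risk_score(answers, burglary_group=False, checks_group=False):
    """Sum the item points for one probationer.

    `answers` maps each item name to one of its category labels.
    The two offense groups add 2 and 3 points, capped at 5 in total
    (an assumed reading of the table's "do not exceed" note).
    """
    total = 0
    for table in (STABILITY, CRIMINAL_HISTORY):
        for item, points in table.items():
            total += points[answers[item]]
    offense_points = (2 if burglary_group else 0) + (3 if checks_group else 0)
    return total + min(offense_points, 5)


lowest = {
    "address_changes": "none", "time_employed": "60_or_more",
    "alcohol": "none", "other_drugs": "none", "attitude": "motivated",
    "age_first_conviction": "24_or_older", "prior_supervision": "none",
    "prior_felonies": "none", "assaultive_offense": "no",
}
print(risk_score(lowest))
print(risk_score({**lowest, "assaultive_offense": "yes"}))
```

The second call shows how heavily the assaultive offense item weighs: on its own it contributes 15 points to an otherwise zero-scoring case.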
In my judgment, the instrument still has no equals and probably will never be surpassed regarding the impact it had upon probation.
The New York State Risk Instrument was selected for this analysis because it is one of the newer probation instruments that was actually validated and tested. It also demonstrates the current functioning of risk assessment in a major industrial and populous state.

Figure 2 NYS Probation Caseload Growth vs state prisoners since 1925
Source: The probation data was obtained from NYS DOPCA. The corrections data was obtained from Historical Statistics of prisoners U.S Department of Justice 1988.
The New York State instrument was developed when the institutional
and community correctional populations were increasing at an alarming
rate. After a stable period, from the mid-1960's until the mid-1970's,
the number of probationers skyrocketed. The accompanying chart
demonstrates this incredible growth.
Originally a RAF (Risk Assessment Form) was borrowed from Wisconsin, without modification, along with the Wisconsin "Assessment of Probation Needs and Strength" form. Later a new RAF was developed but the original needs form was retained. The new instrument has been the cornerstone upon which ISP (Intensive Supervision Program) is based in New York State, and by implication it has validated a type of supervision as successful for so-called high risk probationers. It is also the cornerstone of the so-called Wisconsin Plan in New York State, in which an individual officer's workload is set not by caseload numbers but by the mix of case types as determined by the various risk scores.
The New York State Risk Assessment Instrument was developed in order to implement an Intensive Supervision Program (ISP) developed for the state. The hope of this program was to divert some of the prison bound individuals into an intensive supervision probation caseload. Probation in New York State however is administered through individual county units of government. Although a state takeover of the individual county systems into one state system has been discussed and various bills to that effect have been proposed, it has never happened. The State Division of Probation however does monitor the operation of the individual county departments because it funds a significant portion of their operation. Thus it does promulgate rules and monitor compliance.
This program was developed by the New York State Division
of Probation but the implementation of the program was to be through
individual county probation departments that were monitored by the
New York State Division of Probation (DOP). Funding for the
officers involved in the program was significantly higher than the
normal state reimbursement rate. Normally this fluctuates at the 50%
level with a variation of plus or minus 5%. It should be noted that the
rate also excludes some department budget items; thus in actuality
the rate is approximately 30%. The reimbursement rate for the
officers in this program was to be 100%. The problem was that high
risk probationers needed to be selected for the program. The
concept of high risk however is different throughout the various
counties; thus selection of high risk probationers was to be through
an impartial instrument.
It was the first instrument introduced by the NYS Division of Probation to be used by almost all of the local counties in the State; most counties began screening in December 1978, with the remaining counties operationalizing the program during January 1979.
The development of the instrument followed a survey of other
instruments in use at the time. No existing instruments were selected
for use in New York State because of validity doubts existing in the
other instruments. In this regard the developers specifically noted:
There was a decision made that if doubts as to the validity of the instruments had been a major problem throughout the country, it was likely that there would be similar doubts in New York State
The instrument was developed by a Senior Probation
Program Analyst and five student interns. As noted previously,
because of validity doubts, New York State found it necessary to
develop its own instrument. Apparently during the initial phase, no
public or professional input was sought regarding the meaning of risk.
Risk was simply defined as probation failure. Specifically it was
noted that, "rather than defining risk in some unmeasurable way,
such as the potential for harm to society or the chance for committing
a serious offense, the decision was made to define risk as the
potential for failure on probation."
Failure was then defined as reconviction for a misdemeanor
or felony while on probation, a revocation of the probation sentence
by the court, absconding, or receiving an unsatisfactory discharge.
Another criterion, being incarcerated while on probation, was initially
proposed but was later deleted.
While it is beyond the scope of this review to fully explore how the variables for the instrument were selected, a short review with an emphasis on the age variable follows. A total of 42 variables were analyzed prior to building the instrument. No mention was made of how or why certain variables were selected; however, it appears that they basically included items carried over from previous instruments.
The age variable in the New York State Instrument was tested in a limited fashion. It was tested for the age of the subject at offense, age at first arrest, and age at first conviction/adjudication, with only the following categories recorded: 19 or under, 20-23, 24-29, and 30 or over. Age, as noted previously, has usually been considered an important variable for judging future performance in the criminal justice system. This importance has normally been in regard to the age of first adjudication in the juvenile courts and of course the widely held view that antisocial behavior declines with the onset of middle age. It is thus surprising that such a valuable variable would have been structured to group all teen-agers together and all men 30 years old and above in another group. The preconceived idea possibly could have been that only some difference between teen-agers and adults, and some variation within the 20 year old group, may exist. There is however no way at this point to determine specifically why it was done. Only the result is apparent now in the instrument: someone who was 19 or under at first adjudication or conviction is an increased risk.
The 42 variables were then subjected to the crosstab program of SPSS and 18 variables were then selected.
The 18 variables were then subjected to the multiple
regression program of SPSS and it was found that 10 of the 18
variables accounted for 32% of the variance. The team however,
stated that this really was not satisfactory because they were not
interested in predictions for the total sample of 353 cases but only the
"high risk" offenders. They thus stated that "while the multiple
regression analysis did produce a total regression value that was
considered to be high, it was felt that discriminate analysis could be
of further help..."
The EIGENVALUE program within the discriminant
analysis program of SPSS was then run. This use of discriminant
analysis upon the variables resulted in a selection of a final 10. The
team then noted, "The eigenvalue obtained, which was .72, can be
considered to be a better indicator than the final multiple regression
value (.32), which was obtained in the earlier analysis."
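The two figures quoted by the team are on different scales, which makes a direct comparison hard to interpret. As a rough aid only, and assuming the reported .72 is the eigenvalue of a single discriminant function, the standard identity for the squared canonical correlation, eigenvalue / (1 + eigenvalue), places it on the same 0-to-1 scale as the regression value. The function name below is illustrative and does not come from the original reports.

```python
def squared_canonical_correlation(eigenvalue: float) -> float:
    """Convert a discriminant-function eigenvalue to a squared canonical correlation."""
    if eigenvalue < 0:
        raise ValueError("eigenvalue must be non-negative")
    return eigenvalue / (1.0 + eigenvalue)


# Values quoted in the text; their interpretation here is an assumption.
reported_eigenvalue = 0.72  # discriminant analysis eigenvalue
reported_r_squared = 0.32   # multiple regression variance explained

rc2 = squared_canonical_correlation(reported_eigenvalue)
print(f"squared canonical correlation: {rc2:.3f}")
print(f"regression variance explained: {reported_r_squared:.3f}")
```

Under that assumption the eigenvalue corresponds to a squared canonical correlation of roughly .42, so the gap between the two analyses may be smaller than the raw figures of .72 and .32 suggest.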
The development of the instrument then occurred as follows. The failure definition was tested in an eight county survey which, by determining probation success and failures (N=500) by risk score, validated the instrument to their satisfaction. The instrument was then validated again to their satisfaction, in the same manner, for use with misdemeanors (N=402). On the combined results (N=902) the instrument was deemed suitable for both misdemeanant and felon probationers, and it was so pressed into service. Later the risk instrument was changed and two items were dropped, with resulting weight changes for the remaining items; the instrument was, however, not re-evaluated as before. In the last phase of its development the instrument, by implication, came to show that Alternate Sentencing was feasible, for another study had shown that probationers and prison inmates had not differed markedly. The ISP (Intensive Supervision Program) had already demonstrated that it was able to work quite well with labeled high risk offenders in the community.
The instrument is divided into criminal history and stability items. The criminal history items are: arrests within five years, prior robbery convictions, incarcerated on a previous probation or parole sentence and the age at which anti-social behavior began. The social stability items express a concern for: not being in school or a work setting, not having a stable residence, not living in a favorable situation, or having an unsatisfactory attitude.
The items are scored in the following way.
Item                                                              Points
Arrested within (5) years of the current offense                       6
Nineteen or under at time of the first conviction/adjudication        12
Prior conviction/adjudication for robbery                             20
Incarcerated while on a prior probation sentence or parole            24
Neither employed nor in school full time                               4
One or more address changes in the year prior to current offense       6
Currently living in a situation judged to be unfavorable               8
Has an attitude that is either one in which he rationalizes
  his behavior; or he is negative and not motivated to change;
  or he is dependent or unwilling to accept responsibility.           19
Some direct similarity is apparent between the New York State Instrument and the Wisconsin scale. For example the Attitude item is a direct copy of the Wisconsin item, the address stability is based on the same time interval and the age of first conviction uses the age nineteen cutoff point. The instrument differs in regard to the amount of points awarded to subjective items that are interpreted by the scoring officer. For example, the attitude item and the living in a situation judged to be unfavorable account for 27% of the score. The stability items account for a total of 37% and the criminal history items for 62% of the score.
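The point arithmetic behind these shares can be checked with a short sketch. The weights are those listed in the scoring table; the dictionary keys, the grouping constants and the function name are hypothetical labels introduced only for illustration.

```python
# Item weights as listed in the scoring table (key names are illustrative).
CRIMINAL_HISTORY = {
    "arrested_within_5_years": 6,
    "age_19_or_under_at_first_conviction": 12,
    "prior_robbery_conviction": 20,
    "incarcerated_on_prior_probation_or_parole": 24,
}
SOCIAL_STABILITY = {
    "not_employed_or_in_school": 4,
    "address_change_in_prior_year": 6,
    "unfavorable_living_situation": 8,
    "unsatisfactory_attitude": 19,
}
# Items that depend on the scoring officer's judgment.
SUBJECTIVE_ITEMS = ("unfavorable_living_situation", "unsatisfactory_attitude")

ALL_WEIGHTS = {**CRIMINAL_HISTORY, **SOCIAL_STABILITY}


def risk_score(items_present):
    """Sum the weights of the items that apply to one probationer."""
    unknown = set(items_present) - set(ALL_WEIGHTS)
    if unknown:
        raise KeyError(f"unknown items: {sorted(unknown)}")
    return sum(ALL_WEIGHTS[name] for name in set(items_present))


history_points = sum(CRIMINAL_HISTORY.values())
stability_points = sum(SOCIAL_STABILITY.values())
subjective_points = sum(ALL_WEIGHTS[name] for name in SUBJECTIVE_ITEMS)
maximum_score = history_points + stability_points

print(history_points, stability_points, subjective_points, maximum_score)
print(risk_score(["arrested_within_5_years", "unsatisfactory_attitude"]))
```

The listed weights sum to 99 points, so the point totals and the percentages quoted in the text are nearly, though not exactly, the same numbers.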
After the inception of the program, an unofficial
belief developed concerning the ISP (Intensive Supervision
Program). Essentially the view was that the statistical numbers
concerning ISP and other probationers did not demonstrate any
significant difference. No public mention however was made of the
instrument's effectiveness or non effectiveness. In the Fall of 1985
however, the instrument did come under some professional criticism
concerning its ability to discriminate among probationers.
At that time it was believed that the only reason the program continued to exist, was that it allowed special one hundred percent funding for ISP officers in the New York City Probation System. New York City, of course, had the largest numbers of ISP officers in the state. The New York City Probation System, at that time, was also suffering from massive neglect and a lack of focus. The one hundred percent funding of ISP officers in that location represented a considerable sum of money. ISP was retained and it was unofficially alleged that misdemeanor probationers were allowed in the program as a compromise to upstate jurisdictions.
At this time, it must be remembered that one reason the New York State Risk Instrument was developed was to select high risk felony probationers for the new ISP program. It was thus an essential feature of the selection of high risk offenders. Slowly, however, the State of New York backed away from its risk prediction instrument and from the implication that the instrument identified high risk felony offenders.
On June 26, 1986 a memorandum was issued by the New York State Division of Probation and Correctional Alternatives regarding a program modification to the ISP program. These changes were in response to legislative and executive initiatives. The program would now become involved with felony cases prior to sentencing and after sentencing. These new cases would be considered ASP (Alternate Sentencing Program) cases and the selection of offenders into the program would not be dependent upon the risk instrument. The new case designation would meet judicial and program characteristics that were independent of the risk instrument. These new cases were also to have supervision standards that were tighter than the traditional ISP cases.
ISP standards called for four face to face contacts per month, four collateral contacts per month and one home visit per month. The new ASP cases required eight face to face contacts per month and two home visits per month. Collateral contacts remained the same as ISP cases.
Unofficially the New York City Probation Department had
agreed to supervise some cases prior to formal sentencing by the
courts.
This practice continued and its use, by the judicial system,
started to rapidly expand. At that point the practice could not be
stopped. The new ASP program incorporated this feature. Cases
were supervised prior to sentencing, special enhanced presentence
investigations were completed and they were sentenced to probation
with a special condition of ASP supervision. It was never clear
whether the practice was incorporated and formalized because it
could not be stopped or because it was a good idea.
Still later it was planned that ISP/ASP caseloads would only
receive special funding if 40% of the cases for the department were
of the ASP variety.
In March of 1990, the state director notified Probation Directors that the program would be changing and that these changes would become effective on January 1, 1991. Administratively Assigned Cases, those selected with the risk instrument, were to be phased off ISP caseloads by 10/1/90. On 4/1/91 funding would only be based on the Alternatively Sentenced Cases (ASP) and no funding would be granted for the Administratively Assigned Cases (ISP). Those case designations were determined by the risk instrument scores.
Thus in a relatively brief period, the instrument went from being considered a good tool to identify risk, to a method of identifying high risk felony and misdemeanor offenders. It then made the transition to a method of administrative selection, to no use at all in identifying cases for special state emphasis.
HOW THE MAJOR INSTRUMENTS COMPARE
In the previous chapter the two instruments that have had a major effect on the field were examined. On the probation side of community corrections the Wisconsin Risk Instrument has inspired many other instruments, both as a direct copy and as a starting model. The Salient Factor Score is the premier instrument of the parole systems. It was the first instrument developed and it set the example for all others to follow. The Salient Factor Score also defined for community corrections what its major concerns should be. The instrument defined risk and thereby stated to the society at large that community corrections was concerned with offender risk to the community.
The New York State Risk Instrument will also be examined in this chapter. It is a more recent instrument that was validated on a sample from a major industrial state. It thus had the opportunity to avoid the shortcomings and oversights of previous instruments. As mentioned in the previous chapter, New York State chose to develop its own instrument because of validity concerns with the other instruments that were in use at the time.
A comparison will now be made between the instruments, to demonstrate how they are similar and different in design. We will also examine the instruments to determine if they do what they purport to do: determine risk of failure.
All the instruments use multiple criteria of failure. Such multiple measures usually include new convictions, technical violations and outstanding warrants. There are a number of advantages to using such broad measures of failure. The obvious one is that all negative behavior is accounted for. When this happens, however, the focus is lost. Is someone who has community supervision revoked because of a technical violation, such as failing to report as required, really the same threat level to the community as someone who has been convicted of committing another felony in the community? In all fairness to the Salient Factor Score, that instrument does limit the meaning of technical violation in that it has to be serious enough to result in parole being revoked.
All three instruments also list absconders, someone that has fled the jurisdiction or cannot be found, as a failure. That poses a question: is someone who cannot be found a threat to the community? If they are involved in another offense, then they would be counted in the commission of a new crime category. Absconders thus are more of a threat to the credibility of the agency, than a threat to the community.
The probation instruments generally have the lowest definition of failure in the selection of threat to the community items. The New York State instrument probably uses the loosest category in its criterion, that of receiving an unsatisfactory discharge. An unsatisfactory discharge is usually submitted to the sentencing court because the agency cannot work with the offender in the community. The result of this discharge however is that they are removed from the probation supervision system but they still remain at liberty in the community. They therefore cannot be a threat to the community because they are allowed to remain in the community without supervision. The Wisconsin instrument also includes "rules violations" which does not even imply a revocation by a court.
There is another reason why such wide band measures of failure are used. It is very difficult to obtain high measures of variance explained with a very low base rate of behavior. By broadening the definition of failure, the base rate is made larger and by doing nothing else, the level of variance explained will increase. The danger in doing this however is that a watered down definition of failure results and the instrument now measures individuals who fail and "sort of" fail. Failure is thus no longer specifically defined as protection of the community.
All three of the instruments are structured around a heavy use of criminal history items. The Salient Factor Score, in its current incarnation, is almost entirely composed of criminal history items, the only exception being the history of heroin/opiate dependence.
The major difference among the three instruments is how risk is defined. The simple choice is to say that person X has some positive attributes that will prevent failure or the person has some negative attributes that will ensure failure. The probation system instruments are structured in negative attribute terms and the parole instrument in positive terms.
All three instruments use criminal history in various ways. None of the instruments are sophisticated in this measure. They take samples of the criminal history but never see it. The use is akin to boring test holes in a field and trying to determine the geology of the landscape. Depending on the number of test holes and their depth, the geology can be inferred but never seen in its entirety. All three measures try to determine if the criminal behavior started at a young age. The New York State instrument alone uses a single cutoff point; the other two instruments use a range of values. All the instruments try to judge the extent of the record by accounting for the number of prior offenses. Only the New York State Instrument and the Salient Factor Score try to account for a crime-free period. As noted previously, only the Salient Factor Score extensively uses criminal history. Probation instruments, in general, are evenly split between criminal history and stability items.
A Visual Comparison Using Charts
When statisticians examine data they are usually examining it through mathematical relationships and the results are usually some mathematical indices. The indices usually refer to different aspects of the data. If we were to follow the same method, in this section, we would all have to have the same knowledge of the measures and it would take many measures. This writer would also have to have either the results of all the indices for each instrument or the raw data in order to produce the indices. Uniform indices are not available and neither are the raw data. A different method will thus have to be used.
We will thus use information listed from professional journals and internal reports. These data have been transformed to yield similar ranges and to reveal the number of failure cases. Usually the results are listed in terms of the percentage of cases that succeed at various risk scores or the percentage of failures in each score. Once the number of individuals is known in each score level, then it is possible to compute from the percentages the number of success and failures in each score.
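The conversion from a published percentage back to case counts can be sketched as follows. The two example rows use figures from the Wisconsin two year follow-up listing quoted later in this chapter; the function name is illustrative, and rounding to the nearest whole case is an assumption about how the published percentages were produced.

```python
def counts_from_failure_rate(n_cases: int, failure_pct: float):
    """Recover (successes, failures) from a group size and a failure percentage."""
    if n_cases < 0 or not 0.0 <= failure_pct <= 100.0:
        raise ValueError("invalid group size or percentage")
    failures = round(n_cases * failure_pct / 100.0)  # nearest whole case
    return n_cases - failures, failures


# (cases in the score interval, published revocation rate in percent)
examples = [(543, 0.92), (362, 25.97)]
for n_cases, pct in examples:
    successes, failures = counts_from_failure_rate(n_cases, pct)
    print(f"n={n_cases}: {successes} successes, {failures} failures")
```

When a source lists the percentage of successes instead, the same arithmetic applies with the two roles exchanged.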
Charts have long been used to give meaning to relationships in data. They also have the ability to convey massive amounts of information at a glance. This method requires no special knowledge of statistics and no agreement concerning the indices to use. Since we are to examine trends and the relationships are not that complex, simple graphs will do nicely.
The data must possess a certain relationship to be meaningful. If we were to examine a sample of 5,500 offenders who were scored on a perfect instrument the results would have a certain form. If we assumed that we had an equal number of success and failures, it would have a certain pattern. If we examine the accompanying chart, we immediately see such a pattern. Success cases diminish as the score increases and failure cases increase with the relative score. Both measures, success and failure are true straight lines, because they are theoretically perfect relationships on our theoretical instrument. All the lines also go to the corners, because we have a theoretically perfect instrument.
In the real world we rarely have such perfect relationships. We also, in the field of community supervision, do not have an equal number of success and failure cases. Failures are usually far fewer than the success cases. Failure in the instruments we will examine will be approximately 10-25% of the total sample in the probation instruments, and somewhat higher in the parole instrument.
As previously mentioned, the base rate of the occurrence also affects the maximum correlation attainable, and this will be noticeably apparent on our charts. At the thirty-three percent level of failure, the line representing failure has fallen over toward the base of the chart but the line representing success has remained in the same location. This graph would represent the higher boundaries of a parole sample failure rate. The result of the relationship between failure and the risk score is still linear but the relationship between the two measures is not in agreement.
If we now examine a perfect relationship with a twenty-five percent failure rate, we notice the line has fallen farther still. This is a lower level than a parole base rate and the New York State Instrument rate, but higher than the Wisconsin Risk Instrument. The Salient Factor Score had failure rates for the various samples of approximately 25-30%.
If we examine the rate at the nine percent level, we see that the line is almost approaching the base of the chart. At lower levels when it is virtually at the base of the chart, we can see that the majority of variance explained is due to the representation of the success cases, not the failure cases. The Wisconsin Instrument has a failure rate of approximately eleven percent.
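The way the failure line sinks toward the base of the chart as the base rate falls can be reproduced numerically. The sketch below is a reconstruction under stated assumptions, not the procedure used to draw the charts: it assumes eleven score levels and a peak of 500 cases per line, values chosen only because they make the fifty percent case total 5,500, and it holds the success line fixed while scaling the failure line to the target rate.

```python
def theoretical_lines(failure_rate, levels=11, peak=500.0):
    """Return (successes, failures) per score level for an idealized instrument.

    The success line falls linearly from `peak` to 0 and is held fixed;
    the failure line rises linearly and is scaled so that failures make up
    `failure_rate` of all cases. `levels` and `peak` are assumed values.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    step = peak / (levels - 1)
    successes = [peak - i * step for i in range(levels)]
    unscaled = [i * step for i in range(levels)]
    target_failures = sum(successes) * failure_rate / (1.0 - failure_rate)
    scale = target_failures / sum(unscaled)
    failures = [value * scale for value in unscaled]
    return successes, failures


for rate in (0.50, 0.33, 0.25, 0.09):
    successes, failures = theoretical_lines(rate)
    print(f"failure rate {rate:.0%}: success peak {max(successes):.0f}, "
          f"failure peak {max(failures):.0f}")
```

Both lines stay perfectly linear at every rate; only the height of the failure line changes, which is the visual effect described for the theoretical charts.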
The New York State Instrument is a good place to start. It is one of the more recent instruments and it also represents an instrument from an urban, industrialized state. It should be noted that the data are from validation studies conducted before two items were deleted from the instrument because they allegedly did not provide much predictive power. A validation study was not completed after the change, however, so the assumption had to be that the relationships remained fairly constant. We have no choice but to make the same assumption, that the distribution would be similar today.
Each chart is proportioned so success and failure are at the same ratio. The charts are arranged with the risk score at the very bottom. An ideal relationship in the data would have a characteristic pattern. As the risk score increased we would expect the number of probation successes to decrease. As the risk score rose we would expect the number of failure cases to rise.
Now let us look at the real world, with a chart of felony success by risk scores. The data for the following two charts are the basis for the validation of the initial risk instrument, which formed the selection basis for the ISP program in the state.
Figure 7 The New York State Instrument
Felony Success Cases by Risk Score
Source: ISP Evaluators Report #1, 1979
In this first presentation we will show successes and failures separately in bar charts. We will also use the maximum amount of data available so every nuance can be observed.
Success, as expected, is concentrated in the lower scores at the left side of the chart, with a generally downward slope as the risk score increases. The point of highest success is, of course, noticeable at the far left of the graph with a score of 4. A great deal of variation, however, is noticed in the slope. We would thus say that increases in the score do not uniformly lead to a reduction of probation success. An example of this may be noted by observing that more cases succeed at a score of 38 than at a score of 6. Another example would be that more cases succeed at a score of 22 than at a score of 10. The trend however is downward as the risk score increases. Noticeable however, after the downward trend, is an increase of probation successes that forms a bulge between the scores of 48 and 64.

Figure 8 The New York State Instrument
Felony Failure Cases by Risk Score
Source: ISP Evaluators Report #1, 1979
If we now observe the graph of probation felony failure cases by risk we would expect a mirror image of the probation success graph. At first however, we are struck by the fact that failure cases are not confined to the right side of the graph. We also observe that more cases fail below a score of 50 than above it. We also note that no uniform increase in the number of probation failures is observed with increased risk score. The general slight upward trend only occurs to a score of 42, which is the point of most failure. The number of probation failures then declines until a score of 54 and it then increases and declines again. It is thus very difficult to determine at what point the highest risk of probation failure occurs.

Figure 9 The New York State Instrument
Felony Success and Failure by Risk Score
Source: ISP Evaluators Report #1, 1979
Let us now examine that information in the degree of resolution that is available for the other instruments, combining the felony failure and success cases together. In the accompanying chart we see that success resembles a gentle downward sloping curve that almost intersects with the base of the chart at a score of 80. Failure rises to a score of 30 and then slightly decreases to a score of 40. It then rises again to a score of 50 but then declines to 60 and then rises again to 70. It then declines again from that point. While the success cases follow an expected pattern, the same cannot be said of the failure cases. Missing is a uniform rise of failure cases. It is also difficult to determine when failure is related to the risk score. For example, fewer failures occur at a score of eighty than at twenty.

Figure 10 The New York State Instrument
Misdemeanor Success and Failure by Risk Score
Source: ISP Evaluators Report #2, 1979
The misdemeanor chart essentially conveys the same type of information as the previous felony chart. There still is a tendency for the number of success cases to decrease with increases in the risk score. This decrease however, is not as uniform as before. There now is a rapid descent to the score of 50 and then the line proceeds along the base of the graph. Failure however is not as evenly distributed along the bottom of the graph. Failure has now become more concentrated slightly left of the center in the graph, between a score of 40 and 60. After a score of fifty, the rate of failure actually declines. As before no uniform increase of failure can be observed, as presupposed by the previous theoretical models.

Figure 11 The New York State Instrument
Felony and Misdemeanor Success and Failure by Risk Score
Source: ISP Evaluators Report #2, 1979
If we now examine the combined results of both misdemeanor and felony success and failure by risk score, we again see similar results. The number of success cases rapidly declines until the score of 50. The rate of decline is less from 50 to 70 and then we virtually run out of success cases. The mound of failure cases, that was apparent in the misdemeanor chart, has been flattened. Failure now increases to a score of 50 and then continues to decline after that point.
Again, while there is a tendency for probation success to decline with increases in the score, no such tendency for probation failures to increase with increases in the score is evident, except to a score of 50.
After viewing all the distributions it thus appears that although the instrument was designed to measure the risk of probation failure, it really utilizes measures that track the reduction of probation success and not measurements that indicate the risk of probation failure. If it was measuring risk of failure then we would have observed increases in the number of failure cases with every increase in risk score. This would have occurred because individuals with more characteristics would have failed at higher rates. This did not occur and one cannot really say, in the general sense, more failures occur because of the inclusion of the characteristics.
Why does the instrument appear to work then? If we look at the results of the felony validation study we see impressive Chi-square results (level of confidence .999+) that tell us that the distributions by risk score for success and failure are very different and in all probability could not have occurred that way by chance. Looking back at the bar charts we would also come to that conclusion. If we examine the listed mean, median, and standard deviation scores for both success and failure, we would also see numbers that say the distributions are very different. Looking back at the bar charts again we would also agree. If we examine the chart included in the evaluators' report that lists success and failure by three core groupings, we see that within a range of scores from 64 to 100 the composition is only 8.1% success cases and 91.9% failure cases. If we look at the portion of the chart that lists scores from 0 to 26, we see that within that range the composition is 90.4% success cases and 9.6% failures. The implication then is that failure increases with risk score. In reality though, as we have seen, the number of failures remains almost constant throughout the range. Thus, in the lower scores more success cases are apparent and the relative percentage of failures is less. In the higher ranges there simply are fewer success cases and the relative number of failure cases is increased. The percentage numbers therefore imply an ability of the instrument to identify cases that are likely to fail, but this ability simply does not exist. The projected ability is a function of the presentation, not a representation of reality.
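The presentation effect described here can be shown with a small worked example. The counts below are hypothetical and chosen only for illustration; they are not taken from the evaluators' report. The number of failures is held constant across three score bands while the number of successes falls.

```python
def band_composition(successes: int, failures: int):
    """Return (percent success, percent failure) within one score band."""
    total = successes + failures
    if total == 0:
        raise ValueError("band has no cases")
    return 100.0 * successes / total, 100.0 * failures / total


# Hypothetical bands: (label, successes, failures). Failures never change.
bands = [("low scores", 450, 50), ("middle scores", 150, 50), ("high scores", 5, 50)]

for label, successes, failures in bands:
    pct_success, pct_failure = band_composition(successes, failures)
    print(f"{label}: {failures} failures, {pct_failure:.1f}% of the band")
```

In this sketch the failure percentage climbs from 10% to about 91% even though exactly the same number of cases fail in every band, so the rising percentage reflects only the shrinking pool of success cases.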
This kind of relationship can give the appearance that the instrument is really working when, in fact, it is not. It thus may appear to be identifying cases that are likely to fail, when it is only identifying an individual's relative position in a group of success cases. This may be illustrated if we use the example of Officer X and his impressions of the instrument. Officer X completes an initial assessment of the felony probationer Y and records a score of 50. At this level the numbers that succeed and fail are about equal. We thus could assume that the rate of failure and success is 50/50. If probationer Y succeeds, then Officer X pats himself on the back, but if he fails the officer has confirmed that he is a case with a high risk of probation failure. Over time Officer X will notice that many people tend to fail within the high range and he will come to believe that there is something to the score.
On the other side of the coin, Officer X assesses probationer Q and he observes a score of 4. If we refer again to the bar charts, we see that in the previous validation study 41 cases succeeded and 2 failed at that score. The ratio is thus approximately 20 successes to 1 failure. Over time, Officer X will tend to see many probationers succeed within that score and will come to believe that there is something to the score. This impression however will be based strictly on the relative numbers of cases he sees in each score range, rather than on the instrument's ability to identify those factors in the probationer's data base that will indicate he or she is likely to fail on probation.
A listing of success and failure by risk score is not easily obtainable for the Wisconsin Instrument. After almost believing that no listing was made in the literature and internal documents, a small listing was found in a two year follow-up report that was obtained. This listing however is not without problems for the purposes of this analysis. The score intervals are not equal. The reason for this inequality was expressed in a note attached to the Initial Risk Score. It reads:
Scores were aggregated (for this presentation) to the point where an additional increment in risk scores was accompanied by a significant increase in the revocation rate. The 15 points assigned to assaultive offenders were not included in risk scale computations because this item is not predictive of continued criminal activity.
UNITS   SCORE   NUMBER   SUCCESS   REVOKED     RATE
(4)     0-3        543       538         5    0.92%
(4)     4-7      1,124     1,096        28    2.49%
(2)     8-9        492       464        28    5.69%
(2)     10-11      387       349        38    9.82%
(3)     12-14      432       378        54   12.50%
(5)     15-19      498       420        78   15.66%
(5)     20-24      362       268        94   25.97%
(5)     25-29      252       158        94   37.30%
(6)     30         141        81        60   42.55%
4 Successes and failures by score interval in the Wisconsin Instrument. Source: The Two Year Follow-Up Report
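The listing can be checked for internal consistency, and the overall revocation rate recovered, with a few lines of arithmetic. The sketch below uses only the NUMBER and REVOKED columns as listed; the variable names are illustrative.

```python
# (cases in the interval, cases revoked), in the order listed in the table.
wisconsin_rows = [
    (543, 5), (1124, 28), (492, 28), (387, 38), (432, 54),
    (498, 78), (362, 94), (252, 94), (141, 60),
]

# Revocation rate per interval, in percent.
interval_rates = [100.0 * revoked / number for number, revoked in wisconsin_rows]

total_cases = sum(number for number, _ in wisconsin_rows)
total_revoked = sum(revoked for _, revoked in wisconsin_rows)
overall_rate = 100.0 * total_revoked / total_cases

for (number, revoked), rate in zip(wisconsin_rows, interval_rates):
    print(f"{number:>5} cases, {revoked:>3} revoked, {rate:6.2f}%")
print(f"overall: {total_revoked} of {total_cases} revoked ({overall_rate:.2f}%)")
```

The recomputed interval rates match the published RATE column, and the overall rate of about 11.3% agrees with the failure rate of approximately eleven percent cited for the Wisconsin Instrument.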
The accompanying chart reveals that the revocation rate does indeed rise quite nicely with the intervals as presented. The arrangement of the data however will not suffice for our purposes. We need to determine how failure is presented by the risk score.

Figure 12 The Wisconsin Instrument with the original data.
Source: Two Year Follow-Up Report, 1979:10.
The original data did not include a listing, along the left side of the chart, of the units covered by each of the listed intervals. The risk score intervals are thus very unequal. The chart reveals that two of the intervals contain four units each. Two intervals contain two units of scores each. One interval contains three units. Three intervals contain five units each and the last interval contains at least 6 units.

Figure 13 The Wisconsin Instrument
As modified through the transformation of the data in the Two Year Follow-Up Report
Source: Two Year Follow-Up Report, 1979:10.
While this scheme does highlight the increased revocation rate within the scores, it distorts the visual graph for our purposes, as noted in the accompanying chart. The success portion is heavily distorted with peaks and valleys, but the failure portion is uniform and indicates a very gentle rise with risk scores. As noted earlier however, this is not a true representation of the data.
Most of the intervals contain five score units. If we transform the data to approximate groups of five units each, another picture emerges. This however is an approximation because we do not have the actual data. In the transformation process the overall success cases have decreased by 12% and the revoked cases by 5%. The rate of failure however has changed by only 0.74%. It is now 12.06% failure.
In the chart (Figure 13), we now observe that the success rate has been evened out and a nice gentle curve has resulted. Success cases are reduced with increases in the score. The failure cases however take on a different form. They are relatively stable until a score of 9 and then they rise until a score of 14. They then remain relatively stable from a score of 14 until a score of 29. A small drop then occurs after that point.
Can we speculate concerning the instrument? A cautious speculation perhaps, because the data are very limited and transformed. Without the light of further data, the assumption must be that the Wisconsin Instrument has very similar characteristics to the New York State Instrument. It appears to work because the level of success cases drops with each score, not because the level of failure cases rises with each score. It does not appear to differentiate among the failure cases, those that are a threat to the community and the probation system.

Figure 14 The Wisconsin Instrument as applied to New York City Probationers
Source: Clear and Dickson 1984:114
In 1984 Wright, Clear and Dickson reported on an application of the Wisconsin Instrument to the New York City Probation Department population. That information is enclosed in the accompanying chart. They note that from the data alone, "It is difficult to identify positions in the distribution where significant jumps in the failure rate occurs." This is very evident in the accompanying chart, generated for this thesis. We see failure peaks in the 10-14 range and the 20-24 range. An extreme drop, in failure cases, also occurs in the 15-18 range. The authors of the application, in New York City, also note that those
who receive low scores in the original instrument (10 or less) do well on probation and have a low failure rate. However, the ability of the model to distinguish success and failures tends to break down after this point.
This is also evident in our visual chart. The instrument indicates a rise in failure cases until the interval 10-14. The rate of failure then decreases until the 15-19 point. It then rises again. The instrument thus differentiated poorly among Wisconsin probationers, and this tendency increased when it was applied in New York City.
The Salient Factor Score (SFS) is the premier instrument in the parole field. It was the first instrument to be a large scale success. It has not, however, rested on its laurels. As noted in the previous chapter, it has gone through two revisions.

Figure 15 The Salient Factor Score 1970 Construction Sample
Source: Hoffman and Beck 1976:71.
The 1970 construction sample was composed of a robust nine hundred cases and this tendency to use large samples has continued throughout the versions of the instrument. As noted previously the Salient Factor Score is also different from the two probation instruments reviewed. It uses positive items and has a heavy reliance upon criminal history items. This however has not worked to its detriment, as noted in the accompanying chart. This chart (Figure 15) represents the 1970 construction sample that was first used to design the instrument.
A characteristic of the SFS graphs will be the tracking of the success cases with the failure cases. This characteristic is apparent in all the graphs.
As noted above, failure does indeed rise with the lessening of the score. The SFS is also very different from the probation instruments, in that a lower score implies higher risk. If someone has a positive attribute then they receive points. This is directly opposite from the probation scales reviewed. In those scales points are added with negative attributes. As noted in the chart above, failure increases until a score of 4. It then declines moderately until a score of 3 and then it plummets to a score of 1 and 0.
Cases that succeed also track the failure cases in an orderly fashion. There is almost a parallel tracking of both types of cases until a score of 4; below that point the parallelism collapses and both run downward almost uniformly.

Figure 16 The Salient Factor Score 1970 Validation Sample
Source: Hoffman and Beck 1976:71
This pattern is also expressed in the accompanying chart (Figure 16) of the 1970 Validation Sample. This sample represents an even more robust 1,581 cases. As noted above, the number of cases that succeed also rises with a diminishing of scores. The instrument thus is not judging cases that succeed very well. It is however judging cases likely to fail well until the diminished score of 4 is reached. At that point the number of failures plateaus and then falls off again.

Figure 17 The Salient Factor Score 1972 Validation.
Source: Hoffman and Beck 1976:71.
In 1972 the instrument was again validated. In this accompanying table (Figure 17) we can again see the characteristic tracking of success and failure cases. In this example however, there is a noticeable drop of success cases between the values of 8 and 5. Also noticeable again is that success cases are increasing with a diminishing of the score. The instrument however is not designed to determine successes but failures and the instrument is showing a definite rise of failure cases with each lowering of the score until a value of 3. At that point the numbers of both the success and failure cases rapidly diminish.

Figure 18 The Salient Factor Score Sample II of the 1981 Revision.
Source: Hoffman and Beck 1983: 541.
In the last revision of the instrument we see a dramatically different representation of the data. As noted in the previous chapter, Sample I was composed of the combined early samples and Sample II was a later sample. In this example (Figure 18) there is much distortion in the success sample but a nice slope of increase in the failure group until a score of 2.

Figure 19 The Salient Factor Score Sample I of the 1981 Revision
Source: Hoffman and Beck 1983: 541.
In Sample II (Figure 19) the success sample is taking on the form of the previous probation instruments with a decrease of success cases with the corresponding decrease in score. As noted previously, the probation instruments registered increased risk with an increase in score. There is also a noticeable flattening of the representation of the failure cases in this sample.
The last view of the Salient Factor Score concerns the five year follow up study of failure cases that was performed. In this view (Figure 20) the larger number of success cases is apparent. It is also apparent that the number of failure cases has been reduced. This has led to an overall flattening of the failure cases but a noticeable rise in the level of failure cases still occurs until a score of 5. There is a small dip after the score of 5 but then an increase from the score of 4 to 2. After the score of 2 the level decreases rapidly.

Figure 20 The Salient Factor Score Five Year Follow-Up Study
Source: Hoffman and Beck 1985: 505.
Of the three instruments reviewed it is apparent that only the Salient Factor Score has tracked failure cases in any consistent order.
There are many paths to solving problems and there are many problems that need solutions. Once the problems have been established and priorities set, then plans of action are formulated. Policy is a strategy for action. It is a plan for action, a way of dealing with social problems or issues.
Policy is usually in place in a governmental agency but it is not always stated. Policy is not static. It can be altered by philosophical beliefs, external changes, economic changes, physical changes in the environment, legislative changes, legal reappraisals (case law), budgetary changes, new theories, leadership changes, etc. Martin and Fitzpatrick, while using the term policy to note "general operating principles derived from theory which are used to guide the development and use of concrete services and program activities," explained that policy formulation, development and implementation are not always theory dependent or the result of a theory. Specifically they note
...policy is not settled simply on the basis of theory. Policy is also dictated by acceptable social values and philosophies. Furthermore, the progression is not always from theory to policy to program. Sometimes program leads back to policy and even to theory. For example, in times of crisis, programs may be established because of the need to do something - anything - and out of this, policy, even a theoretical justification, may emerge. Then too, experience with a particular program may suggest ways in which theory and policy should be modified and changed.
Very rarely does a governmental system formally announce a change in policy and ask for comments. The only case in which this must occur is in the area of changes to the executive law that governs the rules most departments in the executive branch follow. If no law changes are needed then the proposed changes are not spelled out in detail. How do we then evaluate a course change?
A policy change occurred when community corrections switched from judging officer performance by raw caseload size to a level-of-supervision workload method. This change altered the course that the agencies would follow in the future. This new course also set many other factors and policies in motion. Initially the change took the form of classification by the emphasis applied, with the various supervision levels set by the individual officers. Then formalization occurred with definitive instruments, such as risk and needs scales. This change in policy resulted from the realization that not every case could receive the same emphasis and that the system wished to determine where the emphasis was to be applied.
Risk Assessment Instruments determine the failure potential of a probationer and they thereby alert the probation system and individual officer to that potential. Once the hazardous case is identified, either increased emphasis can be applied or specialized programs can be brought to bear, to prevent failure or limit the degree of failure.
Risk Assessment Instruments for the probation field are usually arranged as a checklist questionnaire concerning the probationer. Demographic or characteristic items that are deemed of importance are arranged along the left. The item value that corresponds to the probationer is placed in the right hand column. The right hand column is then totaled at the completion of scoring and the resulting score yields the level of supervision. Degrees of supervision usually are arranged in three or four groups, such as intensive, active and special or high, medium and low.
Need Assessment Instruments are frequently used in conjunction with the Risk Assessment Instrument. The scoring format is the same as that of the risk instruments. When they are used, the level of supervision is usually set by the higher of the risk and needs scores.
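The scoring procedure just described can be sketched in a few lines of Python. This is only an illustration of the checklist format: the item names, item values and cut points below are hypothetical and are not taken from any instrument reviewed in this thesis.

```python
# Hypothetical cut points (minimum score, level), listed from highest to lowest.
# They are assumptions for illustration, not values from any real instrument.
RISK_CUTS = [(15, "high"), (8, "medium"), (0, "low")]
NEED_CUTS = [(30, "high"), (15, "medium"), (0, "low")]
LEVEL_ORDER = {"low": 0, "medium": 1, "high": 2}


def total_score(item_values):
    """Total the right hand column of the checklist."""
    return sum(item_values.values())


def level_for(score, cuts):
    """Return the first level whose minimum score is met."""
    for minimum, level in cuts:
        if score >= minimum:
            return level
    return cuts[-1][1]  # scores below every minimum fall in the lowest level


def supervision_level(risk_items, need_items):
    """Set supervision by the higher of the risk and needs classifications."""
    risk_level = level_for(total_score(risk_items), RISK_CUTS)
    need_level = level_for(total_score(need_items), NEED_CUTS)
    return max(risk_level, need_level, key=LEVEL_ORDER.get)


# Hypothetical probationer: low on risk (7 points), medium on needs (16 points).
risk_items = {"prior_convictions": 4, "age_at_first_conviction": 2, "employment": 1}
need_items = {"substance_abuse": 10, "vocational": 6}
print(supervision_level(risk_items, need_items))
```

In this sketch the needs score raises the probationer from the low to the medium category, which is the effect of letting the higher of the two classifications govern.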
Risk Assessment Instruments fulfill an organizational need within the probation system. This desperate need for an organizational device exists on both the state and local levels. Substantially higher caseloads over recent years have forced the realization that not all cases can receive the same degree of emphasis. The community corrections system has thus accepted the premise that the majority of the cases will have cursory contact with the officer, that only some cases will have moderate contact and that the smallest portion will have relatively intense contact. Prior to the acceptance of this new order, officer performance had been judged by the number of cases the officer was handling at any given moment, and officers arbitrarily set the contact frequency between themselves and the probationer.
The advent of the Risk Assessment Instrument however allowed cases to be formally classified regarding contact frequency and this ushered in the policy of differential supervision. Differential supervision was initially known as the Wisconsin Plan. Since the promotion of the concept by the National Institute of Corrections it has increasingly become known as the NIC Proposal.
In this new thrust the old order of raw caseload size has been redefined to be the relative numbers of probationers in each degree of supervision category. No longer is the standard the raw number of probationers, but a workload time measurement of the individual officer's total caseload in each type of contact category.
This new policy of differential supervision has the effect of implying to budget departments and heads of different units of government that something scientific occurs on a probation officer's caseload. It also assures administrators that successful case management is occurring because each case must be administratively reviewed periodically. At that time the supervision category is continued or changed. Risk Assessment Instruments have thus met a host of organizational needs within the probation system and they have become a permanent feature of the organization.
Probation started as a cottage industry. Initially probation serviced the court directly. Later it developed into small agencies. The organizational system that developed in these small agencies was rather haphazard. Officers were assigned cases and officers completed reports for the courts. The development of formalized case management and classification instruments was a consequence of the need for management to provide oversight as these small agencies became large and complex.
As the small agencies grew larger and the volume of cases grew larger some specialization occurred in the cottage industry. The first phase of this specialization was along the dimension of the major types of courts served. The criminal and family courts were then differentiated. The next phase of this specialization was the separation of the supervision and investigative functions. Some officers would work with those sentenced and some officers would complete the reports requested by the courts that the agency served.
As the cottage agency grew the truly generic officer was abandoned and specialization came to the field. Supervision of the cases however was still generalized, with the only gauge being the number of cases. The management oversight of the cases was only through the Supervising Probation Officer, the individual who had experience as a line officer and now supervised and directed a unit of officers.
In the early phases this system worked quite well. The raw number of cases, by the standards of today, was low; the individual officers knew their probationers and the supervisors were aware of what was happening with the cases and how the officers were functioning. All that higher management was aware of was the number of cases an officer had, the number of officers the supervisor had and the number of cases in a supervision unit. That was all the information they needed to know or cared to know. If an officer acquired too many cases, that officer's supervisor would compensate. He would either restrict the flow into that officer's caseload or reassign cases. As a unit received too many cases or officers, management assigned more officers or another supervisor. Quality and officer performance were always ensured by the supervising officer.
As the systems grew larger this simple arrangement worked quite well until the ability to assign more officers and supervisors became impeded because of growth restrictions. The main cause of this restriction was the increased growth of the institutional component of corrections.
With thirty cases per officer and five officers per supervisor it can be seen that the officers would know their cases and the supervisor could know something of the more difficult cases in this pool of 150 cases in the unit. As the number of cases per officer reached 50 cases the pool became 250 in the unit. At 100 per officer there were now 500 cases in the unit. When the system compensated by adding more officers the ratio of supervisors did not remain constant. A unit then could have 150 cases per officer and possibly 10 officers per supervisor. The unit thus had 1,500 cases. Somewhere along this continuum the system's ability to provide managerial oversight broke down. At that point the supervisor was no longer able to ensure consistent quality and officer performance and no other mechanism was on hand to fill that void.
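The arithmetic behind this continuum is simple multiplication, and a short Python sketch makes the growth of the supervisor's pool explicit. The function name is introduced here only for illustration; the four scenarios are the ones given in the paragraph above.

```python
def unit_pool(cases_per_officer, officers_per_supervisor):
    """Number of cases one supervisor is nominally responsible for."""
    return cases_per_officer * officers_per_supervisor


# (cases per officer, officers per supervisor), as described in the text.
scenarios = [(30, 5), (50, 5), (100, 5), (150, 10)]

for cases, officers in scenarios:
    pool = unit_pool(cases, officers)
    print(f"{cases} cases x {officers} officers = {pool} cases in the unit")
```

The pool grows tenfold between the first and last scenario while the unit still has a single supervisor.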
The first managerial question that emerges, given the absolutes that the number of cases per officer cannot be reduced and that no more supervisors can be assigned, is: are all the officers working equally well? A corollary to this is: how do we know where to assign resources? These are natural questions given that the supervisor cannot actively monitor all aspects of the individual officer's work. The supervisor at those high ratios can only react to crisis situations, and all the officers would be requesting additional resources.
The second major question would be, are all the cases that might reflect negatively on the agency being adequately supervised? Again, since the supervisor is only reacting to crisis cases and situations, management has no mechanism to judge this situation.
The third major question is, how do we absorb more cases per officer, when the number of cases per officer is already so high and how in the future do we demonstrate that we require more officers when no one accepts recommended caseload sizes?
The policy implications of Risk Assessment Instruments have been manifold. Prior to the adoption of Risk Assessment Instruments there was a belief that a wide range of probationer types was being sentenced to probation and that each officer was expected to approach probationers with different strategies. Probationers were seen as clients and the measure of the work performed was the number of clients that were treated. The policy implication of Risk Assessment Instruments assumes that not all probationers need or require the same case emphasis. It also assumes that this case emphasis can largely be determined by an instrument whose emphasis is only the offender and that only contact levels need to be changed for different probationer risk types. The result of the pursuit of this policy, of Risk Assessment and Differential Supervision, has been that a larger number of probationers are being handled with only small increases in staff.
The forces that propelled classification into Probation did not act upon probation until quite recently. Petersilia and Turner note that "classification instruments began to influence probation field services in the mid-1970s..." Citing a State of Wisconsin document, they note that, "the probation departments were in serious need of an appropriate and systematic way to allocate their limited staff resources." This is consistent with the popular belief in the probation field, that the state agency was in serious danger of budget cuts unless it could be demonstrated to the state legislature that something scientific occurs on a probation caseload. In actuality, the funds were not in danger of being lost; only the release of the funds was dependent upon the development of a workload measure. "The final product of classification was expected to be a method for deploying staff based on workload reported in each office."

Figure 21 National State Prisoner Population Growth from 1925 to 1986
Source: Historical Statistics on Prisoners: 5-13.
The use of these instruments occurred when the institutional corrections populations were rapidly expanding. The accompanying chart notes the almost relentless increase in total state prisoners in this country, at that time. Specifically note the constant increase from the mid nineteen seventies until the present time.
These classification systems were almost all patterned after the Wisconsin system following the intervention of the National Institute of Corrections (NIC). This cloning and mutational replication was primarily the result of the NIC Model Probation System. The Wisconsin system was a total information package of which the risk assessment was only a portion. The advantage of the Wisconsin System was the totality of the management package and this formed the impetus for the rapid spread.
The Wisconsin system had the advantage of being a total management information system. It answered those three management questions: are all the officers working equally well, how do we assign resources and are those cases that make us vulnerable being adequately supervised?
Essentially, it was composed of three parts. Initially the Risk/Need Classification was used to assign probationers to their specific supervision levels. Caseloads were then balanced by Staff Deployment by workload and then various case supervision modalities were used in reference to the cases.
The risk level set the relative failure level within the community and classified the individual into one of the contact categories. A need score was also used to assign cases to contact categories. It differed from the risk score by not specifying failure potential but by specifying the amount of time needed for involvement with the individual. The actual assignment into a category was set by a range of scores that could be adjusted to fit the specific characteristics of the department.
Once the numbers of individuals were established in each supervision category, then staff could be deployed to balance out the system. This was a significant change from past policy, for now what counted was not the number of individuals who were being treated but a measure of the time required to supervise them and to deal with their specific needs.
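The difference between counting cases and measuring workload can be illustrated with a brief Python sketch. The hours assigned to each supervision category and the officer hours available per month are hypothetical figures chosen for the example; they are not values from the Wisconsin system or from any agency.

```python
import math

# Hypothetical supervision time per case per month, by contact category.
HOURS_PER_CASE = {"high": 3.0, "medium": 1.5, "low": 0.5}
# Hypothetical supervision hours one officer can deliver per month.
OFFICER_HOURS_PER_MONTH = 120.0


def workload_hours(counts):
    """Total monthly supervision hours implied by the category counts."""
    return sum(HOURS_PER_CASE[level] * n for level, n in counts.items())


def officers_needed(counts):
    """Officers required to cover the workload, rounded up."""
    return math.ceil(workload_hours(counts) / OFFICER_HOURS_PER_MONTH)


# Two hypothetical offices with the same raw caseload of 300 probationers.
office_a = {"high": 100, "medium": 100, "low": 100}
office_b = {"high": 20, "medium": 80, "low": 200}

for name, counts in (("Office A", office_a), ("Office B", office_b)):
    print(name, sum(counts.values()), "cases,",
          workload_hours(counts), "hours,",
          officers_needed(counts), "officers")
```

Under these assumed figures the two offices carry identical raw caseloads but very different workloads, which is the distinction the workload method was meant to capture for staff deployment.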
The last phase of the system, supervision modalities, was not an essential feature of the management information system. It has not been replicated in many of the agencies to which the system has been transported. This portion was composed of a 45 item structured interview, which yielded a classification scheme of theoretical supervision strategies for theoretical client types.
The Policy Problems That Developed
Classification initially was dominated by the clinical method. In this case an experienced practitioner made an educated judgment concerning the likelihood of the behavior occurring or not occurring, or the type of behavior that might occur. This personal expertise was then challenged by the actuarial method and the clear superiority of this method over most individual diagnosis for groups has been consistent. Gottfredson (1967) explains this as a difference between "wide band" and "high fidelity" approaches. The wide band approaches include procedures such as interviews, projective testing, written evaluations by clinical or custodial institution staff or a social history report. Wide band procedures, he notes, have been "found to be unsatisfactory, by any usual standards of reliability and validity, for prediction of specified behavior." Noting interviews, he states, "repeatedly, comparisons have shown statistical prediction devices to be more valid." Steadman in summarizing the literature "ranging from academic performance to job turnover" notes that "statistical prediction consistently has been more accurate than clinical predictions" even for violent behavior of the mentally ill. Monahan simply states, "In virtually all of the studies that have tried to compare clinicians and actuarial tables in predicting the same events, the tables have proved the more accurate." For violent behavior however, he concedes, the results have been mixed.
The actuarial method was a specific policy choice made in classification for risk. The actuarial method however seeks to draw distinctions between groups of offenders by utilizing selected items from individual offenders' backgrounds. The first major difficulty in the actuarial method is that it seeks to separate the population into two groups by the incorporation of any variable that will enlarge this difference. While it points out extremes, it does not enlarge our knowledge of the factors that impinge upon the offender to cause movement from one major group to the other. Variables are selected that highlight the differences, rather than demonstrate why those differences occur. For example, a major variable that has been used is the offender's prior record. While this item is very powerful in distinguishing between those groups of individuals who are likely or not likely to again engage in criminal conduct, it does not illuminate the factors that cause this to be so. The medical analogy to this item would be a chest x-ray to determine if tuberculosis is present. While the procedure does effectively determine the condition, it does not shed light on how the condition was brought about. In the area of community corrections, finely tuning prior record to determine those groups likely to recidivate yields that criminals will be criminals, not how they came to be that way. The focus of the actuarial method is thus to draw distinctions, not to expand knowledge.
The actuarial method, since it seeks power in distinguishing between groups, will use any variable that fits the purpose. Thought is not given to establishing a theme in the variable mix that would correspond to a theory of criminology. This open ended tendency then leaves the technique open to the introduction of bias in variable selection, because there is no stated framework for variable selection.
While the actuarial method could be used upon sub groups of offenders, it usually is not. The focus has been to determine those likely to recidivate among the whole offender population. Crime is thus viewed as a general disease that causes all manifestations of the negative behavior. If the focus were to explain criminal behavior then the emphasis would be on more limited theory concerning offender sub groups, with the hope of linking the individual theories at a later point. The actuarial method as now used assumes that those that specialize in robbery are similar to and have the same ontogeny as those that specialize in petit larceny. Obviously, this is not the case. This emphasis on evaluating the whole offender population has a value as a general screening instrument to be used for administrative purposes but limited value as a diagnostic aid for the individual officer or for society. The practical analogy of this premise, in the medical field, would be the thought of devising a simple cheap medical test that would diagnose everyone in a hospital.
Risk Assessment Instruments also restrict individual judgment. Some may say that the restriction of individual judgment is a virtue to be pursued and that may be true at times. At other times however, with false positives and negatives, mere humans must be able to insert proper corrections. When the score of the instrument becomes the word of god, the individual is unwilling to challenge the divine authority. This is the real danger with imperfect instruments that are deemed perfect.
Diagnostic instruments are used to determine maladies and problems within individuals. Classification instruments are used to place individuals in groups of similar expected performance. This is the essential difference. In one case we are very concerned that the test of the individual malady is extremely accurate and we normally seek other tests of confirmation. In classifying for a group decision, the impact is a group decision and the impact is blurred.
This simple question has a major impact upon the type of instrument that is developed. Many items, such as sex, could discriminate very well in the instruments but policy decisions were made to eliminate items that were beyond the control of offenders.
Hoffman notes this view as he describes the debate between "Just Deserts" and "Prediction". This view is a much broader view than the application of the instruments in community corrections but it relates to policy decisions in the selection of items to be contained within the instruments.
In this view the policy of using the instruments should be to fit the punishment to the seriousness and frequency of the offense pattern. The instruments thus guide the level of negative reinforcement that the system can bring to bear.
In this view the sole purpose of the instruments is to determine the likelihood of future negative behavior. The difficulty with this scheme is that future negative acts are not a clear cut item and some predictive items, such as criminal history items, overlap with the just deserts items.
Hoffman also notes that these two divergent views will coexist in the future, "just as concerns for 'crime control' and 'due process' coexist." These policies indicate two different solution paths, yet they coexist and develop together.
An ethical argument exists that only those items over which the offender can reasonably be expected to have control should be included. Items such as age, sex, race and national origin are beyond the offenders' control. Bohnstedt notes that these items might pose legal challenges. Specifically he notes
The accuracy of classification instruments also could be challenged. Unless an instrument has been validated--that is, unless it has been shown to measure what it purports to measure--its use could pose legal problems. Using such instruments to determine level of risk is analogous to using employment tests that have not been shown to be job-related. Care must also be taken to rule out selection criteria based on race, sex, age, or other variables that discriminate against individuals for reasons that, although related to recidivism, are beyond the control of the individual and not necessarily related causally to crime.
Hoffman goes even further by stating that employment stability items may also be beyond the offenders' control, especially in times of recession.
Race is an item that is beyond an offender's control but it is also an item that has taken on a larger meaning than the other items that fall into this category. The current belief is not only that race should be left out of the instrument but also that it should be ensured that race is not included by being correlated with any other items. The level to which this is pursued is a policy decision.
Schmidt and Witte take the view that we must ensure that the model developed does not include any hidden items. They argue that the best way to ensure that this does not happen is to account for the effect in the first stage of the model. They explain that building a model involves a two step process.
It is important to emphasize that the only way to be sure that an unacceptable piece of information (e.g., race) is not included in the second step of this process is to include it in the first step.
Otherwise, if race is highly correlated with another variable, it will not become known.
These are items such as where the offenders live, the school system they attended, etc. While these items are somewhat under the offenders' control they are usually not included for theoretical reasons.
The risk instruments by their very focus on individual demographic variables assume that the reason there is criminal behavior is that something is wrong with the individual and that this difference can be determined by the instrument. They do not assume that there is anything wrong within the area that the person is living or his peers; only the person is suspect. Any reference to societal issues seems to be avoided.
Certainly we know that the community corrections populations are more concentrated in certain census tracts. To say however that you are more in danger of committing a criminal offense if you live in such a census tract would have policy implications beyond the instruments and the unit of government which they serve.
Increased Or Reduced Punishment?
The major policy choice is to select individuals for greater or lesser punishment. In our system of government, the popular view has always been that individuals are sentenced on the basis of the antisocial behavior they have committed. For Risk Assessment Instruments to have a wider application, to include sentencing, then this popular view would also have to include the belief that we can somehow predict human behavior. This is precisely the course that Greenwood attempted to take with Selective Incapacitation. Only after citizens believe in the prediction of human behavior, could citizens then be punished on the basis of predictions of future antisocial behavior. I personally doubt that such a vast transformation could ever take place, on that direct a level. On other less direct levels, such a transformation can and probably will take place.
Even the move to determinate sentencing is a move to Risk Assessment, of sorts. The determinate sentencing grid is usually composed of the offense and some form of criminal history. As noted previously, criminal history is usually regarded as a primary variable in the area of criminal prediction. The proponents of a just deserts model can say that someone is being sentenced on the basis of past and present behavior, but for what purpose? Were they not punished for the previous crimes? Or is the tone: we punished you before and you did not listen, therefore we will punish you in the following increased manner and maybe now you'll listen? The increased manner implies that it will somehow now change future behavior.
For those already punished and in the system, the stated use of Risk Assessment does not imply that we are punishing them on the basis of future predicted antisocial behavior. Their potential has already been identified. The premise is now they are selected because they are better risks. We are now withdrawing some pain because they do not need it. It has thus been transformed to a lofty ideal. This is a subtle, but important difference in the stated focus. We are helping the good risks by identifying the bad risks; or to state in another way, we are using the instrument to identify the people whom we really should be placing our emphasis on. The good risks can therefore be diverted from custody, released from custody, or placed in less structured programs. The reduction in bodies can then allow the system to continue to operate. The prison system benefits by allowing more offenders to be supervised on larger caseloads with a differential caseload capacity. The probation system benefits by gaining differential caseloads, a program selection device, and the aura of doing something scientific with probationers. Risk Assessment exists in probation because of its benefits not its empiricism.
Risk Assessment could force mission statements and the acknowledgment of some theoretical base for policy decisions. To accomplish the mission, some theory should be operating. This would result in some form of policy review that would improve the system's functioning in relation to mission accomplishment.
When the instruments are properly used, the risk definition will correspond to the system mission. Mission accomplishment will then have a plan of action and risk assessment will allow the resources of the department to be committed in some logical format.
The criterion of failure determines the amount of failure that must be accounted for. The amount of failure determines: the base rate, the possible level of variance explained and the focus of the instrument. It thus is a very important policy decision.
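The dependence of the attainable variance explained on the base rate can be illustrated with a short calculation. The sketch below is not taken from any of the instruments reviewed. It assumes a normally distributed risk score and an outcome that is a perfect cut of that score, which gives the highest correlation such a score could reach at a given failure rate. The function names are hypothetical, and the four base rates are the theoretical failure rates charted earlier (50%, 33%, 25% and 9%).

```python
from statistics import NormalDist


def max_point_biserial(base_rate: float) -> float:
    """Ceiling on r between a normal risk score and a fail/succeed outcome.

    Assumption (for illustration only): failure is a perfect cut of the
    score at the base-rate quantile. Real instruments fall below this.
    """
    if not 0.0 < base_rate < 1.0:
        raise ValueError("base_rate must be strictly between 0 and 1")
    std_normal = NormalDist()
    # Score above which the top `base_rate` share of cases would fail.
    cut = std_normal.inv_cdf(1.0 - base_rate)
    # Covariance of a standard normal score with the cut indicator is pdf(cut);
    # dividing by the indicator's standard deviation gives the correlation.
    return std_normal.pdf(cut) / (base_rate * (1.0 - base_rate)) ** 0.5


def ceiling_table(base_rates):
    """Return (base rate, maximum r, maximum variance explained) rows."""
    rows = []
    for p in base_rates:
        r = max_point_biserial(p)
        rows.append((p, r, r ** 2))
    return rows


if __name__ == "__main__":
    for p, r, r2 in ceiling_table([0.50, 0.33, 0.25, 0.09]):
        print(f"base rate {p:.2f}: max r = {r:.3f}, max variance explained = {r2:.3f}")
```

Under these assumptions the ceiling falls as the criterion becomes rarer, so the choice of a low frequency criterion lowers the best variance explained that any instrument could report.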
The essential question is the intent of the instrument. That choice is a policy decision. The problem that occurs is the choice between a wide band and a specific focus for the instrument. This choice affects the focus of the instrument in relation to the mission of the department. We can illustrate the difference in a medical analogy. A tuberculosis test might be widely used but the intent of the test would be to determine if the individual is infected with the bacillus. Certain known procedures would then be implemented to deal with the condition. Blood pressure tests are also widely used but the focus is different. The focus is much more general. Blood pressure readings indicate a mere deviation from the norm. The reading could be indicative of high, normal or low blood pressure. In each category different procedures would have to be used, depending on the cause of the malady. The reading thus would classify the individual into a group that has certain characteristics but it does not in general indicate why the situation occurred or the suggested treatment path. The choice of the focus of the instrument is a policy one related to its professional use. Will it just identify and rank individuals that are deviant regarding criminality, or will it attempt to identify why the criminality is present and recommend courses of action to accomplish the mission?
HIGH FREQUENCY
    Any rules violation
    Any rules violation leading to a hearing
MODERATE FREQUENCY
    Any arrest
    Any conviction
    Any felony arrest
    Any felony conviction
LOW FREQUENCY
    Return to prison
    Any arrest for a violent crime
    Any conviction for a violent crime
5 Criterion by Frequency of Occurrence
Source: Clear 1988: 18.
The criteria, according to Clear, can be roughly grouped into three categories as demonstrated in the accompanying chart. When the criteria increase in frequency, the focus of the instrument becomes larger. It then ceases to be diagnostic and only administrative or case assignment ends are satisfied. The choice between the highest variance explained and a diagnostic use is a policy decision related to the instrument's intent. It is not a choice that should be made by the instrument developers. Clear aptly states this when he speaks of identifying the truly violent offenders in the system.
...the decision to count all violators as failures will help to produce excellent discrimination but will not necessarily orient the correctional practice toward only the most violent offenders. Whether this is a reasonable approach is a matter of policy, not technical research.
Policy and theory also affect the choice of variables that are selected for use in the instruments. The Rand Corporation surveyed the major instruments and found certain patterns in the variables that were selected for use in sentencing, parole and probation guidelines. While this survey will be more fully explored in the chapter entitled "Current Status of Risk Assessment," a few caveats should be explored now.
The Rand survey noted that only two items were used in more than 50% of the guidelines of all three types. These were criminal history items that related more to the "Just Deserts model." Specifically, they were the number of prior adult and juvenile convictions and whether the offender was on parole or probation at the time of arrest.
Policy decisions also affect the dispersal of the items along functional grounds. Although parole and probation both deal with community corrections, there is a noticeable difference between items that are selected and the frequencies of their use. This would suggest that item selection is also dependent upon a sub theory or policy that differs among the community correction components.
A major policy implication of Risk Assessment that must be faced, is the area of false positives and negatives. This must occur even if some agreement cannot be reached on who is the risk, what is the risk, and how shall it be dealt with. If adequate instruments could be developed, we must still accept the concept, that they will not be perfect. Some individuals will be false positives, in that they are labeled a risk when they are not. Some individuals will be false negatives, in that they will be labeled not a risk when they are. The rate of false positives and negatives is a policy decision that must be made and interpreted in light of the agency's policies and not in the light of how good the researchers say the numbers are. What the optimum balance will be, has still not been resolved nor adequately debated.
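The trade between false positives and false negatives can be made concrete with a small tabulation. The sketch below is illustrative only: the scores, outcomes and cut points are invented for the example and do not come from any instrument or data set discussed here, and the function name is hypothetical. It shows that moving the cut score does not remove error but shifts it from one kind to the other, which is why the choice of cut score is a policy decision.

```python
def classify_errors(cases, cutoff):
    """Tabulate outcomes when scores at or above `cutoff` are labeled high risk.

    cases: iterable of (risk_score, failed) pairs, where failed is True/False.
    """
    counts = {"cutoff": cutoff, "true_pos": 0, "false_pos": 0,
              "true_neg": 0, "false_neg": 0}
    for score, failed in cases:
        labeled_risk = score >= cutoff
        if labeled_risk and failed:
            counts["true_pos"] += 1
        elif labeled_risk and not failed:
            counts["false_pos"] += 1   # labeled a risk when they are not
        elif failed:
            counts["false_neg"] += 1   # labeled not a risk when they are
        else:
            counts["true_neg"] += 1
    return counts


# Hypothetical sample, invented for illustration: (score, failed on supervision)
SAMPLE = [(2, False), (3, False), (4, False), (5, True), (6, False),
          (7, False), (8, True), (9, False), (10, True), (12, True)]

if __name__ == "__main__":
    for cutoff in (5, 8, 11):
        print(classify_errors(SAMPLE, cutoff))
```

In this invented sample the low cut score labels three non-failures as risks and misses no failures, while the high cut score labels no non-failures as risks and misses three failures. Neither is correct in a technical sense; the agency must decide which error it is more willing to accept.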
Another problem appears to be the use of legal history information for prediction. While most studies concede that it is an important variable, this writer believes it is not being properly used. How the legal history items are structured is a policy question. They should not for the sake of expediency, just be entered into a computer data base. The current systems merely count offenses or they record a limited time frame (time since last offense, etc.) and thus a great deal of information is lost.
When a District Attorney or Judge examines a criminal history, they are seeing logical relationships that are not being accounted for in current crude research methods. The slope of the offense pattern, the timing of offenses relative to each other, and characteristic patterns of offenses, etc., yield a good deal of information, yet they are ignored in most research schemes. The problem in this area appears to be the devising of an artificial intelligence algorithm to mimic the decision process, rather than just a method to structure it for an already conceived research tool such as SPSS. The choice to improve the instrument through a more sophisticated use of this area is a policy decision.
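What is lost by a simple count can be shown with a small example. The sketch below is not the artificial intelligence algorithm called for above, and it is not drawn from any existing instrument. It only derives two of the relationships mentioned, the timing between offenses and the slope of the offense pattern, from a dated history. The severity scale, the function name and the two sample histories are hypothetical.

```python
from datetime import date


def history_features(offenses):
    """Summarize an offense history given as (date, severity) pairs.

    Severity is a hypothetical ordinal score (1 = minor ... 5 = violent felony).
    Returns the offense count, the gaps in days between successive offenses,
    and the least-squares slope of severity per year (positive = escalating).
    """
    ordered = sorted(offenses)
    dates = [d for d, _ in ordered]
    severities = [s for _, s in ordered]
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    years = [(d - dates[0]).days / 365.25 for d in dates]

    slope = None
    if len(ordered) >= 2:
        mean_t = sum(years) / len(years)
        mean_s = sum(severities) / len(severities)
        spread = sum((t - mean_t) ** 2 for t in years)
        if spread > 0:
            slope = sum((t - mean_t) * (s - mean_s)
                        for t, s in zip(years, severities)) / spread
    return {"count": len(ordered), "gaps_days": gaps,
            "severity_slope_per_year": slope}


# Two invented histories with the same number of offenses.
ESCALATING = [(date(1985, 1, 10), 1), (date(1987, 6, 1), 2), (date(1988, 3, 15), 4)]
DECLINING = [(date(1985, 1, 10), 4), (date(1987, 6, 1), 2), (date(1988, 3, 15), 1)]

if __name__ == "__main__":
    print(history_features(ESCALATING))
    print(history_features(DECLINING))
```

An instrument that merely counts prior offenses would score these two invented histories identically, although one is escalating and the other is declining.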
Peace Officer vs Social Worker
The current instruments, as crude and unreliable as they are, also suffer from a large subjectivity problem. This writer will concede that since they are formed by the dominant culture for agencies of social control, they have a dominant culture bias, and probably always will. The subjectivity arises from the way the instruments mirror a subcultural focus along a social worker to peace officer continuum.
The idea that the probation field has long been dominated and torn by the two major groups, of social workers and law enforcement officers, has always been accepted. Many different terms have been used to describe this situation. Morgan and Lindner explain the situation as, "the role conflict inherent in the probation officers duties." Using the terms sworn peace officer and treatment agent, they state:
As a sworn peace officer ... the probation officer is most likely to emphasize surveillance and control functions, holding the protection of the community as paramount to rehabilitative responsibilities....In the treatment role, the probation officer is most likely to function as a helper in the rehabilitative process, aiding the client in individual growth toward a productive, crime-free life.
Harris used the phrase "torn between factions that view themselves as part of the helping professions and factions more oriented toward law enforcement."
The essential point however is that there are at least two major camps within the probation field. These are the followers of the social work and the medical model, and those that believe in a peace officer role. How these two factions perceive themselves has a great deal to do with how problems and solutions are structured.
In the medical/social work model it matters not why the client is brought into the treatment setting. Often clients are brought in for dizziness only to find that they have high blood pressure or that their fear of closed spaces stems from a childhood trauma or repression. The mechanism by which they enter treatment is only a symptom and has little significance in the goal of treating the sick or disadvantaged client/patient.
In the peace officer model no sickness is assumed and disadvantage is ignored. The emphasis is placed upon the reasons that the offender entered the system and energies are directed at ensuring that the offensive behavior is not repeated. The two models thus differ greatly in their perspective.
Deviant behavior is viewed as a rational act of a fully aware individual, by the criminal justice camp. The offender is thus punished sufficiently to ensure compliance with socially approved norms or monitored in such a fashion to inspire fear and lawful, socially approved behavior. Deviance is viewed as a sign of sickness or social disadvantage, by the social work/medical model camp. The solution to socially acceptable behavior is thus to treat the sick mind and restore proper functioning or to improve the individual's social position, to a position where criminal behavior will no longer be necessary.
Given the above models it is assumed that the social work/medical model camp would thus not find the offense that brought the probationer into the probation system as relevant but only some outpouring of his sickness. The peace officer camp would consider the current offense as a sign of relative severity. Their only goal would be the elimination of the socially condemned behavior. The peace officer camp would thus certainly draw distinctions between the treatment of felons and misdemeanants. The client centered group would not. The social work/medical group would thus classify threatening individuals according to their sickness quotient. The criminal justice/justice group would classify offenders according to their current and prior history of offensive behavior.
The two groups would also approach the area of the age of the probationer differently. Age is customarily of little importance in disease and mental dysfunction, in the medical/social work/psychiatric view. Age and crime are highly related, in the peace officer view. It is thus assumed that one group would minimize this variable and one would exalt it.
The problem is that the followers of each camp are not in key positions by any logical structure. In 1979 an interesting article appeared regarding weapon use by probation and parole officers on a national scale. It is of interest because the issue of weapon use does a remarkably good job on the separation of the subcultures. The most interesting result of the national questionnaire was
that the real need, or lack of need, for carrying a weapon has almost nothing to do with whether the statute defines agents as peace officers and has nothing to do with whether the agents work in the urban ghettos or rural countryside.
The author concluded that the primary factor was, "the overall philosophy of the role and function of the probation or parole officer which has developed in the particular agency." Such signs of operating philosophy may be useful in determining the underlying philosophy of the departments and the underlying philosophy of the risk instrument. No such correlations have been attempted. This aspect of subjectivity in the development of the risk instrument is thus open for further clarification.
Policy decisions Are Made By Others
The danger and policy implication of Risk Assessment instruments is that they put control of the many into the hands of the few. Frequently this occurs with little oversight and almost no public debate. The New York State instrument, for example, was developed by one Senior Probation Program Analyst and three summer interns. They were also probably under intense pressure to get the job done yesterday, so the new program could be implemented today. The subjectivity of the instrument has affected all departments in the state, million dollar programs, a multitude of administrators, a few thousand probation officers, thousands of probationers, and an untold number of citizens. Yet the belief was that the development of risk assessment is science and therefore somehow removed from the basic policy considerations. As noted however, in the review of the literature, risk assessment is an art and not a science. All the beliefs, theories and whims apparent in policy decisions should come into play when the instruments are designed. To leave the development of the instruments to researchers is a serious policy mistake.
The danger of implementing risk assessment instruments more widely and to a greater depth concerns their subjective nature. This subjectivity includes not only who is the risk but what is the risk and how the risk is to be dealt with. As Shah (1978) so aptly put it, why are some groups deemed more dangerous than others? Clearly, there are no unbiased standards in that area. Defining who is a risk, or even what is a risk, is open to many different interpretations. In a system of criminal justice where no accepted theory of crime generation and crime control prevails, the greater acceptance of risk assessment would presuppose a solution to the risk; a course of action where one does not exist. A risk solution based upon popular whim or a hidden theory based upon personal beliefs would be more destructive than the chaos that prevailed in the system without classification.
The Major Policy Decision Of Today
When the community corrections system was receiving ever increasing numbers of cases and was not allowed to expand in proportion, classification instruments came into being. Bohnstedt puts it succinctly:
The overriding motive behind the development and use of instruments in probation/parole field supervision is the desire to optimize the allocation of resources. The large caseloads typical of most departments preclude intensive supervision of all probationers.
The initial purpose was thus to manage the agencies and not to identify specific community risks. With more effective management more cases were also absorbed by the systems and in each individual officer's caseload.
Regardless of research quality, studies generally report a shift toward assigning more individuals to lower levels of supervision since the introduction of instruments...Of the 23 agencies in the survey sample, over half report that caseloads have shifted significantly toward lower levels of supervision.
This had a significant impact upon the individual agencies being able to absorb the increased flow into their systems. "Instrument use tends to divert more cases to lower levels of supervision, a trend that obviously could reduce costs."
Having said all the above, it also appears that risk assessment on less direct levels is here to stay and it will continue to exist in ever expanding degrees. The policies that drive risk assessment are not the forces of truth, justice and fair play. The policies that drive the further implementations of risk assessment are the survival instincts of the system.
This country already has the highest rates of incarceration in the free world, and the rates are still increasing. A numbing fear of prison riot haunts the state legislative halls and it demands a greater use of "Alternatives to Incarceration." The question is however, how are the individuals to be selected for the alternative? Certainly not in these days, when the system is being criticized for being prejudiced, would individual judgment suffice. What is needed is an objective, empirical, super scientific, risk assessment instrument.
The review of the literature concerning risk assessment yields many facets, on many different levels. The general literature on risk assessment shows a growing consensus that the technique is not free of values, culture, or politics. Even before the technique is applied, values, culture and politics have come into play to select the topic where it is to be applied.
In regard to the specifics of predicting human antisocial behavior, in general, the review of the literature has shown that the state of the art is not adequate to the task. In regard to the criminal justice system, the literature has also shown that the state of the art is not adequate to the task.
The biggest problem identified by the literature is the lack of focus, for any variable that yields a correlation will be seized upon for an instrument and thus all variables in the data base are tried. There is no accepted theory, of why crime occurs, applied to the instruments. If there is no theoretical idea of why crime occurs, there is no specific idea on how to reduce it and certainly no indications of the important variables to be used to monitor the potential of it.
As was noted earlier, there is an advantage to recognizing that risk assessment in community corrections is not science. There is an advantage to recognizing that the policy decisions that brought in the instruments were not to fine tune the system but to solve problems that occurred. As Clark notes:
Societal relevant risk is not uncertainty of outcome, or violence of event, or toxicity of substance, or anything of the sort. Rather, it is a perceived inability to cope satisfactorily with the world around us.
He further notes that risk management is not specifically a science at all.
Risk management lies in the realm of trans-science, of ill-structured problems, of messes. In analyzing risk messes, the central need is to evaluate, order, and structure inevitably incomplete and conflicting knowledge so that the management acts can be chosen with the best possible understanding of current knowledge, its limitations, and its implications. This requires an undertaking in policy analysis, rather than science.
It is in this acknowledgment that the policy related to the instruments can be improved.
Discussions can occur regarding the meaning of the instruments, if the reasons for the change in policy are made known. Decision criteria can be examined and better understood when they are brought out into the open. It can then be discussed, whether the uses of the instruments match the mission of the agency.
The "Needs Assessment Instruments" are a case in point. The original motive to use these devices was to account for agent time spent, not to predict failure. There is however a hazy belief structure and policy statement issued by their use: that these factors are important in reducing recidivism. This belief structure has been translated into many agencies' policies by the adoption of a needs instrument which has never been validated. Agencies have made a policy decision to commit scarce community corrections resources to a belief structure that is untested, unproven, and not directly empirically related to offender failure in the community.
As Petersilia and Turner noted, probation officers in Wisconsin were oriented towards rehabilitation and they were therefore uncomfortable with a device that forced very frequent contact levels with probationers likely to fail and low contact levels with probationers who appeared to be more hopeful. A needs instrument was thus incorporated into the classification, to allow higher contact levels with probationers who, while not posing a threat, did require more frequent contact for the purposes of social casework. It should be noted that this needs instrument was never empirically validated in Wisconsin nor in other states, such as New York, which adopted it. Glasser simply notes that
The Wisconsin system also uses an initial needs' assessment form and, every six months, a needs reassessment form, which are derived not from statistics on past experience but from a consensus of agents on the relative importance of various types of assistance that their clients require.
The authors finally note that most probation departments, "now use a combination of recidivism-prediction and needs-assessment scores to assign levels of community supervision."
How the use of the needs assessment instruments is resolved, in terms of belief structure and the allocation of scarce public funds, is a major policy decision that must be made today.
Many of the instruments are now one to two decades old. It is clearly time for revision. Major policy decisions are looming on the immediate horizon. Do the agencies keep using instruments that should have been revalidated long ago? Do the agencies keep using instruments that were never validated? Do they scrap them? Do they make a commitment to the next generation of instruments? These choices are policy decisions and not the decisions of science.
Immediate Future Policy Questions
The primary question is whether the next generation of instruments will make the theoretical leap by acknowledging a theory of crime generation. Clearly policy drives the choice of the items of interest but policy can be driven by acceptable social values and philosophies as well as theory. With the huge expenditure of public funds that is occurring in the community corrections field and the cataclysmic amount of citizen suffering that is being experienced by offender community supervision failure, certainly the citizens deserve more assurances that these gigantic public agencies are operating on belief structures that are more than popular whims.
Will they be forced to have a minimum standard of variance explained? Clearly if one were operating on an openly stated theoretical belief one could acknowledge that the theoretical fit was not perfect and some modifications could be made. If the instrument were focused to a narrow offender band then one could point to the low base rate and explain that the variance explained will never be that high. But to say that the instruments are not theoretical and are the best that science can give is a disservice to the public and the agencies involved. This is especially true when the majority of the instruments have included the widest frequency of negative offender behavior in an attempt to boost the base rate as high as possible. Even with all the modifications a low level of prediction has resulted. This has led some agencies to determine that their official use is unacceptable in these circumstances. Only a change in policy can rectify this situation.
Who will fund the instruments and what conditions will be attached to the funding? This is a major policy decision that can only be addressed on the national level. Since the National Institute of Corrections provided the incentives to clone these instruments without validation or with minimum validation, possibly it should shoulder the responsibility of rectifying the situation now. A suggested strategy would be to tie all funding to cooperation between the public agency developing the instrument and an academic institution willing to oversee the public debate of the issues involved and to ensure the correctness of the variance explained. Only then could the citizens be sure of the actual operative theory and the relative correctness of the results.
THE CURRENT STATUS OF RISK ASSESSMENT
As noted in previous chapters, risk assessment instruments did not develop with the sole goal of identifying high risk offenders. The systems were not primarily concerned with high risk offender identification or processing. The risk instruments developed in reaction to judicial, legislative and economic pressures. They developed because they solved a small portion of a much larger administrative problem. The focus was not to design the best possible instruments but to attach the risk assessment portion to a larger administrative device that solved a broader problem. The instruments were thus not an end in themselves but a means to an end.
In the case of probation, the systems had lost oversight control over the individual officers' caseloads and they had no way to monitor production or quality control or to effectively allocate resources. The risk instruments developed for probation were an essential piece of a management information system and a way of changing the very way work was defined in the agency. No longer was it the number of individuals treated but the amount of expected time that corresponded to various categories of supervision.
For parole the situation was entirely different. The assessment of risk was essential to the development of guidelines which spelled out the necessary or salient factors needed to make parole. The instruments were a part of a much larger administrative package that was necessary.
If we understand the forces that control situations then we can explain what happens and why it happens. We as a species tend to construct mental models of the processes and we judge expected outcomes by these constructs. They might be right or wrong or even incomplete but we have a basic desire to know why things happen. The model becomes a way to view the world, a mental theoretical construct of reality and the forces of our environment. The essential question is however, how well does our model of the world act like the world?
In the previous chapters the problem of low base rates was discussed, along with the difficulty of obtaining high correlations with low base rates. It was also shown that most of the instruments have taken a policy position of including a high frequency, wide band approach to raise the base rate and thereby the amount of variance explained. Can we then assume that the variance explained is sufficiently high to validate the use of such instruments?
In 1985 the Rand Corporation issued a report entitled Granting Felons Probation, Public Risks and Alternatives. While it created quite an uproar over its pessimistic evaluations of probation failure, the finer points of the analysis were somehow missed by the field as they scrambled to defend their honor over the report. In the study various models were tried. The one that relates to this thesis is the prediction of probation failure.
The criterion was defined in the moderate frequency range previously noted by Clear. This includes arrests and convictions but not the high frequency group of any rules violations or the low frequency group of a return to prison or any conviction for a violent crime. Using a very detailed data set and a large data base, the level of variance explained was quite low. We can thus view the upper limits of the moderate frequency range at this level.
The picture is a rather pessimistic one. For all offenders combined, only about 7 to 16 percent of the variance for the three outcomes can be accounted for by significant variables...
An even more pessimistic pronouncement was made, in reaction to the overall correlations, in the Rand report entitled Guideline Based Justice.
These results challenge the assumption that a statistical model based on factors associated with recidivism will necessarily accurately predict recidivism.....While these results are disappointing, they are not surprising. A growing body of research indicates that even sophisticated statistical methods and very detailed information have not significantly improved researchers' ability to predict future criminal behavior...As a result, many researchers have concluded that further research along these lines does not seem worthwhile to press.
In the above pronouncement the overall R-square, with all variables, was 0.08. In Gottfredson's classic quote he explains, "When normative prediction studies are considered, the proportion of outcome variance explained rarely exceeds .15-.20, it is often lower..."
In probation risk instruments a higher frequency criterion is used. Presumably this is to increase the base rate and thus increase the level of variance explained. An internal document concerning the development of a model for predicting probation outcome in Michigan explains this fact also.
It is useful at this juncture to note that prediction studies rarely ever report a multiple-r exceeding .40. A value of multiple-r in the range of .25 to .35 should be considered good for prediction purposes. The lower the base rate in the population, and thus in the sample, the lower the multiple-r that is theoretically attainable. The problem of low base rate and the low multiple-r it produces are well documented in prior research. Typically, the studies report a multiple r of roughly .25 with a considerable number reporting less than this value.
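The figures quoted in this chapter are reported on two different scales, the multiple-r and the proportion of variance explained (R-square), and the second is simply the square of the first. The short sketch below places them on a common scale. The function names are hypothetical, and the values are the ones quoted above: the Michigan document's multiple-r figures of .40, .35 and .25, and the Rand R-square of 0.08.

```python
import math


def variance_explained(multiple_r: float) -> float:
    """Proportion of outcome variance explained by a model with this multiple-r."""
    if not -1.0 <= multiple_r <= 1.0:
        raise ValueError("multiple_r must lie between -1 and 1")
    return multiple_r ** 2


def multiple_r_from_variance(r_square: float) -> float:
    """Multiple-r implied by a reported proportion of variance explained."""
    if not 0.0 <= r_square <= 1.0:
        raise ValueError("r_square must lie between 0 and 1")
    return math.sqrt(r_square)


if __name__ == "__main__":
    # Multiple-r values quoted from the Michigan document.
    for label, r in [("rarely exceeded", 0.40),
                     ("good, upper end", 0.35),
                     ("good, lower end / typical", 0.25)]:
        print(f"multiple-r {r:.2f} ({label}): variance explained = "
              f"{variance_explained(r):.4f}")
    # R-square quoted from the Rand report.
    print(f"R-square 0.08: multiple-r = {multiple_r_from_variance(0.08):.3f}")
```

On this conversion a multiple-r of .40 corresponds to 16 percent of the variance, and the typical value of .25 corresponds to a little over 6 percent, which places the Michigan figures in the same low range as those reported by Rand and Gottfredson.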
Bohnstedt has noted that low accuracy has been a thorny problem with some agencies:
The predictive accuracy of most instruments used in level-of-supervision decisions is not known, but some jurisdictions have concluded that their predictive power is too low to justify their use. [In the accompanying footnote he notes]...The State of California does not use classification instruments to determine level-of-supervision because of low predictive accuracy; and the Los Angeles County Probation Department has abandoned a fairly complex screening and case supervision system for the same reason.
It can thus be stated that the overall variance explained has been low, so low that serious doubts exist as to whether the instruments are grasping at the proper indicators of failure.
Most of the instruments in use are not original. Instruments that were developed locally apparently drew from the same items that were used previously. The analogy would be of cloning of a few originals and mutational replication of a few originals. We thus have exact copies and copies with variations but they are all based on the same models or starting points.
The National Institute of Corrections sponsored a year-long national survey of screening and classification instruments used in Criminal Justice. It was conducted by the American Justice Institute along with the National Council on Crime and Delinquency for the National Institute of Corrections under a grant (No. AT-2). The goal was to find distinct instruments in use. Twenty-five distinct instruments were found. In the 23 probation/parole agencies surveyed, it was determined that
...44% of the instruments used by agencies in the survey sample were borrowed from other jurisdictions....About 26% of the agencies surveyed had developed their own instruments through a local research program. In some cases, these instruments are variations of instruments developed elsewhere...About 30% of the agencies surveyed had developed their own instruments, but had not based them on local research.
We thus see that almost half were cloned or copied, almost one third had developed their own instruments intuitively without research and a little over a fourth had developed an instrument with local research. The assumption would then be that at least one fourth have some valid merit, except for the notation that, "Few agencies using instruments have undertaken research to validate their use, and much of the research that has been done is methodologically unsound."
We thus see that the current state of the art is not that sound. The only possible hope would be that the instruments that were allegedly well done could be transported to other areas without modification and still function adequately. This question will be addressed next.
Three university based researchers reacted to the Wisconsin Risk Instrument receiving national attention and promotion through the adoption of the instrument as a model system by the National Institute of Corrections. They then attempted to validate some of the assumptions of the model project.
The researchers noted that the Wisconsin System attracted national attention because it incorporated a number of elements into the model system. These included a risk/needs assessment, a management information system, programmed supervision classification, and workload accounting of caseloads. This system was then pushed by NIC (National Institute of Corrections) as a model system nationwide. The NIC however specified that those agencies involved in the project adopt an existing risk instrument. The development of an instrument was not funded by NIC, and most agencies opted for the Wisconsin Model with minor modifications. The authors note this policy decision was based on two premises.
1. Most states cannot afford the expense of
developing their own instrument, and
2. Existing instruments predict virtually
equally well on various populations...
The authors then noted that this belief structure, as translated into a national policy decision, was dangerous and was a departure from the existing thought on the matter. They specifically noted:
This is an important departure from traditional approaches to risk screening. Much existing research on the state-of-the-art of prediction devices suggest that they are rather crude and population-limited. Coefficients are found to be unstable and statistical models are weak in explanatory power.
The NIC project did stipulate that the adopted instruments should be validated early in their use. The authors noted, however, that this did not occur, because the agencies had been told that they could not develop their own instruments and that the instruments were transferable among different populations.
The authors then cautioned that the Wisconsin Instruments have been cloned throughout the U.S. by the NIC without extensive validation studies on populations outside the state of Wisconsin. In this study it was found that many of the variables in the Wisconsin Instrument did not predict in NYC and the overall model was weak. Weak in this sense meant an R² of .085 for the original instrument with weights. Thus 91.5% of the variance was unexplained.
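The arithmetic behind this statement can be sketched briefly. The snippet below is only an illustration: the variable names are hypothetical, and the only figure taken from the study is the reported R² of .085.

```python
import math

# R-squared reported for the original weighted instrument on the NYC sample.
r_squared = 0.085

# Share of the outcome variance that the model leaves unexplained.
unexplained = 1.0 - r_squared

# Magnitude of the multiple correlation implied by that R-squared.
multiple_r = math.sqrt(r_squared)

print(f"Unexplained variance: {unexplained:.1%}")
print(f"Implied multiple correlation: {multiple_r:.2f}")
```

Read this way, an R² of .085 corresponds to a multiple correlation of a little under .30 between the instrument and the outcome.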
The authors noted two possible reasons for the extreme differences in variance explained between Wisconsin and New York. The first suggested that the offender populations were possibly quite different, "maybe New York is not Milwaukee." The other stated reason was
maybe statistical prediction methods are so poorly developed - so unstable as models - that transfer of models is questionable just on the grounds of limited technology alone.
The authors then raise serious concerns about the state of
the art of risk assessment in community corrections. They thus
cautioned:
Our analysis raises serious questions about the state-of-the-art of statistical risk prediction. First, we have shown that models developed on one population do not necessarily transfer to other populations. Second, we have confirmed other research showing that the models themselves can be fairly unstable, and the variables eventually designed into a screening device may result from statistical factors in the sample.
New York City and Wisconsin are quite different. The crime rates and by implication the types of probationers received for supervision in Wisconsin and New York City are quite different. Currently the percentage of felony probationers in New York City is approximately 85% with only a tiny fraction resulting from felony Driving While Intoxicated offenses. The criterion of failure was also very broad in the design of the Wisconsin Instrument. Failure was defined in the moderate frequency range, in this study.
The high number of variables that were unrelated to probation outcome was also possibly related to the large number of individuals who possessed those characteristics. If, say, sixty percent of the probationers possess those characteristics and the failure rate is substantially below that amount, then the items cease to be predictive, for more of the individuals who possess those characteristics succeed than fail.
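The base rate reasoning above can be illustrated with a small calculation. This is a sketch under stated assumptions: the 60 percent prevalence comes from the example in the text, while the 25 percent failure rate and the function name are hypothetical choices made only for illustration.

```python
def max_failure_share(prevalence: float, failure_rate: float) -> float:
    """Upper bound on the share of item-holders who can fail.

    Even if every failure came from the group holding the item, the
    failures can make up at most failure_rate / prevalence of that group.
    """
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be in [0, 1]")
    return min(1.0, failure_rate / prevalence)

# 60% of probationers hold the item (from the text); 25% fail (assumed).
bound = max_failure_share(prevalence=0.60, failure_rate=0.25)
print(f"At most {bound:.1%} of those holding the item can fail")
```

Under these assumed figures, fewer than half of the probationers who hold the item can fail, even in the extreme case where every failure comes from that group.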
Employment/Educational Status
Drug/Alcohol Involvement
Family/Social Factors
Number/Type of Prior Convictions
6 The Four Variables Used More Than Fifty Percent of the Time in Risk Instruments
Source: Classification Instruments For Criminal Justice Decisions: 10.
Bohnstedt's survey of the twenty-one instruments mentioned earlier also revealed, in a chart, that four items were used in the instruments more than fifty percent of the time. They are listed in the accompanying chart. As far as frequency of use is concerned, Employment/Educational Status was used in 19 instruments. Of the four, the item used least was the only criminal history item in the group: the Number or Type of Prior Convictions was used in only 12 instruments.
Number of Prior Convictions
Auto theft or not
Lives with Spouse/Children
Previous Parole or Not
7 The Top Four Parole Prediction Variables
Source: LEAA Classification for Parole Decision Policy: 311-312.
The top four predictive items for parole prediction, as listed in 1978, are included in the accompanying chart. These four items explained "practically all the variation in parole outcome over 2,500 prisoners."
In the Rand report entitled Guideline Based Justice a listing is
made of the factors included in most sentencing, community
supervision and parole release instruments.
It is of interest to note
that items involving the nature of the current crime are heavily used in
sentencing and parole release but not used in community
supervision. Social factor items are heavily used in community
supervision instruments, only moderately used in parole release
instruments and not used in sentencing instruments. Obviously there
is some disagreement concerning what should be measured at
various points in the system and the question immediately arises, are
the perspectives of the three views of the offender that different? Are
the policy points of view that different?
Criminal Record Modeling Problems
Criminal history items have always been shown to be excellent predictors. A pervasive problem, however, appears to be the modeling of the criminal record items. This has occurred regardless of the type of instrument, whether community supervision, parole release or sentencing. Essentially, the way this important indicator is modeled does not approach the way the humans in the criminal justice system view it.
Criminal record is usually modeled by counting offenses, or counting felony offenses or jail terms served, etc. A numerical count may be the easiest way a computer can handle the information but a great deal of information that humans can see in the record is lost in a simple counting method.
Humans can see the slope of the offense pattern and determine if the severities of the offenses are decreasing or increasing. Humans can see the timing of the offense pattern. They can determine if the pattern is occasional or concentrated at the current point or if the majority of the offenses occurred previously. Humans can view complex density measures. This information is not even collected in the current state of the art instruments.
Bradshaw, speaking of one of his more successful models, noted:
The two key terms in this equation are 'number of years of street time since 14 years of age' (inversely related with general felony arrest) and 'number of prior arrests'. This combination makes considerable intuitive sense, since together these measures constitute an assessment of the "density" of prior criminal activity.
The question is why the current state-of-the-art instruments have not tried to model the criminal history items better.
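The kind of record information described above (density, timing and the slope of offense severity) can be sketched in code. This is a minimal illustration and not a reconstruction of Bradshaw's model or of any existing instrument: the `Arrest` class, the ordinal severity scale, the three-year recency window, and the reading of density as prior arrests per year of street time since age 14 are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Arrest:
    age: float      # offender's age at the time of arrest
    severity: int   # hypothetical ordinal scale, higher means more serious

def record_features(arrests, current_age, years_confined=0.0, recent_window=3.0):
    """Describe a record by count, density, recency and severity trend."""
    street_years = (current_age - 14.0) - years_confined
    if street_years <= 0:
        raise ValueError("street time since age 14 must be positive")
    n = len(arrests)
    # Arrests falling inside the recency window before the current age.
    recent = sum(1 for a in arrests if current_age - a.age <= recent_window)
    # Least-squares slope of severity on age: positive means escalation.
    slope = 0.0
    if n >= 2:
        mean_age = sum(a.age for a in arrests) / n
        mean_sev = sum(a.severity for a in arrests) / n
        sxx = sum((a.age - mean_age) ** 2 for a in arrests)
        if sxx > 0:
            sxy = sum((a.age - mean_age) * (a.severity - mean_sev) for a in arrests)
            slope = sxy / sxx
    return {
        "count": n,
        "density": n / street_years,
        "recent_share": recent / n if n else 0.0,
        "severity_slope": slope,
    }

# Two hypothetical 30-year-olds with four prior arrests each.
early = record_features(
    [Arrest(16, 3), Arrest(17, 3), Arrest(18, 2), Arrest(19, 1)], current_age=30)
late = record_features(
    [Arrest(27, 1), Arrest(28, 2), Arrest(29, 3), Arrest(30, 3)], current_age=30)
print(early)
print(late)
```

A simple count scores the two hypothetical records identically, while the recency and slope measures separate a de-escalating juvenile pattern from a recent, escalating one.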
As noted previously, three rationales explain why the instruments are used. The instruments complete a portion of a much larger managerial system: a way to judge and measure the workload of an agency that goes beyond a mere count of the number of cases. Classification has become an essential way to allocate limited resources, and the way this concept can be expressed to the public and funding bodies is on the basis of risk. The instruments also have the advantage of making standard decisions in a field that was noted for individuality.
A third rationale is seldom stated but very apparent. The use of the instruments and the classification systems results in more offenders being classified to lower levels of supervision. This effect, coupled with the workload measures, allows more offenders to be carried on each officer's caseload. More cases can thus be assumed with no increase in staff.
All the instruments have an individual focus that follows the social work perspective. The essential premise is that failure will occur because there is something different or wrong with the individual. The premise of the instrument is that this difference can be determined and the individual identified. Once they are identified, programs can be administered to correct the problem or eliminate the difference.
The focus is on individual differences, but certain specific individual characteristics are excluded. The obvious differences such as race, sex and age are excluded because of possible legal challenges. The differences that are within the offender's control, such as previous convictions, or items that are not indicative of a middle class life style are deemed of importance. Items that reflect the social forces acting upon the individual are not included.
Weaknesses Of The Current Instruments
The most essential criticism of the current state of the
instruments is the bland generality they express. No measures of the
societal impact of the risk exist and they assume that all offenders
suffer from the same malady. To the instruments of today the risks of
failure of a petty larcenist
and a child rapist are somehow equal
because the measure of failure is another conviction. Individuals who
specialize in armed robberies and those that repeatedly drive drunk
are also somehow expected to be cast from the same mold and
behave in similar fashions. One gets the impression that this bland
generality has impeded the public discussion of what is the risk to the
community and hindered meaningful research into how various
offenders should be handled.
The generality has also masked some of the obvious shortcomings of the instruments. A first time offender over the age of 24 is really not noticed by the instruments, and the unique characteristics of driving-while-intoxicated offenders and sex offenders are totally missed.
The instruments also focus on the factors that preceded the individual's entry into the correctional process. They thus really discount the correctional process by not accounting for it. Logically, if the correctional process, whether institutional or community based, was successful, then the instruments would cease to be predictive after the correctional effect.
The instruments are also over a decade old. This in itself would cause suspicions that the offender population has changed and many of the items would no longer be predictive.
UNRESOLVED ISSUES AND PROBLEMS
The central questions are whether new instruments will be developed and how they will differ from the instruments now in use. These questions are related to three broad areas, which are of course also related to each other: belief structures, technology and policy. For change to occur in the current state of the art, change must also occur within these three areas.
By belief structures I refer to the social values and philosophies operative in the area. These may or may not be grounded in fact or theory. They may even run counter to experience but are held in regard, cherished and pursued like a noble dream by their followers.
Community corrections and specifically probation has been
dominated by one major belief structure throughout its development.
"Social casework has persistently been the single most common
paradigm for professional practice"
in probation. As noted in
previous chapters, the field is currently torn by the ideological
inconsistencies of the social work and peace officer camps. The
social workers have however been the owners of the territory and
they are currently in most of the key positions within the field. They
are the dominant force.
Previously the initial objections to the use of statistical
instruments were noted as
...statistical instruments which attempt to predict human behavior are incompatible with the principle of individualization which has been regarded as very fundamental in social casework. Indeed, the assumption is made that the only possible basis for the appraisal of a case and for prognosis is through the individualistic approach and that the individualistic approach has no relationship to statistical computations.
The real question is was this objection overcome, adapted to, compensated for or somehow neutralized? Has the entry of the instruments offended the social casework model? Has it changed the role or has it been changed to accommodate this role?
The probation professional has thus been cast in a certain
role and this role has encompassed a number of facets. The
probation professional is normally viewed as a helper of adjustment,
as a treatment professional and as a referral agent. As Pierson
notes
The medical model, which has been of enormous influence in the formulation of the social work method, sees the client as someone who is ill, who comes to the practitioner for relief of symptoms.
Much of the role then can be viewed in light of what the medical
profession does and how it works with its clients. Such practices as
focusing on office visits, taking a detailed history of the individual and
making referrals to other related professionals are also well known in
the probation field. Pierson also notes how psychology has
influenced social work by focusing on what is considered normal
behavior within the general population.
Essentially then we have the view that there is nothing wrong with the middle class helper but something is wrong with the individual they are treating. The middle class helper looks at themselves and says:
1) I have an education
2) I come from an intact family
3) I do not abuse drugs or alcohol
4) I have always been employed
5) I have always lived in a stable residence
6) I have always been law abiding.
This person that I am to help is not law abiding because he is not like me. Using himself as a reference point, the helper then determines that the person is not law abiding because he differs in one of the previously listed areas.
As noted previously, in chapter 7, the top four items used in the instruments more than fifty percent of the time were
1) Employment/Educational Status
2) Drug/Alcohol Involvement
3) Family/Social Factors
4) Number/Type of Prior Convictions
It thus appears that the basic objection to the instruments, that they violated the model of social casework, has been neutralized because they now measure social casework ideals. They now follow the same reference points. Even though the instruments are publicly stated to be atheoretical, they do parallel the established belief structures of the dominant view in the field of probation.
Social work focuses on the individual and it seeks to help the individual adapt to a fixed social structure. The problems the individual is experiencing are related to their personal disease and not to any factors that exist in the social structure they exist within. It is the client that is defective, ill and diseased. They will be better, once they are changed.
The focus is thus on the individual and not on the social structure they live within. Any references to variables or items that relate to the social structure are excluded. They are not within the frame of reference of the field and are not included in the information the agency deems necessary to keep.
For the instruments to undergo a major change, the social work focus of the instruments must change. The social work focus cannot change unless the data collected by the agencies change and orientations of the researchers change.
Three central issues are involved in the area of technology. One concerns the delineation of system failure and how much of it can be explained under current theories. The other two concerns are the lack of adequate research related to the above and improvements that can be made in the modeling of certain items.
Currently the criteria of failure are set at the widest, highest frequency level to increase the base rate of failure and thereby the amount of variance explained. A major question that remains is whether some consensus can be achieved regarding an acceptable definition of failure. This however is a policy question that has political ramifications.
At the present time there is no minimum standard of variance explained and no minimum standard concerning the improvement over chance that is acceptable. Certainly no level or ratio concerning the acceptable number of false positives or negatives for any category of instrument exists, nor is one likely.
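The quantities for which no standard exists can at least be computed in a uniform way. The sketch below is illustrative only: the counts are hypothetical, the function name is an assumption, and "improvement over chance" is read in one simple way, as accuracy compared with the naive rule of assigning every case to the more common outcome.

```python
def screening_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summarize a two-by-two table of predicted versus actual failure."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("the table is empty")
    base_rate = (tp + fn) / total   # share of cases that actually failed
    accuracy = (tp + tn) / total    # share of cases classified correctly
    # Accuracy of the naive rule that assigns everyone to the more common outcome.
    naive_accuracy = max(base_rate, 1.0 - base_rate)
    return {
        "base_rate": base_rate,
        "accuracy": accuracy,
        "naive_accuracy": naive_accuracy,
        "improvement_over_naive": accuracy - naive_accuracy,
        "false_positives_per_true_positive": fp / tp if tp else float("inf"),
        "false_negative_rate": fn / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical counts for 200 probationers, 30 of whom failed.
summary = screening_summary(tp=20, fp=30, fn=10, tn=140)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the instrument classifies fewer cases correctly than the naive rule does, which shows why an agreed standard for improvement over chance and for the tolerable ratio of false positives would matter.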
Academic and public discussion concerning these matters
would be helpful but the agencies have not opened their research for
outside review, nor have they sought assistance from the academic
area. This secrecy has persisted despite criticism from within the
field that
Few agencies using instruments have undertaken research to validate their use, and much of the research that has been done is methodologically unsound.
The essential question is whether the agencies can be propelled into working with the academic institutions and the public to resolve these issues. Possibly only a few methods of forcing these issues exist. Inducements could be offered by a federal agency. Legal challenges to the low level of prediction and item selection procedures could also be mounted.
There is also difficulty in the modeling of certain items in the development of the instruments. This difficulty revolves around the difference between the way items are structured for presentation to the human mind and the way they are structured for analysis by computer programs such as SPSS. The way that items are structured for presentation to SPSS does not model the way the human mind grasps certain relationships. In view of this, a great deal of information is not available to statistical analysis but is available to the human viewing the information. A major improvement in the ability to organize these data into a machine-acceptable input should yield a great improvement in prediction.
If the primary focus was only managerial, because of the need to allocate resources, then there is no point in developing the next generation of instruments. They thus will only occur if they are tied to some new management information system. The possibility does exist that they may be eliminated despite managerial needs. The current instruments may be eliminated through legal challenges. To maintain the managerial assistance they provide, new instruments will have to be developed. The development of new instruments will have to be a policy decision, made either because the instruments are old and must be revised or because they cannot continue as they are but their services are too useful to be discarded.
Will The Stakes Of Failure Be Added?
Currently most definitions of failure include either arrests or convictions. Some more detailed definitions also include arrests or convictions for violent crimes. As previously noted, in chapter 5, one of the concerns in 1962 was
It is also true, I believe, that the prediction studies have given insufficient attention to the question of the seriousness of the offenses that the prisoner has committed in the past and may commit in the future. Surely it is important to know not only the likelihood of his getting into trouble again, but how serious his infractions may be if he is paroled.
Gottfredson, a major force in the development of risk instruments, noted this point in the proposed development of better instruments:
When there is a predictive component to the decision, the behavior of the decision maker is similar to that of an informed gambler with a decision made under uncertainty. Not only the odds of winning or losing a bet (risk) but also the amount of the wager (stakes) are considered by the prudent gambler. The expected value of a given bet may be taken as the product of the probability of winning and the amount at risk (the wager). Thus, in addition to measuring the risk of various outcomes of decisions about offenders in terms of discrete general criteria, one can develop measures of the societal stakes involved in the decision. One can hope that this may provide improved information for practical decision situations.
The current instruments have a wide criterion of failure and it makes a good deal of difference to society and the impact to the citizens if a petty larcenist steals another shirt or a child rapist commits another offense. Currently such stakes are not considered nor is there any method to model the stakes. A major question will be whether such possibilities will be accounted for in the future. A decision of this nature lies within technological improvements but the decision to pursue the dream must first flow from a policy decision.
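The gambler's analogy quoted above, in which the expected value is the product of the probability and the amount at risk, can be sketched as follows. This is an illustration under stated assumptions: no accepted scale of societal stakes exists, so the stake weights, the shared failure probability and the function name are all hypothetical.

```python
def expected_stakes(failure_probability: float, stakes: float) -> float:
    """Expected societal cost of a decision: risk multiplied by stakes."""
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be in [0, 1]")
    if stakes < 0:
        raise ValueError("stakes must be non-negative")
    return failure_probability * stakes

# Hypothetical stake weights; no accepted scale of this kind exists.
STAKES = {"petty larceny": 1.0, "child rape": 100.0}

# Both offenders are given the same assumed risk of a new conviction.
same_risk = 0.25
larcenist = expected_stakes(same_risk, STAKES["petty larceny"])
rapist = expected_stakes(same_risk, STAKES["child rape"])
print(f"petty larcenist: {larcenist}, child rapist: {rapist}")
```

An instrument that scores only the probability of a new conviction treats these two cases as equal, whereas any positive weighting of stakes separates them. Choosing the weights is the policy decision the text refers to.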
Will Theoretical Instruments Be Developed?
Currently the popular notion is that the instruments are atheoretical. Criteria are set by the designers, any item that is predictive can be included in the analysis, and no preconceived idea or thought is attached to its selection, inclusion or exclusion. The popular policy has been to follow this path and to seek a value-free approach to the design of the instruments. Is this really true, however? As Rescher notes
With individuals, values inhere in the psychological value-system of the person. But with groups they are matters of social decision-making. Where a choice confronts a group rather than an individual, the issue accordingly assumes an essentially political dimension. It is [a] matter of the value-system of the group as made manifest through its collective mechanism of decision making - its political machinery. For the question can be settled only through the group's essentially political determinations in regard to the assessment of the negativities at issue. Such evaluations can emerge only from the active consensus-formation mechanism or from the passive acquiescence through which the group effects it public policy arrangements.
For the instruments to be atheoretical they must not follow any established theoretical path. As noted previously, the instruments do apparently rely quite heavily on measures that are dependent upon a middle class value structure. Whether this format is conscious or unconscious does not matter. It only matters that an implied theory does exist.
Even if the concept of an implied theory is not accepted, it is easy to show that the very concept of atheoretic selection has already been violated. To be atheoretical, the choice must be without a conscious scheme; no guiding principles must exist. We already know, however, that a conscious scheme and principles do exist in the exclusion of prospective items. As Bohnstedt notes
...Care must also be taken to rule out selection criteria based on race, sex, age, or other variables that discriminate against individuals for reasons that, although related to recidivism, are beyond the control of the individual and not necessarily related causally to crime.
Obviously items beyond the offender's control can be related causally to specific crimes. Physically the capacity just does not exist for certain individuals. There are few female rapists. Not many small stature individuals can be assaultive. The probability for wheelchair bound and blind individuals to be car thieves would also be almost zero. Logically, items beyond the offender's control are related to criminality, yet they are excluded.
It is therefore impossible to say on one hand that the instruments are ideologically pure and atheoretical because no guiding principles exist and on the other hand to say that whole classes of items may be predictive but will be excluded because they may be offensive. Guiding principles do exist in the selection and exclusion of the items. The instruments are therefore not atheoretical, but no public theory is stated.
A major policy decision is thus whether a stated theory will be established in the design of the instruments and whether the use of such theories will be allowed to extend the frontiers of our knowledge of crime generation.
Another major question concerns whether the current instruments will face legal challenges. The basis for possible legal challenges concerns their initial development and the fact that they are now old and have not been revalidated. Not being revalidated is a condition that continues to grow in importance, whereas their potential liability in development remains a constant.
As noted previously the instruments could have faced challenges for including items that were beyond the offenders' control. Most instruments have avoided this potential problem by excluding those items. Other areas of potential liability however do exist.
The use of the instruments in classification decisions can be
a potential for legal challenge regardless of the philosophical slant of
the department. If higher contact levels of supervision are used for
increased surveillance then the offender is placed in a higher
potential for apprehension if he is not abiding by the conditions of his
probation. If the department views higher contact levels as needed to
supply additional services then it could be viewed that he is being
denied those services by the instrument.
With either orientation a
legal challenge could result if the instruments cannot prove that they
measure what they allege to measure. Bohnstedt specifically notes
Unless an instrument has been validated--that is, unless it has been shown to measure what it purports to measure--its use could pose legal problems. Using such instruments to determine level of risk is analogous to using employment tests that have not been shown to be job-related.
An offender could also demand to know the criteria that are used to process their case and how reliable those criteria are. As noted previously, these are precisely the issues that helped spark the federal parole guidelines. This is an especially sensitive issue because many of the instruments were never validated, were improperly validated, or were never revalidated after a considerable period of time had passed.
This work is concerned with risk instruments. Levels of supervision are also set by other means, such as a needs instrument. The concept of classification on the basis of needs also sprang from the Wisconsin system. Even though the level of supervision decisions made by risk instruments may not be valid, other methods that apparently have no empirical validity are also used to allocate resources. A primary question to emerge will be how long this situation will also be allowed to continue.
Petersilia and Turner noted that probation officers in Wisconsin were oriented towards rehabilitation and were therefore uncomfortable with a risk device that forced very frequent contact levels with probationers likely to fail and low contact levels with probationers who appeared to be more hopeful. A needs instrument was thus incorporated into the classification to allow higher contact levels with probationers who, while not posing a threat, did require more frequent contact for the purposes of social casework. The authors finally noted that most probation departments "now use a combination of recidivism-prediction and needs-assessment scores to assign levels of community supervision."
It should be noted that this needs instrument was never
empirically validated in Wisconsin nor in other states, such as New
York, which adopted it. Glasser simply notes
that
The Wisconsin system also uses an initial needs assessment form and, every six months, a needs reassessment form, which are derived not from statistics on past experience but from a consensus of agents on the relative importance of various types of assistance that their clients require.
The needs instrument was not empirically derived and apparently has not been empirically tested prior to adoption in any location. Scarce community corrections funds however are being allocated to provide increased levels of supervision for a belief structure which may not even have the slightest relationship to reality.
Taxman did however attempt to test the belief structure and
notes
that
...it seems that negotiating services for probationers and having contact with social service agencies does not make much difference in probation outcomes. In the present study, probationer contacts with social service agencies had only a negligible correlation with probation outcomes (r=.06). This study suggests that the social adjustment of the probationer (at least as operationalized here) is not that important in achieving the larger goal of probation agencies, namely reducing the offender's likelihood of recidivism. Addressing probationer needs may not be as important a component in the case structuring process as the literature seems to suggest.
The instruments used in probation that were validated explain only a small portion of the variance, yet most of the probation workforce uses them to set supervision standards. Many instruments were not even developed through limited research, and no validation was done. The needs instruments were never intended to be derived empirically but only to account for agent time in the performance of services that may not even be remotely related to the task of reducing recidivism. The question that remains is how long this massive outlay of public expenditures and agency efforts, for the purpose of addressing community safety, can continue in such a haphazard way.
SUGGESTIONS FOR FUTURE STUDIES AND POLICIES
The current instruments used in probation have been in use for more than a decade. Most were based upon poor research or no research at all. Most were merely a copy of an established instrument, either in total or in part. The time is at hand to develop the next generation of instruments, but how will they be developed, what form will they take and what guiding principles will they follow? Before gazing into the future, we should summarize the present.
The present day instruments were not developed primarily to be diagnostic aids. They were not developed to determine what was wrong with the offender and what should be done to correct the problem. Indeed, the field of probation practice is dominated by social casework theory. No cogent theory for the generation of criminal behavior exists in the belief structure of social casework practice. If no theory of crime generation is present in the field, then the instruments could not be diagnostic and prescriptive because of the lack of a descriptive theory.
They are called "Risk Instruments" but they were not developed to determine only risk to the community. The central theme was not to determine and then reduce the likelihood of offender failure in the community. They were developed to aid management concerning the allocation of resources and the expected effort or work to be expended at each supervision classification level. The thought was, how do we get the job done with limited resources? How do we present workload and staff allocation assignments in a method that funding bodies and executive departments will comprehend? How do we assign staff and judge their performance?
The instruments were designed as a central component in a case classification system that marked a new departure in the way management defined the work performed and how it accounted for work in the community correction setting. No longer was the measure the number of offenders being supervised by each officer; now it was the relative expected work needed for the number of offenders in each class of supervision contact levels.
Once the amount of work for each officer could be determined, caseloads could be balanced and staff could be assigned in an objective manner. Because prescribed contact levels were implied in each supervision category and the category was assigned on the basis of a risk score, the scores could in theory be adjusted to provide the maximum coverage with the staff at hand.
It then became a simple matter to state the case to funding bodies: We now see those most likely to fail four times in the office and once at home each month. If our funding is reduced, twenty percent of those individuals will have to be transferred to the next lower level of supervision. They will then be seen twice a month in the office and no home visits will be made. Wise agencies never concluded that recidivism would increase or that the community crime rate would increase. The only implication was that services, as defined by officer contacts, would be reduced.
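The statement to funding bodies can be sketched as simple arithmetic. The contact levels below are the ones given in the example above; the caseload counts, the level names and the function names are hypothetical and serve only to illustrate the calculation.

```python
# Contacts per offender per month at each level, taken from the example in the text.
CONTACTS = {
    "maximum": {"office": 4, "home": 1},
    "next_lower": {"office": 2, "home": 0},
}

def monthly_contacts(caseload):
    """Total officer contacts per month for a caseload given as {level: offenders}."""
    return sum(
        offenders * sum(CONTACTS[level].values())
        for level, offenders in caseload.items()
    )

def transfer_down(caseload, share):
    """Move a share of the maximum-level offenders to the next lower level."""
    moved = round(caseload["maximum"] * share)
    return {
        "maximum": caseload["maximum"] - moved,
        "next_lower": caseload["next_lower"] + moved,
    }

# Hypothetical caseload: these counts are invented for the illustration.
before = {"maximum": 100, "next_lower": 200}
after = transfer_down(before, 0.20)

print(monthly_contacts(before))  # 100*5 + 200*2 = 900
print(monthly_contacts(after))   # 80*5 + 220*2 = 840
```

The sketch says nothing about recidivism. It only restates the loss of funding as a loss of officer contacts, which is the only claim the agencies made.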
The instruments were touted as being atheoretical. No public discussion or analysis of the agency's mission thus was needed before the items were selected. As noted above, social casework practice has no cogent theory of crime generation. The belief structure in use thus provided no absolute guidance but the items that were used most did mimic the belief structure in use.
The instruments in use have, in general, been noted as flowing from poor research and as having low levels of variance explained. They do however fulfill the management role for which they were developed.
The instruments that are now in use developed because they satisfied managerial objectives. New scales will not be developed unless they are somehow forced through outside pressure or because they are better at satisfying the managerial objectives. For a new instrument or series of instruments to be developed, it must provide better information for resource allocation and budget justification. It must thus be part of a new, more comprehensive management information system.
One of the most promising areas explored in the literature cited has been the grouping of individuals by offense type. Grouping by offense type appears to be a logical solution to the problem of narrowing the range that an instrument must account for and thus increasing its accuracy. It appears to be logical that if the range is limited then accuracy should improve. The essential problem however is that as the data base of offenders is narrowed into smaller homogeneous groups, the available pool sometimes becomes too small for effective instrument development. Also, when the base rate of occurrence for that specific behavior is low, it is very difficult to obtain substantial levels of variance explained. For this direction to be taken, then, a very large data set and good predictors of the criterion are needed.
Some offenders within the probation system are so different from the majority of the offender population that the current instruments simply cannot be adequate to the task of classification decisions. It thus might be possible to graft in new measures as specialized scales, but this would cause problems with the current management information system.
Young first time offenders pose a special problem for the instruments. Most of the instruments rely heavily on criminal history items, yet these offenders are possibly just starting their criminal careers. The use of criminal history items, which normally are predictive, is thus of no avail. These offenders are just crossing the boundary between adolescence and adulthood, and they are living at home. Many of the social stability factors are thus also not appropriate. First time offender scales are thus warranted to make more accurate classification decisions.
This category of offenders of course has a very low base rate in reference to the general offender population. It is a category of offenders that the original scales could not even try to model for and thus they are invisible to the instruments. This transparency to the current instruments has led to some criticisms of the established scales. Not only is the base rate low but difficulties in the normal processing of these cases through the criminal justice system further compound the problem of criminal history items related to the behavior. The notification to the police is low, the apprehension rate is low and the convictions are very difficult to obtain. A specialized scale for this offender type is however needed.
Driving while intoxicated cases are another difficult class of offenders. One only has to look at the extreme variation in the numbers of offenders under supervision, even in contiguous jurisdictions, to realize that the offense is highly dependent upon societal variables. Without the need to personally drive, the level of this offense diminishes greatly. In jurisdictions that receive first time offenders for supervision, an instrument that can discern those starting a DWI offense history pattern would be very helpful because many of the current instruments do not apply. The current instruments are heavily dependent upon prior criminal history items and social stability items. They will not identify potential offenders starting an offense string in this area because they do not yet have an offense pattern and many are from a solid middle class background.
CRIME TYPES
VICTIM SPECIFIC: Murder; Assault; Rape
GENERAL VICTIM: Bombing; Terrorism
THEFT FOCUSED: Larceny; Fraud; Embezzlement
THEFT FOCUSED BUT DANGEROUS BEHAVIOR: Robbery; Burglary with a weapon; Kidnapping
PSYCHIC COMPELLED: Pedophile; Exhibitionists; Fire Setters
DISREGARD FOR CONSEQUENCES: DWI; Drug Sales; Use/Abuse and Sell; Sell - No criminal orientation; Sell - with criminal orientation
Figure 8 The Types of Crimes - As grouped in regard to their relationship to the victim
As noted in the accompanying chart, various types of crimes differ remarkably from each other. Some require a specific victim. Others do not. Some are only interested in obtaining money but pose minimal threat to the physical safety of a human. Others show a complete disregard for the humans involved. Some do not appear to be rational behavior but are driven. Some exhibit a disregard of the consequences of their behavior. All these offenses are remarkably different and so are the offenders that specialize in them. Yet the current instruments assume that the forces that drive all the individuals are the same.
Criminal history items continue to be very predictive of future criminal behavior. As noted previously, however, they currently add very little to our knowledge base of criminal behavior other than that the criminal behavior will continue. This measure of continued dysfunctional behavior, once it has been established, is however crudely modeled compared to how humans view the same information. Sometimes the mere modeling of the record will lead to different conclusions. For example, studies of racial bias that consider only the presence or absence of a prior record yield different results than studies that control for the depth and extent of the prior record. In the simple yes and no modeling of sentencing a racial bias will be found, but with modeling for the depth and extent of the record the apparent bias will disappear.
The current modeling of criminal history items is crude, yet this one item area is an excellent predictor of future criminal behavior. How then can improvements be made in the modeling of this important area?
Some theoretical steps have already been taken in this regard. For example, the Salient Factor Score of 1981 builds a crime free period into the risk model, and so do a number of other scales.
Prior Record Yes/No
Count offenses
Count types of offenses
Density function
Slope of the offenses
Slope / Type
Slope / Type / Density
Figure 9 The organization of the criminal history variable by the depth of analysis
As the accompanying text box notes, there is a progression of depth concerning how criminal history items could be organized within the new instruments. The simplest method is just to include a yes and no decision for the presence or absence of a prior record. The next level includes some indication of the number of prior offenses. A higher level not only notes the number of prior offenses but also the types of offenses. This type usually corresponds to the number of misdemeanor and felony offenses. Of course a more elaborate measure would also discriminate among the various types of offenses beyond simply their criminal justice system processing weight of felony and misdemeanor. Density functions add another dimension to the view of the information, which is also a significant improvement. Humans however can also see time relationships in the offense pattern and the slope of the pattern.
For the instruments and studies to really process this information on a level near that of the human mind, then a measure noting the slope, densities and type of offenses will have to be developed. Significant improvements in the modeling of criminal history items, as noted above, will yield a quantum leap in the ability of the instruments to predict. It will also aid research studies by allowing us to understand subtle differences in the criminal history record better.
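The first levels of this progression can be sketched in a few lines. The function name, the record layout and the definition of density as offenses per year of an observation window are assumptions made for this illustration; the instruments discussed here do not define the terms this precisely.

```python
from collections import Counter

def history_features(offenses, window_years):
    """Summarize a prior record at four increasing depths of analysis.

    offenses: list of (year, offense_type) tuples for the prior offenses.
    window_years: length of the observation period in years (assumed > 0).
    """
    count = len(offenses)
    return {
        "prior_record": count > 0,                               # yes / no
        "count": count,                                          # number of offenses
        "count_by_type": dict(Counter(kind for _, kind in offenses)),
        "density": count / window_years,                         # offenses per year
    }

# Hypothetical record, invented for the illustration.
record = [(1981, "felony"), (1983, "misdemeanor"), (1984, "felony")]
features = history_features(record, window_years=10)
print(features)
```

Each deeper level keeps the information of the level before it and adds to it, which is why a yes and no item can always be recovered from a count but not the reverse.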
If we examine the hypothetical history of John A and John B, the above statements become clearer. In the accompanying text box, both of the offenders' histories are summarized. Both offenders have exactly the same number of offenses, periods of incarceration and periods of employment during the decade. In the case of John A he started with a significant offense history but has mellowed. Other than his previous problems he looks like a stable citizen. He has had no other offenses but the current larceny in almost half a decade.
In the case of John B, he started out all right but then something went very wrong in his life and he now appears to be a threat. This current offense has apparently occurred just after a release from a period of incarceration for robbery. That previous robbery offense apparently occurred just after his release from serving a term of incarceration for Burglary.
As noted previously, both men have exactly the same records over the decade. If a measure of a crime free period for one or two years was applied, that would also not discriminate them. Both men have had no other arrests within a two year period. Yet the men appear remarkably different because of the slope of the offense pattern.
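A minimal sketch of a slope measure follows. The two records are invented for the illustration and are not the histories shown in the text box; the seriousness weights and the choice of a least-squares slope of seriousness against year are likewise assumptions, one of several ways a slope could be defined.

```python
# Hypothetical seriousness weights, invented for the illustration.
SERIOUSNESS = {"larceny": 1, "burglary": 2, "robbery": 3}

def slope(offenses):
    """Least-squares slope of offense seriousness against year of offense."""
    xs = [year for year, _ in offenses]
    ys = [SERIOUSNESS[kind] for _, kind in offenses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def crime_free(offenses, current_year, years=2):
    """True if no prior offense falls within `years` of the current offense."""
    return all(current_year - year >= years for year, _ in offenses)

# Two invented prior records with the same offenses in opposite order.
record_a = [(1980, "robbery"), (1982, "burglary"), (1985, "larceny")]
record_b = [(1982, "larceny"), (1985, "burglary"), (1987, "robbery")]

for name, record in (("A", record_a), ("B", record_b)):
    print(name, len(record), crime_free(record, 1990), round(slope(record), 2))
```

Under these assumptions the count and the two year crime free item are identical for both records, while the slope is negative for the first and positive for the second.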
Now suppose you are an Assistant District Attorney (ADA) within a local criminal court. The initial charge is a felony but it looks as if it will be reduced. It is a weak case and, like most misdemeanors, the deal will be arranged long before the pre-sentence investigation is completed by the probation department. In fact you have more information, for some reason, than most ADAs. You know the periods of employment in addition to the criminal record.
Now suppose the charge occurred in the following manner. The victim was in a crowded elevator with the defendant. You also know that a one hundred dollar bill was sticking out of the victim's jacket that was over his arm. The defendant apparently believed that anyone so careless with his money should lose it and it probably would have fallen out later anyway and someone else would pick it up. So in actuality, strong criminal intent was really not present. In both cases the state's case is weak because the men will claim it had already fallen out and they then retrieved it. Both are thus willing to enter pleas to criminal possession of stolen property as a misdemeanor but not the felony charge of taking it off the person of another.
Now you are still the ADA and you have a misdemeanor conviction. This conviction can yield anything from an adjournment in contemplation of dismissal to a year in jail. In reality both men have been through the system before. Both have done some state time. The range of possible sentences is thus from a Conditional Discharge and a Fine to a six month period of incarceration. In all probability, in the case of John A, the ADA would go for a maximum of a three year period of probation, with the option open for less if the PSI showed he was really a good citizen now. In the case of John B, the ADA would probably push for a firm, quick three to six month period of incarceration, or perhaps two months of shock probation.
Now suppose you are the probation officer and both men have received probation. Without some measure of the slope of the offense pattern the men would still appear similar in an instrument's view. Their performance on probation, in all probability, will be different because different forces are acting upon their lives.
Artificial intelligence is an area of computer science that seeks to develop computers which function as human intelligence does; one of its products is the expert system. Commercial applications of this kind have only been available since the early 1980's. They were not available when the current instruments were developed.
Two prime components are needed for such a system. The first part contains a knowledge base and the rules of the system. An inference engine then selects and executes the rules. The result is an expert system that is capable of considering a large volume of knowledge and then recommending a course of action. By using a large knowledge base, the need for specific instruments for specific offender types may be eliminated and the accuracy of the instruments should be improved.
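The two components can be illustrated with a very small sketch. The rules and fact names below are invented for the example and do not come from any existing instrument; the inference engine shown is a plain forward-chaining loop, which is only one of several ways such an engine can be built.

```python
# Knowledge base: each rule is (set of required facts, fact to conclude).
# All rules and fact names are hypothetical.
RULES = [
    ({"no_prior_record", "lives_with_parents"}, "first_time_offender_profile"),
    ({"recent_release", "new_offense"}, "escalating_pattern"),
    ({"escalating_pattern"}, "recommend_close_supervision"),
    ({"first_time_offender_profile"}, "recommend_specialized_scale"),
]

def infer(facts, rules):
    """Inference engine: fire rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

derived = infer({"recent_release", "new_offense"}, RULES)
print(sorted(derived))
```

Keeping the rules as data, apart from the engine that executes them, is what allows the knowledge base to grow without the program itself being rewritten.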
Artificial intelligence is a tool that has yet to be used in the development of a risk instrument. It is based on the assumption that human reasoning can be mimicked, that human reasoning can be defined in rules and that these rules can be structured in a computer. All that is needed for its development in the risk instrument field is the belief that established statistical predictive results can be merged with an expert officer's knowledge to yield an expert system. Such a system should, in theory, be superior to the instruments now in use. For as Gottfredson noted, "prediction devices, developed by any method, can do no more than summarize experience."
Such systems can consider mountains of information and correlate offense specific behavior and environmental factors quite easily. This correlation is based upon experience and is an individual diagnosis based upon prior experience. Such a system can marry the clinicians' expertise and established base rates. It can thus overcome the most significant error made by clinicians:
"Probably the most common and surely the most significant error made by clinicians in predicting violent behavior is the ignoring of information regarding the statistical base rate of violence in the population in question."
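The weight of the base rate can be shown with a short worked calculation. The figures used below (a 10 percent base rate and a judgment that is right 80 percent of the time for both violent and nonviolent cases) are hypothetical, and the function name is invented for the illustration.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Share of those predicted to be violent who actually are violent."""
    true_positives = sensitivity * base_rate
    false_positives = (1 - specificity) * (1 - base_rate)
    return true_positives / (true_positives + false_positives)

# Hypothetical figures, invented for the illustration.
ppv = positive_predictive_value(base_rate=0.10, sensitivity=0.80, specificity=0.80)
print(round(ppv, 2))  # 0.08 / (0.08 + 0.18) = 0.31
```

Under these assumed figures fewer than one in three of the predictions of violence would be correct, even though the judgment itself is right four times out of five. A clinician who ignores the base rate will not see this.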
Expert systems can however be expensive to build. Such a system could take years and millions of dollars. One prime impediment to the development of such specific systems, according to Paul Harmon, a consultant and editor of Expert System Strategies, is that "many experts don't consciously understand how they do what they do."
An interesting example of such a system was the attempt to mimic the expertise of a 55 year old civil engineer, who spent two decades becoming the expert diagnostician of a one mile long dirt and gravel dam in the West. The owning company depended on the engineer's expertise to prevent the failure of the dam, and they worried about their vulnerability if they lost the engineer. It was listed as a prime example of mixed success "... into the trials that confront those who try to invest a machine with human reasoning."
Eventually $300,000 was spent and the program was a modest success. It was not used very much because the engineer was still on duty and the owning company did not wish to spend the estimated $100,000 more to generalize the computer program prediction to other dams as well. It did however point the way to the further development of expert systems.
An expert system would also have managerial uses far beyond the determination of risk. Currently there is a need in the field for a structured interview device. Many new officers are unsure what they should be looking for during the initial meetings and they are unsure regarding how to obtain that information. Structured interview devices solve both problems and also aid the officer in establishing a plan of action for the offender's case.
The management information system developed in Wisconsin, as noted previously, also contained a component for determining strategies for case supervision. This part of the system was not essential for the management information system and it met with much line officer resistance and little success nationally. While the newer officers found it interesting, the more seasoned officers found it less than useful. In practice the scoring was difficult and time consuming. The theoretical categories it yielded were also unrelated to any established field or expertise. It was however a structured interview device that yielded the agency's policy response and that in itself appealed to many administrators.
The implementation costs of the strategies for case supervision device were high, however. Specialized trainers had to be developed in the department and they required a minimum of 80 hours of instruction. The individual officers then required 40 hours of instruction. In a department with 100 officers and a total cost per officer, including fringe benefits, of $1,000.00 per week, it can be seen that the cost of implementation is $100,000 in just salary expenses. Of course recurring training costs, specialized forms, transportation and lodging costs for the trainers would also be added to this figure.
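The salary figure can be checked with a short calculation. The 40 hour work week and the number of trainers are assumptions added for the illustration; the text gives only the hours of instruction, the number of officers and the weekly cost.

```python
HOURS_PER_WEEK = 40  # assumption: a 40 hour work week

def salary_cost(people, training_hours, weekly_cost):
    """Salary expense of taking `people` off the job for `training_hours` each."""
    return people * training_hours / HOURS_PER_WEEK * weekly_cost

officer_cost = salary_cost(people=100, training_hours=40, weekly_cost=1000.00)
# Hypothetical: two trainers, each receiving the 80 hours of instruction.
trainer_cost = salary_cost(people=2, training_hours=80, weekly_cost=1000.00)

print(officer_cost)  # 100000.0
print(trainer_cost)  # 4000.0
```

The $100,000 in the text thus corresponds to the officers' week of instruction alone; the trainers' time would be in addition to it.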
Expert systems can consider many more variables because the questions follow a branching tree structure. This can be very useful in determining possible outcomes. For example if we assume that someone is more likely to engage in criminal behavior if they have less to lose, then we could develop a tree structure to pursue it. Suppose our questions are related to residential stability and the loss of the residence if apprehended and incarcerated. If we just examine if someone has lived in the same location for a year or more, we can assume something about stability and friendship patterns but no more. If we knew it was just a furnished room, that he lived there alone and hated it, we know some more. If we know it is an unfurnished room and he just finished buying new furniture and he has many friends in the area, we have a different impression. If we know it is a small two family house and his invalid mother also lives there, we also have more information. If he also tells us that it is tough supporting it but he believes that it is an excellent long term investment, we can be reasonably sure that he will not take a chance, criminally, unless the possibilities are huge and the apprehension chances are almost non existent. An expert system pursues the line of questioning to pursue more relevant information. Paper and pencil instruments cannot because they must account for every possible response with a weighted box. In our above residence example, a response such as being homeless would not trigger any further questioning in that area.
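The residence example can be sketched as a small branching tree. The questions, answer values and names below are invented for the illustration; they show only how an earlier answer decides which question, if any, is asked next.

```python
# Hypothetical question tree for the residence example.
# A branch of None, or an answer with no branch, ends the line of questioning.
TREE = {
    "question": "housing_type",
    "branches": {
        "homeless": None,
        "furnished_room": {
            "question": "lives_alone",
            "branches": {
                "yes": {"question": "likes_residence", "branches": {}},
                "no": None,
            },
        },
        "owned_house": {
            "question": "dependents_in_home",
            "branches": {
                "yes": {"question": "views_home_as_investment", "branches": {}},
                "no": None,
            },
        },
    },
}

def interview(tree, answers):
    """Walk the tree, asking only the questions made relevant by earlier answers."""
    asked = []
    node = tree
    while node is not None:
        question = node["question"]
        asked.append(question)
        node = node["branches"].get(answers.get(question))
    return asked

print(interview(TREE, {"housing_type": "homeless"}))
print(interview(TREE, {
    "housing_type": "owned_house",
    "dependents_in_home": "yes",
    "views_home_as_investment": "yes",
}))
```

In this sketch the homeless response ends the questioning after one item, while the homeowner is asked two further questions.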
An expert system would not only consider many more variables than a simple paper and pencil risk instrument could, but it could also be tied into the development of unique recommended agency plans for that offender. This would not only allow classification methods which yield resource allocation and budget justification effects but would add policy implementation stability concerning the agency responses to different offenders. In a field where administrators complain about the different operating philosophies of officers and the different skill levels of officers, an expert system has the potential to be the administrator's dream come true.
The literature however has been consistent in demonstrating that demographic variables focused on the individual are not sufficient to make accurate predictions in the area of antisocial behavior. The call for social context and environmental variables to be included is thus appropriate and should, if implemented, lead to better instruments. The use of social variables however will require a new focus, for this information is not currently deemed relevant in a system where only the offender is to blame.
The risk instruments, by their very focus on variables confined to the individual, assume that the reason there is criminal behavior is that something is wrong with the individual and that this difference can be determined by the instrument. They do not assume that there is anything wrong with the area that the person is living in or with his peers; only the person is suspect. Any reference to societal issues seems to be avoided. No definitive explanation can be stated because no stated theory of crime generation exists in the community corrections arena, which is dominated by social casework practice. Certainly we know that the community corrections populations are more concentrated in certain census tracts. To say however that you are more in danger of committing a criminal offense if you live in census tract so and so would have policy implications beyond the instruments and the unit of government which they serve. Such information would however be helpful for classification and research purposes. Such changes would require a change in the belief structure of most of the probation agencies, however.
While it does not appear that any new risk assessment instruments will be sweeping the nation shortly, and no such instruments are currently being developed, there are some interesting developments taking place. These events will probably change the way we view the development of criminality and how we might intervene to correct it. This information will certainly affect the design and item selection for all the next generation instruments.
The May/June 1990 bimonthly issue of the National Institute of Justice noted that a massive longitudinal study will be underway to determine what causes and sustains both negative and positive behaviors. The premise of this study is that both types of behavior develop over a period of time. Longitudinal studies with overlapping age groups will thus be used, which will permit the program to simulate a 21 year cohort in less than 5 years.
In 1988 a book entitled "Understanding and Controlling Crime" won the award for Distinguished Scholarship of the Criminology Section of the American Sociological Association. This book, by David Farrington, Lloyd Ohlin and James Q. Wilson, was a direct outgrowth of the Justice Program Study Group, appointed by the John D. and Catherine T. MacArthur Foundation in 1982. The members of this group are renowned. The massive study noted above is a direct outgrowth of the point which was argued by the book. That point was that a better understanding of "how predatory adult criminality develops would require long-term longitudinal studies of human development from birth to age 25." The perspective of the study is also to be unique, for it will integrate sociological, biological and behavioral viewpoints. This study will certainly affect the design and item selection for all the next generation instruments.
Some agencies may not be in a position to wait for the results of the massive research noted above. For the immediate time period then certain specific controls should be put in place. No further grants should be given to revise, revalidate or develop new risk instruments, unless three conditions are met.
1. Public review of the criteria for variable selection must be mandated.
2. Public review of the criterion of what is dangerous must be mandated.
3. Some academic, scholarly institution must be directly affiliated with the work and be willing to report and defend the findings of the research to the academic community.
Ashby, Lord, FRS. "The Risk Equation - The Subjective Side
of Assessing Risks," New Scientist 74 (May 19 1977):
398-400.
Baird, S.; R. Heinz; and B. Bemus, "The Wisconsin Case
Classification / Staff Development Project: A Two Year Follow-Up Report," in Classification, by the
American Correctional Association. College Park
Maryland: American Correctional Association, 1981.
Baird, S.; and D. Lerner, "A Survey of the Use of
Guidelines and Risk Assessments by State Parole
Boards" [Draft Report completed for the California
Youthful Offender Parole Board]. San Francisco,
California: National Council on Crime and
Delinquency, 1985. Photocopied.
Baird, S. Christopher, "Probation and Parole
Classification: The Wisconsin Model," Boulder,
Colorado: National Institute of Corrections
Information Center, 1985. Photocopied.
Baird, S. Christopher, Interview by author, 31 August
1988, American Probation and Parole Officers
Conference, Cincinnati Ohio.
Bemus, Brian, Interview by author, 5 August 1987, American
Probation and Parole Officers Conference, Baltimore
Maryland.
Bemus, Brian, Interview by author, 31 August 1988,
American Probation and Parole Officers Conference,
Cincinnati Ohio.
Bradshaw, Richard. "Multivariate Actuarial Prediction Of
Felonious Recidivism Of Male Parolees: Development
and Cross-Validation Of A Series Of Risk Assessment
Models Using Stepwise Logistic Regression," Ph.D.
diss., Michigan State University, 1987.
Chaiken, Marcia, and Jan Chaiken. "Offender Types and
Public Policy," Crime and Delinquency 30 (April
1984): 195-226.
Clear, Todd, and Kenneth Gallagher. "Screening Devices In
Probation and Parole: Management Problems,"
Evaluation Review 7 (April 1983): 217-234.
Clear, Todd, and Kenneth Gallagher. "Probation and Parole
Supervision: A Review of Current Classification
Practices," Crime and Delinquency 31 (July 1985):
423-444.
Clear, Todd. "Statistical Prediction In Corrections," in
Research In Corrections, ed. Joan Petersilia.
Washington, D.C.: National Institute of Corrections,
March, 1988, 1-39.
Douglas, Mary. Risk Acceptability According to the Social
Sciences. New York: Russell Sage Foundation, 1985.
Dunster, John. "The Risk Equation - Virtue In Compromise,"
New Scientist 74 (May 26 1977): 454-456.
Evjen, Victor H. "Current Thinking on Parole Prediction
Tables," Crime and Delinquency 3 (July 1962):
215-238.
Farrington, David, and Roger Tarling, eds., Prediction In
Criminology. Albany: State University of New York
Press, 1985.
Farrington, David, Lloyd Ohlin, and James Q. Wilson.
Understanding And Controlling Crime: Toward A New
Research Strategy. New York: Springer-Verlag, 1988.
Fischhoff, Baruch, and Others. Acceptable Risk. New York:
Cambridge University Press, 1981.
Glaser, Daniel, ed. Handbook of Criminology. Chicago: Rand
McNally, 1974.
Gottfredson, Don. "Assessment And Prediction Methods In
Crime And Delinquency," in Task Force Report Juvenile
Delinquency and Youth Crime. The President's
Commission on Law Enforcement and Administration of
Justice. Washington, D.C.: GPO, 1967. 171-187.
Gottfredson, Don, Michael Gottfredson, and James
Garofalo. "Time Served In Prison And Parole Outcomes
Among Parolee Risk Categories," Journal of Criminal
Justice 5 (Spring 1977): 1-12.
Gottfredson, Stephen, and Don Gottfredson. "Screening for
Risk: A Comparison of Methods," Criminal Justice And
Behavior 7 (September 1980): 315-329.
Gottfredson, Don, Interview by author, 22 May 1989,
Rutgers University, Newark New Jersey.
Greenwood, Peter W. Selective Incapacitation. Santa
Monica, California: Rand Corporation, 1982.
Greenwood, Peter W., and Susan Turner. Selective
Incapacitation Revisited: Why the High-Rate Offenders
Are Hard to Predict. Santa Monica, California: Rand
Corporation, 1987.
Hakeem, Michael. "Prediction of Criminality," Federal
Probation 9 (July-September 1945): 31-38.
Hoffman, Peter, and James Beck. "Parole Decision-Making:
A Salient Factor Score," Journal of Criminal Justice
2 (1974): 195-206.
Hoffman, Peter and James Beck. "Salient Factor Score
Validation: A 1972 Release Cohort," Journal of
Criminal Justice 4 (1976): 69-76.
Hoffman, Peter and Barbara Stone-Meierhoefer. "Post Release
Arrest Experiences Of Federal Prisoners: A Six Year
Follow-Up," Journal of Criminal Justice 7 (Fall
1979): 194-216.
Hoffman, Peter and Sheldon Adelberg. "The Salient Factor
Score: A Nontechnical Overview," Federal Probation 44
(March 1980): 44-52.
Hoffman, Peter and Barbara Stone-Meierhoefer. "Reporting
Recidivism Rates: The Criterion and Follow-Up
Issues," Journal of Criminal Justice 8 (1980): 53-60.
Hoffman, Peter and James Beck. "Revalidating the Salient
Factor Score: A Research Note," Journal of Criminal
Justice 8 (1980): 185-188.
Hoffman, Peter. "Screening For Risk: A Revised Salient
Factor Score (SFS 81)," Journal of Criminal Justice
11 (1983): 539-548.
Hoffman, Peter and James Beck. "Recidivism Among Released
Federal Prisoners: Salient Factor Scores and
Five-Year Follow-Up," Criminal Justice and Behavior
12 (December 1985): 501-509.
Houston, Tom. "Why Models Go Wrong," Byte The Small
Systems Journal 10 (November 1985): 151-164.
Klein, Stephen P. and Michael Caggiano. The Prevalence,
Predictability, and Policy Implications of
Recidivism. Santa Monica, California: Rand
Corporation, 1986.
Kletz, Trevor A. "The Risk Equation - What Risks Should We
Run," New Scientist 74 (May 12 1977): 320-322.
Kroll, Jerome and Thomas Mackenzie. "When Psychiatrists
Are Liable: Risk Management and Violent Patients,"
Hospital and Community Psychiatry 34 (January 1983):
29-36.
Larimore, Wallace and Raman Mehra. "The Problem of
Overfitting Data," Byte The Small Systems Journal
10 (Nov 1985): 167-180.
McAnany, Patrick D., Doug Thompson and David Fogel.
Probation and Justice: Reconsideration of Mission.
Cambridge, Massachusetts: Oelgeschlager, Gunn and
Hain, 1984.
Michigan Department of Management and Budget, Office of
Management and Budget. Developing A Model For
Predicting Probation Outcome: A Report To The
Department of Corrections. Lansing, Michigan:
Michigan Department of Management and Budget, 1986.
Photocopied.
Monahan, John. Predicting Violent Behavior: An Assessment
of Clinical Techniques. Beverly Hills: Sage
Publications, 1981.
National Institute of Corrections. Classification In
Probation and Parole: A Model Systems Approach.
Boulder, Colorado: National Institute of Corrections,
1981. Photocopied.
National Institute of Corrections. Classification In
Probation and Parole: A Model Systems Approach,
Supplemental Report: The Client Management
Classification System. Boulder, Colorado: National
Institute of Corrections, 1981. Photocopied.
New York State Division of Probation. Intensive
Supervision Program Evaluators Report #1: Validation
of The Risk Assessment. Albany, New York: New York
State Division of Probation, 1979. Photocopied.
New York State Division of Probation. Intensive
Supervision Program Evaluators Report #2: Validation
of Risk Instrument for Misdemeanant Probationers.
Albany, New York: New York State Division of
Probation, 1979. Photocopied.
New York State Division of Probation. Intensive
Supervision Program Evaluators Report #4: Potential
Expansion Of ISP: Similarities Between ISP and
Incarcerated Offenders. Albany, New York: New York
State Division of Probation, 1979. Photocopied.
New York State Division of Probation. Intensive
Supervision Program Evaluators Report #5: Preliminary
Impact Evaluation Of The ISP. Albany, New York: New
York State Division of Probation, 1979. Photocopied.
Pierson, Arthur. "A Critical Examination Of Social Work
Models Using The Sociology Of Knowledge," Ph.D.
diss., Department of Sociology, Fordham University,
1981.
Petersilia, Joan. The Influence of Criminal Justice
Research. Santa Monica, California: Rand Corporation,
1987.
Petersilia, Joan. Expanding Options for Criminal
Sentencing. Santa Monica, California: Rand
Corporation, 1987.
Petersilia, J, P. Greenwood and M. Lavin. Criminal
Careers Of Habitual Felons. Santa Monica, California:
Rand Corp, 1977.
Petersilia, Joan and Susan Turner. Guideline-Based Justice:
The Implications for Racial Minorities. Santa Monica,
California: Rand Corporation, 1985.
Petersilia, Joan and Susan Turner. Prison versus Probation
in California: Implications for Crime and Offender
Recidivism. Santa Monica, California: Rand
Corporation, 1986.
Petersilia, Joan, S. Turner and J. Peterson. Granting
Felons Probation: Public Risks and Alternatives.
Santa Monica, California: Rand Corp, 1985.
Petersilia, Joan, ed. Research in Corrections.
Washington, D.C.: National Institute of Corrections
and the Robert J. Kutak Foundation, 1988.
Radzinowicz, Sir Leon and Marvin E. Wolfgang, eds. Crime
and Justice. New York: Basic Books, 1977.
Rescher, Nicholas. Risk: A Philosophical Introduction to
the Theory of Risk Evaluation and Measurement.
Washington, D.C.: University Press of America, 1983.
Schwing, Richard and Albers, A. eds. Societal Risk
Assessment: How Safe is Safe Enough. New York-London:
Plenum Press, 1980.
Shah, Saleem. "Dangerousness: A Paradigm for Exploring
Some Issues in Law and Psychology," American
Psychologist 33 (March 1978): 224-238.
Short, James. "The 1984 Presidential Address: The Social
Fabric at Risk: Toward the Social Transformation of
Risk Analysis," American Sociological Review 49
(December 1984): 711-725.
Schmidt, Peter and Ann Witte. Predicting Recidivism Using
Survival Models. New York: Springer-Verlag, 1988.
Silberman, Charles. Criminal Violence, Criminal Justice.
New York: Random House, 1978.
Solomon, Larry and S. Baird. "Classification: Past
Failures, Future Potential," Corrections Today
(May/June 1981): 4-7.
Starr, C., R. Rudman, and C. Whipple. "Philosophical
Basis For Risk Analysis," Annual Review of Energy 1
(1976): 629-662.
Steadman, Henry J. "The Right Not To Be False Positive:
Problems In The Application Of The Dangerous
Standard," Psychiatric Quarterly 52 (Summer 1980):
84-99.
Steadman, Henry J. and Joseph P. Morrissey. "The
Statistical Prediction of Violent Behavior," Law and
Human Behavior 5 (1981): 263-274.
Steadman, Henry J. "Predicting Violent Behavior: A Note On
A Cross-Validation Study," Social Forces 61 (December
1982): 475-483.
Steadman, Henry J. "A Situational Approach To Violence,"
International Journal of Law and Psychiatry 5 (1982):
171-186.
Taxman, Faye. "Needs and Risk Classifications: An Analysis
Of How They Interact In A Probation Setting," Ph.D.
diss., Rutgers University, 1982.
Tittle, Charles. "Social Class and Criminal Behavior: A
Critique of Theoretical Foundation," Social Forces 62
(December 1983): 334-357.
Timko, Francis M. "Felony Risk Assessment: How Good Is Our
Tool," Journal of Probation and Parole 16 (Fall
1984): 30-35.
Toby, Jackson. "An Evaluation of Early Identification and
Intensive Treatment Programs for Predelinquents,"
Social Problems 13 (Fall 1965): 160-175.
Tonry, Michael and Norval Morris, eds. Crime and Justice:
A Review of Research. Vol. 9, Prediction and
Classification: Criminal Justice Decision Making.
Chicago: The University of Chicago Press, 1987.
U.S. Department of Health, Education, and Welfare. The
Violent Offender, by D. Glaser, D. Kenefick, and V.
O'Leary. Washington, D.C.: GPO, 1966.
U.S. Department of Justice, Law Enforcement Assistance
Administration. Promising Strategies in Probation and
Parole, by E. Kim Nelson, Howard Ohmart and Nora
Harlow. Washington, D.C.: U.S. Department of Justice,
1978.
U.S. Department of Justice, Law Enforcement Assistance
Administration. Classification For Parole Decision
Policy, by Donald M. Gottfredson, and others.
Washington, D.C.: U.S. Department of Justice, 1978.
U.S. Comptroller General of the United States. Probation
and Parole Activities Need To Be Better Managed.
[Report to Congress 21 October]. Washington, D.C.:
U.S. Comptroller General of the United States, 1977.
U.S. Department of Justice, National Institute of
Corrections. Screening For Risk: A Comparison of
Methods, by Stephen D. Gottfredson and Don M.
Gottfredson. Washington, D.C.: U.S. Department of
Justice, 1979.
U.S. Department of Justice, National Institute of
Corrections. Vol 2. Probation/Parole Supervision:
Classification Instruments for Criminal Justice
Decisions, by Marvin Bohnstedt [Proj. Dir.] and Saul
Geiser [NCCD Staff Dir.]. Washington, D.C.: U.S.
Department of Justice, June 1979.
U.S. Department of Justice, National Institute of
Corrections. Workload Measures for Probation and
Parole, by Brian Bemus [Proj. Dir.], Gary Arling and
Peter Quigley. Washington, D.C.: U.S. Department of
Justice, May 1983.
U.S. Department of Justice, National Institute of
Corrections. Management Strategies For Probation In
An Era of Limits, by Nora Harlow and E. Kim Nelson.
Washington, D.C.: U.S. Department of Justice, March
1982.
U.S. Department of Justice, National Institute of
Corrections, NIC Technical Assistance Report. Model
Probation/Parole Management Program. Washington,
D.C.: U.S. Department of Justice, Sept. 1981.
U.S. Department of Justice, Office of Justice Programs,
Bureau of Justice Statistics. Sourcebook of Criminal
Justice Statistics. Washington, D.C.: GPO, 1989.
U.S. Defense Systems Management College. Risk Assessment
Techniques. Washington, D.C.: GPO, 1983.
Vera Institute of Justice. Felony Arrests: Their
Prosecution and Disposition in New York City's Courts.
New York: The Vera Institute of Justice, 1977.
Wisconsin Division of Corrections, Case
Classification/Staff Deployment Project, Bureau of
Probation and Parole, Division of Corrections.
Project Report 2: Development of the Wisconsin Risk
Assessment Scale. Madison, Wisconsin: Wisconsin
Division of Corrections, 1976.
Wisconsin Division of Corrections, Case
Classification/Staff Deployment Project, Bureau of
Probation and Parole, Division of Corrections.
Project Report 3: Results of The Agent Time Study -
Western Region. Madison, Wisconsin: Wisconsin
Division of Corrections, 1976.
Wisconsin Division of Corrections, Case
Classification/Staff Deployment Project, Bureau of
Probation and Parole, Division of Corrections.
Project Report 9: Staffing by Workload: 1979-81
Biennial Budget. Madison, Wisconsin: Wisconsin
Division of Corrections, 1981.
Wisconsin Division of Corrections, Case
Classification/Staff Deployment Project, Bureau of
Probation and Parole, Division of Corrections.
Project Report 14: A Two Year Follow-Up Report, by S.
Christopher Baird [Research Director], Richard Heinz
[Planning Analyst], Brian Bemus [Research Analyst].
Madison, Wisconsin: Wisconsin Division of
Corrections, 1979.
Wisconsin Division of Corrections, Case
Classification/Staff Deployment Project, Bureau of
Probation and Parole, Division of Corrections.
Project Report 15: Field Supervisor Time Study.
Madison, Wisconsin: Wisconsin Division of
Corrections, 1979.
Wisconsin Division of Corrections, Case
Classification/Staff Deployment Project, Bureau of
Probation and Parole, Division of Corrections. 1982
Time Study (Report To The National Institute of
Corrections). Madison, Wisconsin: Wisconsin Division
of Corrections, 1983.
Wright, K., T. Clear and P. Dickson. "Universal
Applicability of Probation Risk-Assessment
Instruments: A Critique," Criminology 22 (February
1984): 113-134.
Wright, Kevin. "The Relationship of Risk, Needs, and
Personality Classification Systems and Prison
Adjustment," Criminal Justice and Behavior 15
(December 1988): 454-471.
VITA
Francis M. Timko, the son of Frank and Mary Timko, was born on August 21, 1944, in Yonkers, New York. After graduating from high school in 1962, he entered Adams State College in 1963, where he majored in psychology and minored in sociology. He received his Bachelor of Arts degree in 1967 and was listed in "Who's Who In American Colleges and Universities." He then entered the United States Air Force during the Vietnam War and served as a still photographer.
In 1970 he entered The City College of New York. During that time he also taught elementary school in the Bronx and General Psychology at Westchester Community College. In 1972 he received the Master of Arts degree in Education and was awarded a license as a School Psychologist. Since 1973 he has been employed by the Westchester County Probation Department. During that time, he has been active in the New York State Probation Officers Association (NYSPOA), serving as Vice President, Regional Vice President and Membership Chairman. He is currently Chairman of the Standards and Practices Committee of NYSPOA and manages the Economic Sanctions arm of the department.
In 1975 he wed Christine M. Duro and entered Fordham University as a doctoral student. He then came under the mentorship of Dr. Gerald Shattuck.
ABSTRACT
Francis M. Timko
B.A., Adams State College
M.S., The City College
Risk Assessment In Probation Classification: Current State of the Art, Agenda For The Future
Dissertation directed by Gerald Shattuck, Ph.D.
This is a study of the origins, development and current status of Risk Assessment in Probation. It begins with an examination of the major risk literature and then proceeds to the major instrument variables and the methods used to assemble present day instruments. A review of the major instruments and their performance is then provided, along with the policy implications of their use. The current state of the art is then examined, including what questions remain and what new directions might be taken in the development of more advanced instruments.
This study determines that although the field of Risk Assessment has matured from the naive position that it was nonpolitical and value free to the realization that it is neither, the probation component has not yet advanced to that point. This study demonstrates that the current state of the art in probation systems is not adequate to the task of true risk identification. It also determines that the explosive growth of risk instruments in probation was driven by the managerial needs that gave rise to them, and that the systems remain because they fulfill managerial objectives. The current risk measures are flawed and outdated, but they will not be improved unless any new instruments also satisfy managerial needs.