Vol. 1, No. 3 - Education Next
A Journal of Opinion and Research About Education Policy

Try, Try Again
Forced busing didn't work the first time


All Together Now:
Creating Middle-Class Schools through Public School Choice

by Richard D. Kahlenberg
Brookings Institution, 2000, $29.95; 379 pages.


As reviewed by John F. Witte

Richard Kahlenberg makes the kind of very clean and uncompromising argument typical of believers in forced desegregation, whether based on racial or, in this case, economic status. The same is often true of promoters of school choice, among both private and, as in this case, public schools. The book not only makes a clear argument but also recalls a very rich political and scholarly period in the history of American education. It brings us back to Brown v. Board of Education and forward to Abbott v. Burke, the New Jersey Supreme Court decision holding that an “adequate” education, as required by the state constitution, requires a certain level of per-pupil funding. It links James Coleman’s seminal report on education and poverty to the latest findings on school choice and to some of the economics literature on educational achievement.

The argument of the book can be stated as a series of propositions, with the material in brackets added by this reviewer:

• The American educational system is considerably segregated by economic class [and race].

• Students in schools with large majorities of disadvantaged students [who are disproportionately likely to be racial minorities] do poorly on measures of educational achievement, and their schools are likely to have higher rates of disciplinary problems.

• In what is known as the “peer effect,” poor students [and minority students] do better in schools where the student body is more middle class [white].

• Therefore, the way to improve the educational achievement of poor [nonwhite] students is to desegregate schools by economic class [race].

• Public-school choice can create a certain socioeconomic balance within schools with minimal conflict.

My use of brackets attempts to emphasize that this argument is something of a reworking of the racial desegregation arguments that prevailed from the 1960s through the 1980s. This is not to imply that the original argument can simply be readjusted in light of the failure of the desegregation movement. This is to emphasize that one cannot simply “find and replace” race with economic class. Sadly, they are not the same.

In Kahlenberg’s opinion, the primary means for achieving economic integration while avoiding the controversy of racial integration is controlled choice. Controlled choice requires all parents to choose schools, often within “subdistricts” that are defined by their racial and economic mix of students. The district then assigns the children, taking into account a range of factors: parental choice, economic integration, racial integration, distance to school, and school enrollments. Ideally, transportation is provided to all students.

The problem with districts in which a majority of the students are low income is the need to add middle-class students. Kahlenberg claims these districts account for only 14 percent of all districts, but this understates the problem: these districts represent considerably more than 14 percent of the nation’s students. In addition, Kahlenberg’s “ideal solution” is district consolidation, which he implies is a viable alternative given our 70-year history of consolidation. However, this would obviously require consolidating inner-city and suburban districts. There is no political comparison between that type of coercive consolidation and the consolidation of small rural districts, and even that was hard-fought in many states. I cannot imagine a legislature in this country authorizing such consolidation unless the districts voluntarily came forward.

A personal note may emphasize the depth of the problem of integrating inner-city and suburban districts. In my first foray into education policy in the mid-1980s, I led a study of the Milwaukee metropolitan public school districts. The commission for which I served as executive director of research was established as an attempt to forestall a lawsuit by the Milwaukee School Board intended to force metropolitan racial integration. The commission failed; the lawsuit went forward and was settled out of court by extending an existing voluntary city-suburbs exchange program.

During the deliberations concerning the commission’s report, I became an advocate of a proposal to divide the Milwaukee School District (100,000 students) into between eight and ten pie-shaped districts extending from the Milwaukee inner city to the suburbs. The idea was to integrate based on socioeconomic status. In part the rationale was based on a “tipping theory” hypothesis for schools that Kahlenberg supports. I am still convinced that the theory is correct: when the share of a school’s students who are disadvantaged reaches 60 percent or more, the focus, morale, and educational environment shift dramatically. Teachers have less time to teach, are forced to teach many remedial courses, and generally seek to leave; middle-class parents of all races do the same.

What is telling for Kahlenberg’s recommendations is the reception that greeted these proposals. I gave three speeches in the Milwaukee suburbs promoting the plan. The response was not merely frigid; it was openly hostile. No arguments could overcome the hostility, though I explained that not all kids would be bused to the inner city; that magnet schools would be used (the mid-1980s version of choice); and that extra resources could be provided. After an absolutely dead-on-arrival reception in the legislature, the plan was dropped and has never reemerged in any fashion.

I am not as persuaded as Kahlenberg is by the peer-effect data showing, he claims, that economic integration supports achievement, but I generally agree with his goals and his solution. The problem, unfortunately, is that it is not practical in present-day America. Because of our geographic segregation by class, students will have to be bused and some, if not many, will have to be coercively bused. Busing has never been popular, and there is no reason to believe it will be more agreeable when the focus shifts from race to economic class.

The case studies of economic desegregation in the book’s last chapter are dominated by a city in my state, La Crosse, Wisconsin. I am familiar with that case and a parallel case Kahlenberg mentions in Wausau, Wisconsin. The problem with holding up La Crosse as a model for the larger problems of inner-city education in America will be obvious to most readers. La Crosse has a large middle-class majority and to most observers would be mistaken for a suburban school district. The minority population is 12 percent Asian (mostly Hmong) and 3 percent black. Wausau, in the northern, rural area of the state, actually has an even larger Hmong population in the schools, 24 percent. Even with these small minority populations, the conflict generated in these cities over efforts to move students to achieve economic balance in the schools was considerable. In Wausau, as Kahlenberg dutifully notes, the attempt to “integrate” led to the defeat of the school board and the replacement of the district superintendent. The La Crosse effort was more successful, but not without strife.

My point, and I hope it is debated rigorously, is that substituting economic assignment for racial assignment will not fool anyone. Magnet schools failed to save racial desegregation; it’s unclear why, in light of this failure, wider choice options would promote economic desegregation. As the La Crosse plan attests, and as controlled-choice options based on race (such as that in use in Boston) confirm, unless neighborhoods are economically integrated, the only way to facilitate desegregation is through some coercive element, such as quotas. La Crosse is not a large city, yet school population “targets” were still needed. There is no evidence that cities in America are willing to accept forced busing based on either race or economic status.

I share Kahlenberg’s goals, and at one time I favored his solution, but I doubt that his proposal has any chance of working in the near future in America.

-John F. Witte is director of the Robert La Follette School of Public Affairs at the University of Wisconsin-Madison.


As reviewed by John E. Coons

A half-century of court orders has dismantled the legal regimes of racial segregation without achieving much integration in the schools. It was right for the justices to declare the principle of Brown v. Board of Education; perhaps it was right to attempt heroic remedies that would challenge sheer physical separation. However, the law has failed and today is in retreat toward its de jure redoubt.

Richard Kahlenberg attributes much of the law’s failure to its very focus on race. The real problem, he claims, is consistent segregation by family income or wealth. In America a child’s classmates tend to come exclusively from his own social class. Focusing on the incendiary racial hook has distracted us from this more fundamental pathology.

Kahlenberg persuades us that children from the bottom third of society’s economic ladder simply do not prosper academically when they are taught in homogeneous platoons conscripted from the neighborhood. They can learn, however, when some stroke of luck enrolls them in a mostly middle-class school. Afforded the influence of bourgeois peers, they transcend the limits of the urban classroom–and they do so without systematic injury to their middle-class schoolmates; the poor win, and the rich don’t lose. This premise whets the author’s legal appetite for a “right to attend middle-class schools.”

One could question such a right while affirming this general rebuke to the system. The social balkanization created by government schools renders them both inefficient and thoroughly undemocratic. In this country the middle class simply buys the schooling it prefers, shopping for it in the clumsy but effective real-estate market that sells state-run education. But while the middle class maneuvers, the rest of America is herded. Their schools are labeled “public,” but this is a name hijacked from more democratic state enterprises to which all have access. Unlike the street, the library, the public park, or the museum, the school maintained by the state excludes the family that cannot afford to be its neighbor. The result: Beverly Hills and Grosse Pointe are private in all but name. The liberal’s calling is not to reform the public school, but at long last to create it.

Kahlenberg styles his crusade “economic integration,” meaning that there should be a substantial presence of non-rich children in schools that carefully maintain a middle-class majority. He ponders a possible constitutional claim to such an environment, but little in federal doctrine suggests it. Of course, judges (state as well as federal) can have exotic insights, and lawyers will suggest them. Kahlenberg will not object: “Court decisions are less democratic, but . . . the judiciary should promote certain important principles.”

Happily for this reader, the bulk of the book centers not on the courts but on politics and democratic values, at least as the author understands them. Prudent programs of economic integration could, he argues, fetch the allegiance of a middle class that has nothing to lose and a conscience to quiet. The ideal design of any such program, however, proves enigmatic and elusive, most evidently in Kahlenberg’s nervous probes toward parental choice. Repeatedly the book seems poised to consider choice for the poor as a distinct and substantive objective. In the end, however, it provides only scraps of the various theories that might justify choice as an independent value.

All Together Now cannot make these foundational arguments because it views choice merely as one instrument of the grand objective, economic mixing. Hence the book’s mantra is not choice, but that opaque locution, “controlled choice.” Its reports of successful integration involve marginal and attenuated forms of parental freedom. The flagship example is La Crosse, Wisconsin, where the primary means of integration was the redrawing of attendance zones. Kahlenberg is comfortable with coercion of this sort even in his endorsements of charter schools; these should be made available, he says, where choice will contribute to the proper mix of rich and poor.

This helps to explain the book’s neglect or superficial treatment of a generation of proposals that honor both integration and choice as strong yet distinct values to be pursued simultaneously. Chosen integration is substantially more humane and stable than any potpourri achieved by command. The two objectives can be natural allies. Indeed, since 1969 scholarly models of family choice have consistently stressed both values by including rules ensuring that state and participating private schools alike will share in the integration of the social classes. In those cases where the legislative models are designed to make children from all economic levels eligible for vouchers, the means of integration have varied from full and partial admissions lotteries to modest set-asides of a portion (often 20 percent) of a school’s new admissions for low-income applicants. Such admissions policies would be coupled with a rule that either forbids tuition beyond the amount of the state subsidy or requires that any charges be means-tested. The grant or scholarship must also be large enough to stimulate the formation of new private schools without the need for tuition supplements; otherwise it is useless to the poor once the existing private schools are filled. Finally, the state must ensure appropriate transportation and information for low-income families.

The three legislated systems of choice now actually operating (in Milwaukee, Cleveland, and Florida) all deploy these “controls” or their equivalents, even though children from low-income families are their targeted and primary beneficiaries. The programs there are achieving substantial economic integration in private schools and in charter schools that have been created by the states to face the new competition. There is a distinct possibility that, under such programs, even government schools will become authentically public.

In light of this distinctly liberal history and practical experience, why does Kahlenberg repeatedly engage in the unscholarly and misleading bashing of vouchers? For example, “Vouchers have failed because they generally produce greater socioeconomic concentration not less; divert funds to the wealthy; and will further divide Americans by race and religion.” What can this possibly mean? In our entire history, no voucher system has existed other than the three now operating. Is the Milwaukee system, in Kahlenberg’s terms, “a tool by the right wing to undercut public education,” one that diverts money to the wealthy? Quite the contrary. In fact, Wisconsin has provided the poor their first experience of an integrated and truly public education. Moreover, it has shown the families there the fundamental respect that also can justify their trust–and that of their children–in a democratic order.

-John E. Coons is a professor of law emeritus at the University of California at Berkeley.

Bowling Together
Private schools, public ends

Illustration by Travis Foster

As the civic participation of young people continues to plummet, it becomes ever more important that we learn how schools can teach students to be active citizens. Indeed, “producing better citizens” was the original justification for creating America’s public schools. In the 1800s, Horace Mann and others successfully argued that the public schools could assimilate immigrants into the norms of American civic life. Today essentially the same objective remains. In a 1996 Phi Delta Kappa/Gallup Poll, 86 percent of Americans reported that they feel “preparing students to be responsible citizens” is a “very important” purpose of the nation’s schools; just 76 percent considered it very important that schools “help people become economically self-sufficient.”

Today a broad consensus exists on three objectives characteristic of an education that develops good citizens. The first objective is to equip the nation’s future voters with the capacity to be engaged in the political process. This is especially salient against a backdrop of declining rates of political activity, most notably among young people. The second objective is to have citizens not only participating in democratic institutions, but also doing so knowledgeably. Students should understand the nation’s history and political system. The third objective stems from political philosopher Amy Gutmann’s idea that the defining characteristic of a democratic education is that it imparts the “ability to deliberate” in a context of “mutual respect among persons.” The general public would seem to agree. According to a 1999 Phi Delta Kappa/Gallup Poll, 93 percent of Americans believe that the schools should teach “acceptance of people of different races and ethnic backgrounds.” Seventy-one percent stated that the schools should also teach “acceptance of people who hold unpopular or controversial political or social views.” In other words, an education that prepares students for democracy teaches them to respect the opinions of others and promotes social and political tolerance.

Different types of schools pursue various strategies for meeting these objectives, some with more success than others. Here I ask the simple question: How well do different types of schools promote civic education? For instance, do private schools foster social divisiveness, as their critics often claim? Does attending a religious private school rather than a secular one have different civic consequences?

Building Social Capital

The body of research on the civic effects of public versus private schools is small but growing. Researchers have shown that Catholic schools are more racially integrated than public schools and that voucher programs do not have an adverse effect on integration. Another study found that Hispanic adults who were educated in private schools are more likely to participate in politics than those who attended public schools. Evidence from the National Education Longitudinal Study further demonstrates that students in private schools are more likely to participate in community service than are their peers in public schools. Similarly, private school administrators more often rate their schools as “outstanding in promoting citizenship” than do their public school colleagues. Research from the voucher programs in Dayton, Ohio, and Washington, D.C., found that parents of private school students were more civically engaged than parents of public school students.

Strong evidence has accumulated that nonpublic, particularly Catholic, schools are a private means to the very public end of facilitating civic engagement.

In 1966, before school vouchers were on the nation’s political agenda, Andrew Greeley and Peter Rossi used extensive survey data collected from American Catholics to argue that Catholic schools do not depress civic engagement or promote intolerance. The 1998 National Assessment of Educational Progress (NAEP) Civics Report Card for the Nation reports that students in private schools (both Catholic and non-Catholic) have higher average scores on the NAEP civics test than do their peers in public schools. However, this is without adjusting the data for background characteristics that may affect students’ level of civic knowledge, such as their parents’ educational level. James Coleman and Thomas Hoffer did control for family background and found that students in private schools, both Catholic and non-Catholic, scored higher on the High School and Beyond civics test than did public school students, although the results were not statistically significant. Taken together, these results give no reason to suspect that private schools do a worse job of providing a civic education than assigned public schools and some reason to think they do a better job. Yet reasonable doubt remains.

More broadly, Harvard University professor Robert Putnam’s research on civic participation provides reason to think that private schools should be better able to deliver civic education than are public schools. Since the publication of Putnam’s Making Democracy Work (1993) and the follow-up, Bowling Alone (2000), it has become increasingly common for political scientists to discuss political participation as driven by social capital. As Putnam defines it, “social capital refers to features of social organization such as networks, norms, and social trust that facilitate coordination and cooperation for mutual benefit.” Before Putnam employed the concept to explain differences in governmental performance between northern and southern Italy, Coleman and his colleagues developed it to theorize why students in Catholic schools excel academically relative to their public school peers. Perhaps the same characteristics that cause Catholic schools to excel academically enable them to produce better citizens.

Comparing public and private school students is difficult, given that most Americans attend a public school in the elementary and secondary grades. In 1995, 91 percent of American secondary-school students were enrolled in public schools. It is even more difficult to make comparisons within the private sector, which comprises Catholic schools, religious schools sponsored by other faiths, and secular schools. Even within the public sector, there are schools to which students are assigned based on geography and schools they choose to attend (magnet and charter schools, for example). Here I use five types of schools: assigned public, magnet public, Catholic, secular private, and other religious. This last category combines schools from a broad range of faiths, including Christian fundamentalists, Quakers, and Muslims.

I used survey data from the 1996 National Household Education Survey, a large nationally representative survey of both parents and their children. This analysis draws on questions asked of students (and their parents) in grades 9 through 12, for a total sample size of 4,213. Even with a sample this large, the number of students attending other religious and secular private schools is still quite small (80 and 102 cases, respectively).

Throughout the analysis, except where noted, I adjusted the data to account for a variety of factors that might influence students’ civic education and knowledge other than the kind of school they attend. At the individual level, I controlled for the usual demographic factors: students’ age, gender, race, ethnicity, whether they spoke English, and whether they lived in the South. I also controlled for their academic performance, expectations of going to college, their expressed ability to take political action, their interest in the news, and the number of hours they spent at a job. Since adolescents’ family lives exert a strong influence on their values, I controlled for a variety of family-based factors. These included parents’ educational levels, family income, church attendance, whether students grew up in a two-parent household, parents’ volunteer work, parents’ political participation, parents’ civic skills, and parents’ political tolerance. I also considered factors such as the racial composition of a school, school size, whether the school arranges volunteer service, whether students’ opinions matter, whether courses with political content are offered (so-called civics classes), and whether the school has a student government. Owing to the richness of the control variables, there can be reasonable confidence that these results capture the effect of the school a student attends.

Civic Engagement

Voter turnout is the most commonly used measure of civic engagement, but it is inadequate here because very few secondary-school students are old enough to vote. Instead, three other measures can be used. One, voluntary community service, is a measure of what students are doing in the present. The second, acquiring “civic skills” in the classroom, is a measure of what students have learned to prepare them to be civically engaged in the future. The third is a measure of whether students feel confident that they could actually use their civic skills outside of the classroom.

Community Service. Past studies have found civic activity while young to be a “pathway to participation” in adulthood. Students were asked whether they engaged in “any community service activity or volunteer work at your school or in your community.” Without any adjustments to the data, the results show that, statistically, there is no difference between assigned public schools and magnet public schools or secular private schools. However, students in both Catholic and other religious schools are more likely to engage in community service than are students in assigned public schools. Forty-seven percent of assigned public school students perform community service, compared with 64 percent of students in other religious schools and 71 percent of students in Catholic schools. This is not surprising, given the religious character of these schools. Other research has shown that religious people are more inclined to volunteerism.

A reasonable objection to these results is that they are potentially misleading because many religious schools require their students to perform “voluntary” service. Seventy percent of students in Catholic schools report that their schools require community service in the 9th through 12th grades. This compares with 16 percent of students in assigned public schools, 22 percent in magnet public schools, 28 percent in other religious schools, and 38 percent in secular private schools. It is unclear what these differences mean for students’ long-term commitment to community service. On the one hand, students who are compelled to perform service may only grow resentful. On the other hand, requiring service in the community would presumably introduce to volunteer work students who would not otherwise have had that experience. They may find that they enjoy it and wish to continue even after they have fulfilled their school’s requirement. Regardless, after excluding from the analysis those students whose schools required them to perform community service, the results were very similar, with the notable exception of religious/non-Catholic schools. While more of their students participate in community service than do students in assigned public schools, the difference ceases to be statistically significant.

These results, however, still do not account for differences in the backgrounds and characteristics of students who attend these types of schools that might in turn affect whether they engage in community service. After again excluding students whose schools require community service, I took into account the various factors listed above. The results still show that students in Catholic schools are more likely to perform community service than those in assigned public schools (see Figure 1). Forty-eight percent of public school students participate in community service, compared with 59 percent of Catholic-school students. That is, Catholic schools contribute about as much to the likelihood of students’ providing community service as does having a parent or guardian in the home who participates in community service (which also increases the share of students participating in volunteer activity by about 11 percentage points). Catholic schools contribute almost twice as much to a student volunteering as does raising a parent’s educational level from a high-school diploma to a college degree. There is no statistically significant difference between students in assigned public schools and those in secular private, magnet public, or religious/non-Catholic schools.

This suggests an answer to one of the essential questions in the debate over voucher programs: Will sending students to private schools harm their civic education? The answer appears to be no; in fact, students in Catholic schools are more likely to engage in voluntary service than are students in assigned public schools, and there were no significant differences between students in assigned public schools and those in the three other types of schools. These findings are consistent with the research of Anthony Bryk, Valerie Lee, and Peter Holland, who explored the relationship between Catholic schools and their communities. They report that in all of the Catholic schools they included in their study, community service work was available as an elective course. Furthermore, an ethic of service was frequently found among both the staff and students they interviewed. They write:

These service programs signify Catholic schools’ commitment to a just social community. One board member of a field-site school remarked, “A school should not call itself Catholic if it doesn’t have a volunteer service program.” The director of the program at St. Edward’s [one of their case-study schools] commented: “I’m a believer in service. It’s important for students to realize that the things they do make a difference. We can heal people and make their lives better. We can raise the awareness of others. Physical contact is vital for Christianity. Some of our students are sheltered from poverty and from people of different races. This program is important because it makes them more aware.”

Civic Skills. People differ in their capacity to perform the mundane tasks that constitute virtually all political activity: skills like giving speeches, holding meetings, and writing letters. Those who lack these skills are extremely unlikely to participate in politics. While researchers have focused on how adults learn civic skills on the job or through participation in voluntary organizations, it is in school that people are most likely to learn them when young.

An index of civic skills was created to test for systematic differences across the five types of schools. The household survey asked students, During this school year, have you done any of the following things in any class at (your current) school:

• Written a letter to someone you did not know?

• Given a speech or an oral report?

• Taken part in a debate or discussion in which you had to persuade others about your point of view?

Once again, the result for students who attend a Catholic school is statistically significant at the .05 level. Given the litany of control variables included in this analysis, this variable has quite a hurdle to clear to reach statistical significance. Compared with a student who attends an assigned public school, a Catholic-school student learns an average of .13 more civic skills. Not a dramatic difference, but completely consistent with the other findings reported here, it bolsters the evidence that Catholic schools deliver a high-quality civic education.

Civic Confidence. Learning civic skills is one thing; being able to use them is another. The household survey also asked respondents whether they feel that they could use two of the civic skills learned inside the classroom elsewhere. The questions ask:

• Suppose you wanted to write a letter to someone in the government about something that concerned you. Do you feel that you could write a letter that clearly gives your opinion?

• Imagine you went to a community meeting and people were making comments and statements. Do you think that you could make a comment or a statement at a public meeting?

Students in secular private, Catholic, and other religious schools are more likely than students in assigned public schools to have confidence in their ability to exercise civic skills if called upon to do so. Of these three, the religious/non-Catholic-school students display the greatest degree of civic confidence. Civic confidence is the only component of civic education included in this analysis for which each type of private school displays a positive and statistically significant effect. The bottom line is that public schools can take a lesson from private schools about how to prepare students for civic life, not only in providing skills, but also in providing the confidence to use those skills.

Political Knowledge

The second objective of a civic education is to teach future voters specific, factual information about American politics. Indeed, of the three objectives I have listed, this one is most clearly the province of the schools. While there may be disagreement over whether schools should require community service, presumably everyone agrees that schools should require the acquisition of knowledge. Without understanding the particulars of American politics, people are unable to engage fully in the political process. In fact, political scientist John Zaller argues persuasively that factual knowledge about politics is the best measure of political engagement.

Perhaps the same characteristics that cause Catholic schools to excel academically enable them to produce better citizens.

The National Household Education Survey includes a series of factual questions about American politics. Each respondent was asked five of the following ten questions. To avoid contamination effects, whichever five questions a student answered, her parent answered the other five:

• What job or political office is now [in 1996] held by Al Gore?

• Whose responsibility is it to determine if a law is constitutional…the President, the Congress, or the Supreme Court?

• Which party now has the most members in the House of Representatives in Washington?

• How much of a majority is needed for the U.S. Senate and House to override a presidential veto?

• Which of the two major parties is more conservative at the national level?

• What job or political office is now held by Newt Gingrich?

• Whose responsibility is it to nominate judges to the federal courts…the President, the Congress, or the Supreme Court?

• Which party now has the most members in the U.S. Senate?

• What are the first ten amendments of the U.S. Constitution called?

• Which of the two major parties is in favor of the larger defense budget?
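The split-half design described above, in which each student answers five of the ten items and the parent answers the complementary five, could be implemented along these lines (a hypothetical sketch, not the survey's actual procedure):

```python
import random

QUESTIONS = list(range(10))  # indices of the ten knowledge items above

def split_items(rng):
    """Give the student five randomly chosen items and the parent the
    other five, so the two respondents never see the same question
    (the 'contamination' the survey design guards against)."""
    student_items = sorted(rng.sample(QUESTIONS, 5))
    parent_items = [q for q in QUESTIONS if q not in student_items]
    return student_items, parent_items

rng = random.Random(1996)
student_qs, parent_qs = split_items(rng)
print(student_qs, parent_qs)  # two disjoint halves covering all ten items
```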

Before making any adjustments, the average scores of students in assigned public schools are lower than those in Catholic, religious/non-Catholic, and secular private schools. Students in assigned public schools got an average of 2.4 questions out of five correct, while students in Catholic, religious/non-Catholic, and secular private schools scored an average of 3.2, 3.4, and 3.2 respectively.

Once the statistical adjustments are made for all the factors that can influence students’ political knowledge except the type of school they attend, only students in Catholic schools still perform better than do students in assigned public schools. Many of the potentially confounding demographic factors are themselves statistically and substantively significant. Older students score better on the index, as do males, whites, and non-Hispanics. Having higher grades, expecting to attend college, expressing greater political interest, and spending more time watching or reading the news are all positively related to political knowledge. At the family level, both parents’ educational levels and political knowledge are positive factors predicting students’ greater political knowledge.
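What "statistical adjustment" accomplishes here can be illustrated with a toy example. The numbers below are invented, and the sketch stratifies on a single confounder (whether a parent attended college) rather than running a full regression, but the logic, comparing like with like before averaging, is the same.

```python
# Minimal illustration (invented numbers) of why covariate adjustment can
# shrink a raw gap: compare students only within strata of a confounder,
# then average the within-stratum gaps.

# (school, parent attended college, knowledge score out of 5) -- all invented
data = [
    ("catholic", True, 4), ("catholic", True, 4), ("catholic", False, 2),
    ("public", True, 4), ("public", False, 2), ("public", False, 2),
]

def mean(xs):
    return sum(xs) / len(xs)

def raw_gap(rows):
    cath = [s for g, _, s in rows if g == "catholic"]
    pub = [s for g, _, s in rows if g == "public"]
    return mean(cath) - mean(pub)

def adjusted_gap(rows):
    gaps = []
    for stratum in (True, False):
        sub = [r for r in rows if r[1] == stratum]
        gaps.append(raw_gap(sub))
    return mean(gaps)

print(raw_gap(data))       # positive: Catholic students score higher overall
print(adjusted_gap(data))  # zero: in these invented data the whole raw gap
                           # reflects parental education, not the school
```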

Political Tolerance

While all three objectives are equally important components of a civic education, it is the third, respect for opinions different from your own, or political tolerance, that may be most relevant to the debate over the civic consequences of attending private schools. This is often expressed as a concern that private (particularly religious) schools exacerbate social tensions. In the words of the late union leader Al Shanker, widespread voucher programs that send students to private schools “would foster divisions in our society; they would be like setting a time bomb.” Amy Gutmann stresses the need for students in religious schools to be taught democratic norms under direction from the state, presumably fearing that these schools cannot be trusted to provide instruction in a “common democratic character” on their own. Certainly, the concern for teaching a respect for universal civil liberties is well placed, as democracy is defined as much by respect for minority rights as by simple majority rule.

Forty-eight percent of public school students participate in community service, compared with 59 percent of Catholic-school students.

The question of adolescents’ attitudes and how they are related to enrollment in different types of schools has been virtually unexplored. The research literature suggests two alternative hypotheses, drawn from different perspectives on what fosters political tolerance. The first hypothesis is derived from distinguishing between schools as public versus private institutions. By this reasoning, private schools may be thought to foster an exclusivity among their students that translates into a disregard for minority opinions. In particular, religious schools may foster civic divisiveness, a fear reinforced by survey data that show religiosity to be negatively related to political tolerance. The second hypothesis follows from studies showing the positive academic effects of attending a private school. One widely noted study reports that education increases tolerance by enhancing students’ general cognitive proficiency. It would follow that tolerance is greatest in those schools where students display the strongest cognitive performance, generally private schools.

The household survey contains two questions gauging political tolerance:

• If a person wanted to make a speech in your community against churches and religion, should he or she be allowed to speak?

• Suppose a book that most people disapproved of was written, for example, saying that it was all right to take illegal drugs. Should a book like that be kept out of a public library?

The question about churches and religion particularly challenges the political tolerance of students in religious schools, since it confronts them with an opinion they will almost certainly reject. Students in Catholic and secular private schools have higher tolerance scores than students in assigned public schools, averaging 1.6 and 1.8 tolerant responses respectively, compared with 1.4 tolerant responses among assigned public school students. Students in other religious schools have an average score (1.2 tolerant responses) lower than that of public school students. Students in magnet public schools have slightly higher scores than assigned public school students, although the difference does not approach statistical significance.

After again making the statistical adjustments listed above, students in secular private schools scored substantially higher on the political tolerance index than students in assigned public schools, while students in religious/non-Catholic schools scored substantially lower (see Figure 2). Catholic-school students still score higher than assigned public students, though the difference is only a third as large as that between secular private schools and assigned public schools. There may be some credence to the concern expressed by critics of private education that it has the potential to foster political intolerance. While students in Catholic schools (the most common form of private education) and secular private schools are more politically tolerant than students in assigned public schools, the 2 percent of America’s students in other religious schools, an amalgam of schools sponsored by many different faiths, score lower on the political tolerance index.

One should not necessarily conclude that other religious schools breed intolerance. It may be that students’ attitudes are shaped more by family background than by the schools they attend. For all its virtues, the National Household Education Survey has poor measures of parents’ religious involvement. Because family tolerance is poorly measured and the sample size, for this group of students, is small, we should be cautious in drawing firm conclusions.

Evaluation of responses to survey questions about civil liberties must also take into account the content of the question. Because one of the questions on the tolerance index deals specifically with the rights of a speaker who is opposed to religion, students in religious schools might be expected to be especially wary of granting full freedom of expression. This is not to diminish respect for religious differences as an important component of political tolerance, but only to suggest that other questions might provide more of a “hard case” for students in secular schools. And lest it be thought that there is something inherent in religious education that breeds intolerance, remember that Catholic-school students show higher levels of tolerance than students in assigned public schools.

Make Democracy Work

The results reported here are consistent with four similar studies: the 1973 High School Seniors Cohort Study, the National Educational Longitudinal Study, the Latino National Political Survey, and data collected from participants in school-choice programs in Washington, D.C., and Dayton, Ohio. Few findings in social science can be replicated in five independent sources of data (six, if you count the Washington and Dayton surveys separately). In short, it seems that strong evidence has accumulated that private schools, particularly Catholic schools, are a private means to the very public end of facilitating civic engagement.

How “public” is a school in an exclusive suburb with high housing costs, especially when compared with a Catholic school that costs around $2,500 a year?

This conclusion is admittedly provocative, if only because of the connotations the words “public” and “private” carry in contemporary discourse. In the United States, the word “public” is supposed to refer to the source of a school’s funding and not to the population served by a school. Nevertheless, critics of private education often implicitly extend the limited definition of “public” to mean the population served by the school. Critics speak of high-priced preparatory schools as though they are the only, or at least the most common, type of private education in the United States. Often all private schools are grouped together and caricatured as exclusive and insular. While it is true that privately funded schools have the prerogative to apply virtually any criteria they want for admissions, in practice Catholic schools, at least, are very inclusive.

Catholic high schools are not highly selective in their admissions. The typical school reports accepting 88 percent of the students who apply, and only about a third of the schools maintain a waiting list. Anthony Bryk and his colleagues also report that “religious affiliation is not a routine consideration” in admissions to Catholic schools. Even though Catholic schools charge tuition, 87 percent offer financial aid. By contrast, public schools almost exclusively enroll students who live in the geographic area surrounding the school. How “public” is a school in an exclusive suburb with high housing costs, especially when compared with a Catholic school that offers financial aid to assist with its tuition (which, in turn, is usually only around $2,500 a year)?

Critics also tend to define public schools as the only institutions providing an education that promotes publicly spirited citizens. By this definition, the evidence presented here suggests that Catholic and private secular schools are really more “public” than schools funded by the state. The claim that private organizations can contribute to the quality of a community’s public life is hardly original. Echoing Alexis de Tocqueville’s Democracy in America, Putnam found that measures of civic associational life, everything from choral societies to soccer clubs, are the primary explanation for effective governance across Italy’s regions. Voluntary associations like these produce social capital, and social capital “makes democracy work.” In fact, our public schools seem to have much to learn from private, especially Catholic, schools about what makes democratic education work.

-David E. Campbell is a Ph.D. candidate in government and a research associate of the Program on Education Policy and Governance at Harvard University.

The post Bowling Together appeared first on Education Next.

Sciencephobia: Why education rejects randomized experiments (Education Next, 19 Jul 2006; https://www.educationnext.org/sciencephobia/)


The American education system, uniquely decentralized among industrial nations, has been continually roiled by tides of local experimentation, especially during the past 20 years. The spread of whole-school reform models such as Success for All; the imposition of standards and high-stakes tests; the lowering of class sizes and slicing of schools into smaller, independent academies; the explosion of charter schools and push for school vouchers–all these reforms signal a vibrantly democratic school system.

Experimentation, however, means more than simply changing the way we do things. It also means systematically evaluating these alternatives. To scholars, experimentation further suggests: 1) conducting studies in laboratories where external factors can be controlled in order to relate cause more directly to effect; or 2) randomly choosing which schools, classrooms, or students will be exposed to a reform and which will be exposed to the alternative with which the reform is to be compared. When well executed, random assignment serves to rule out the possibility that any post-reform differences observed between the treatment and control groups are actually due to pre-existing differences between the two groups rather than to the effects of the reform. The superiority of random assignment for drawing conclusions about cause and effect in nonlaboratory settings is routinely recognized in both the philosophy of science literature and in methods texts in health, public health, agriculture, statistics, microeconomics, psychology, and those parts of political science and sociology that deal with improving the assessment of public opinion.
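The claim that random assignment rules out pre-existing differences can be demonstrated in a short simulation. Everything below is a hypothetical sketch: 40 schools with invented baseline achievement levels are repeatedly split at random into treatment and control groups, and the average estimated gap converges on the built-in treatment effect because baseline differences cancel out in expectation.

```python
import random

# Sketch (not from the article): simulate randomly assigning schools to a
# reform, to show that randomization balances pre-existing differences on
# average, so post-reform gaps can be read as treatment effects.

rng = random.Random(42)
TRUE_EFFECT = 5.0  # invented effect of the reform on a test-score scale

gaps = []
for _ in range(2000):
    # Each school has a pre-existing ("baseline") achievement level.
    baselines = [rng.gauss(50, 10) for _ in range(40)]
    rng.shuffle(baselines)                       # random assignment
    treated, control = baselines[:20], baselines[20:]
    outcome_gap = (sum(treated) / 20 + TRUE_EFFECT) - sum(control) / 20
    gaps.append(outcome_gap)

avg_gap = sum(gaps) / len(gaps)
# Across many replications the average estimated gap approaches TRUE_EFFECT.
print(avg_gap)
```

Any single replication can be off by several points (schools vary), which is why experiments need adequate samples; but the estimator is unbiased, which no amount of after-the-fact statistical adjustment can guarantee.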

Since most education research must take place in actual school settings, random assignment would seem to be a highly appropriate research tool. However, though the American education system prizes experimentation in the sense of trying many new things, it does not promote experimentation in the sense of using random assignment to assess how effective these new things are. One review showed that not even 1 percent of dissertations in education or of the studies archived in ERIC Abstracts involved randomized experiments. A casual review of back issues of the premier journals in the field, such as the American Educational Research Journal or Educational Evaluation and Policy Analysis, tells a similar story. Responding to my query, a nationally recognized colleague who designs and evaluates curricula replied that in her area randomized experiments are extremely rare, adding, “You can’t get districts to randomize or partially adopt after a short pilot phase because all parents would be outraged.”

Very few of the major reform proposals currently on the national agenda have been subjected to experimental scrutiny. I know of no randomized evaluations of standards setting. The “effective schools” literature includes no experiments in which the supposedly effective school practices were randomly used in some schools and withheld from others. Recent studies of whole-school-reform programs and school management have included only two randomized experiments, both on James Comer’s School Development Program, which means that the effects of Catholic schools, Henry Levin’s Accelerated Schools program, or Total Quality Management have never been investigated using experimental techniques. School vouchers are a partial exception to the rule; attempts have been made to evaluate one publicly funded and three privately funded programs using randomized experiments. Charter schools, however, have yet to be subjected to this method. On smaller class sizes, I know of six experiments, the most recent and best known being the Tennessee class-size study. On smaller schools I know of only one randomized experiment, currently under way. In fact, most of what we know about education reforms currently depends on research methods that fall short of the technical standard used in other fields.

Equally striking is that, of the few randomized experiments cited above, nearly all were conducted by scholars whose training is outside the field of education. Educators Jeremy Finn and Charles Achilles began the best-known class-size experiment, but statisticians Frederick Mosteller, Richard Light, and Jason Sachs popularized the study, and economist Alan Krueger has conducted an important secondary analysis. Political scientist John Witte conducted the Milwaukee voucher study, while political scientists Jay Greene and his colleagues and economist Cecelia Rouse reanalyzed the data. Sociologists and psychologists conducted the Comer studies. Economists James Kemple and JoAnn Leah Rock are running the ongoing experiment on academies within high schools. Political scientist William Howell and his colleagues did the work on school-choice programs in Washington, D.C.; New York City; and Dayton, Ohio. Scholars with appointments in schools of education, where we might expect the strongest evaluations of school reform to be performed, evidence a 20-year near-drought when it comes to randomized experiments.

Such distaste for experiments contrasts sharply with the practices of scholars who do school-based empirical work but don’t operate out of a school of education. Foremost among these are scholars who research ways to improve the mental health of students or to prevent violence or the use of tobacco, drugs, and alcohol. These researchers usually have disciplinary backgrounds in psychology or public health, and they routinely assign schools or classrooms to treatments randomly. Randomized experiments are commonplace in some areas of contemporary research on primary and secondary schools. They’re just not being done by researchers who were trained in education schools.

Dealing with Complexity

In schools of education, the intellectual culture of evaluation actively rejects random assignment in favor of alternatives that the larger research community has judged to be technically inferior. Education researchers believe in a set of mutually reinforcing ideas that provides what for them is an overwhelming rationale for rejecting experiments on any number of philosophical, practical, or ethical grounds. Any Ph.D. from a school of education who was exposed to the relevant literature on evaluation methods has encountered arguments against experiments that appeared cogent and comprehensive. For older education researchers, all calls to conduct formal experiments probably have a “déjà vu” quality, reminding them of a battle they thought they had won long ago–the battle against a “positivist” view of science that privileges the randomized experiment and its related research and development model whose origins lie in agriculture, health, public health, marketing, or even studies of the military. Education researchers consider this model irrelevant to the special organizational complexity of schools. They prefer an R&D model based on various forms of management consulting.

In management consulting, the crucial assumptions are that 1) each organization possesses a unique culture and set of goals; therefore, the same intervention is likely to elicit different results depending on a school’s history, organization, personnel, and politics; and 2) suggestions for change should creatively blend knowledge from many different sources–from general organizational theories, from deep insight into the district or schools under study, and from “craft” knowledge of what is likely to improve schools or districts with particular characteristics. Scientific knowledge about effectiveness is not particularly prized in the management-consulting model, especially if it is developed in settings different from those where the knowledge is to be applied.

As a central tool of science, random assignment is seen as the core of an inappropriate worldview that obscures each school’s uniqueness, that oversimplifies the complicated nature of cause and effect in a school setting, and that is naive about the ways in which social science is used in policy debates. Most education evaluators see themselves as the vanguard of a post-positivist, democratic, and craft-based model of knowledge growth that is superior to the elitist scientific model that, they believe, has failed to create useful and valid knowledge about improving schools. Of the reasons critics articulate for rejecting random assignment as an evaluation tool, some are not very credible, but others are and should inform the design of future studies that use random assignment. Let’s deal with some of the major objections in turn.

The world is ordered more complexly than a causal connection from A to B can possibly capture. For any given outcome, randomized experiments test the influence of only a few potential causes, often only one. At their most elegant, they can responsibly test only a modest number of interactions between different treatments or between any one treatment and individual differences at the school, classroom, or individual level. Thus, randomized experiments are best when the question of causation is simple and sharply focused.

Lee Cronbach, perhaps the most distinguished theorist of educational evaluation today, argues that in the real world of education too many factors influence the system to isolate the one or two that were the primary causes of an observed change. He cannot imagine an education reform that fully explains an outcome; at most it will be one of many causes of any change in that outcome. Nor can he imagine an intervention so general in its effects that the size of a cause-effect relationship remains constant across different populations of students and teachers, across different kinds of schools, across the entire range of relevant outcomes, and across all time periods. Experiments cannot faithfully represent a real world characterized by multivariate, nonlinear (and often reciprocal) causal relationships. Moreover, few education researchers have much difficulty detailing contingencies likely to limit the effectiveness of a proposed reform that were never part of a study’s design.

There is substance to the notion that randomized experiments speak to a simple, and possibly oversimplified, theory of causation. However, many education researchers speak and write as though they accept certain contingency-free causal connections–for example, that small schools are better than large ones; that time on task raises achievement; that summer school raises test scores; that school desegregation hardly affects achievement; and that assigning and grading homework improves achievement. They also seem to be willing to accept some propositions with highly circumscribed causal contingency–for instance, that reducing class size increases achievement (provided that it is a “sizable” change and that the reduction is to fewer than 20 students per class); that Catholic schools are superior to public ones in the inner city but not in suburban settings. Commitment to a full explanatory theory of causation has not precluded some education researchers from acting as if very specific interventions have direct and immediate effects.

Quantitative research has been tried and has failed. Education researchers were at the forefront of the flurry of social experimentation that took place at the end of the 1960s and through the 1970s. Quantitative studies of Head Start, Project Follow Through, and Title I concluded that, for all three programs, there were no replicable effects of any magnitude that persisted over time. Such results provoked hot disputes over the methods used, and many educational evaluators concluded that quantitative evaluation of all kinds had failed. Some evaluators turned to other methods of educational evaluation. Others turned to the study of school management and program implementation in the belief that poor management and incomplete implementation explained the disappointing results. In any event, dissatisfaction with quantitative evaluation methods grew.

However, none of the most heavily criticized quantitative studies involved random assignment. I know of only three randomized experiments on education reform available at the time. One was of the second year of “Sesame Street,” where cable capacity was randomly assigned to homes in order to promote differences in children’s opportunity to view the show. The second experiment was the widely known Perry Preschool Project in Ypsilanti, Michigan. The third involved only 12 youngsters who were randomly assigned to a desegregated school. Only the desegregation study involved primary or secondary schools. Thus it was not accurate to claim in the 1970s that randomized experiments had been tried and had failed. Only nonexperimental quantitative studies had been done, and few of these would pass muster today as even high-quality quasi-experiments.

Random assignment is not politically, administratively, or ethically feasible in education. The small number of randomized experiments in education may reflect not researchers’ distaste for them but a simple calculation of how difficult they are to mount in the complex organizational context of schools. School district officials do not like the focused inequities in school structures or resources that random assignment usually generates, fearing backlash from parents and school staff. They prefer it when individual schools can choose which reforms they will implement or when changes are made on a district-wide basis. Some school staff members also have administrative concerns about disrupting routines and ethical concerns about withholding potentially helpful treatments from students and teachers in need.

Surely it is not easy to implement randomized experiments of school reform. In many of the recent experiments, schools have dropped out of the experiment in different proportions, often because a new principal wanted to change what his predecessor had recently done, including eliminating the reform under study. Then there are the cases of possible treatment crossover, as happened in one of my own studies in Prince George’s County, Maryland. One principal in an experimental school was married to someone teaching in a control school, and they discussed their professional life at home; one control principal really liked the reform under study and tried to bring parts of it to his school; and the daughter of one of the program’s senior officials taught in a control school. In a similar vein, the Tennessee class-size experiment compared classrooms within the same schools. What did Tennessee teachers in the larger classes make of the situation whereby some colleagues in the same school taught smaller classes at the same grade level? Were they dispirited enough to work less? To avoid such possibilities, most public-health evaluations of prevention programs (such as those aimed at reducing drug use) use comparisons between schools instead of between classrooms within the same school. All randomized experiments in education have to struggle with issues like these.

What does it take to mount randomized experiments? Political will plays an important role. In the health sciences, random assignment is common because it is institutionally supported by funding agencies and publishing outlets and is culturally supported through graduate training programs and the broadly accepted practice of clinical trials. Public-health researchers have learned to place a high priority on clear causal inferences, a priority reinforced by their funders (mostly the National Institutes of Health, the Centers for Disease Control, and the Robert Wood Johnson Foundation). The health-related studies conducted in schools tap into this institutional and cultural structure. Similar forces operate with the rapidly growing number of studies of preschool education that use random assignment. Most are the product of congressional requirements to assign at random; the high political and scholarly visibility of the Perry Preschool and Abecedarian projects that used random assignment; and the involvement of researchers trained in psychology and microeconomics, fields where random assignment is valued.

Contrast this with educational evaluation. Reports from the Department of Education’s Office of Educational Research and Improvement (OERI) are supposed to detail what is known to work. However, neither the work of educational historian Maris Vinovskis nor my own reading of OERI reports suggests that any privilege is being accorded to random assignment. At a recent foundation meeting on teaching and learning, a representative of nine regional governors discussed the lists of best practices that are being widely disseminated. He did not care, and he believed that the governors do not care, about the technical quality of the designs generating these lists; the major concern is that educators can deliver a consensus on each practice. When asked how many of these best practices depended on randomized experiments, he guessed it would be close to zero. Several nationally known education researchers were present. They too replied that random assignments probably played no role in generating these best-practice lists. No one seemed to feel any distress at this.

Random assignment is premature because it assumes conditions that do not yet pertain in education. As the research emphasis shifted in the 1970s to understanding schools as complex social organizations with severe organizational problems, randomized experiments must have seemed premature. A more pressing need was to understand management and implementation, and to this end, more and more political scientists and sociologists of organizations were recruited into schools of education. They brought with them their own strongly held preference for qualitative methods and their memories of the wars between quantitative and qualitative methods in their own disciplines.

However, school research need not be predicated only on the idea of schools as complex organizations. Schools were once conceptualized as the physical structure containing many self-contained classrooms in which teachers tried to deliver effective curricula using instructional practices that demonstrably enhance students’ academic performance. This approach privileged curriculum design and instructional practice over the schoolwide factors that have come to dominate understandings of schools as complex organizations–factors like strong leadership, clear and supportive links to the world outside of school, a building-wide community focused on learning, and the pursuit of multiple forms of professional development.

Many important consequences have flowed from the intellectual shift in how schools are conceptualized. One is the lesser profile accorded to curriculum and instructional practice and to what happens once the teacher closes the classroom door; another is the view that random assignment is premature, given its dependence on expert school management and high-quality program implementation; and another is the view that quantitative techniques have only marginal usefulness for understanding schools, since a school’s governance, culture, and management are best understood through intensive case studies.

However, the aim of experiments is not to explain all sources of variation; it is to probe whether the school reform idea makes a difference at the margin, despite whatever variation exists among schools, teachers, students, or other factors. It is not an argument against random assignment to claim that some schools are chaotic, that implementation of a reform is usually highly variable, and that treatments are not completely faithful to their underlying theories. Random assignment does not need to be postponed while we learn more about school management and implementation.

Nonetheless, the more we know about these matters, the better we can randomize and the more management and implementation issues can be worthy objects of study within experiments. Advocates of random assignment will not be credible in educational circles if they assume that reforms will be implemented uniformly. Experimenters need to be forthright that school-level variation in implementation quality will often be very large. It is not altogether clear that schools are more complex than other settings where experiments are routinely done–say, hospitals–but most school researchers seem to believe this, and it seems a reasonable working assumption.

Thirty years after vouchers were proposed, we still have no clear answer about them. Thirty years after James Comer began the work that resulted in the School Development Program, we again have no clear answer. Almost 20 years after Henry Levin began Accelerated Schools, here too we have no answer. While premature experimentation is indeed a danger, these time lines are inexcusable. The federal Obey-Porter educational legislation cites Comer's program as a proven program worth replicating elsewhere and provides funds for this. But when the legislation passed, the only available evidence about the program consisted of testimony; a dozen or so empirical studies by the program's own staff that used primitive quasi-experimental designs; and a much-cited single study that confounded the court-ordered introduction of the program with a simultaneously ordered 40 percent reduction in class sizes. To be restricted to such evidence when making a decision about federal funding verges on the irresponsible.

Unlike medicine or public health, education has no tradition of multisite experiments with national reach. Single experiments of unclear reach, done only in Milwaukee, Washington, Chicago, and Tennessee, are what we typically find. Moreover, some kinds of school reform have no fixed protocol, and it is possible to imagine implementing vouchers, charter schools, or programs like Comer’s or Total Quality Management schools in many different ways. Indeed, the Comer programs in Prince George’s County, Chicago, and Detroit are different from one another in many major specifics. The nonstandardization of many treatments requires even larger samples than those typically used in medicine and public health. Getting cooperation from so many schools is not easy, given the history of local control in education and the absence of a tradition of random assignment. Still, larger individual experiments can be conducted than are being done today.

Random assignment is not needed because there are other less irritating methods for generating knowledge about cause and effect. Most researchers who evaluate education reforms believe there are superior alternatives to the randomized experiment. These methods are superior, they believe, because they are more acceptable to school personnel, because the knowledge they generate reduces enough uncertainty about causation to be useful, because the knowledge is relevant to a broader array of important issues than merely identifying a causal connection, and because schools are especially likely to use the results for self-improvement. No single alternative is universally recommended, and here I’ll discuss only two: intensive qualitative case studies and quasi-experiments.

Intensive case studies. Cronbach asserted that the appropriate methods for educational evaluation are those of the historian, journalist, and ethnographer, not the scientist. Most educational evaluators now seem to prefer case-study methods for learning about reforms. They believe that these methods are superior because schools are less squeamish about allowing ethnographers through the door than experimentalists. They also believe that qualitative studies are more flexible. They provide simultaneous feedback on the many different kinds of issues worth raising about a reform–issues about the quality of implementation, the meaning various actors ascribe to the reform, the primary and secondary effects of the reform, its unanticipated side effects, and how different subgroups of teachers and students are affected. Entailed here is a flexibility of purposes that the randomized experiment cannot match, given its limited central purpose of facilitating clear causal inference.

A further benefit relates to schools actually using the results. Ethnography requires attention to the unfolding of results at different stages in a program’s implementation, thus generating details that can be fed back to school personnel and that also help explain why a program is effective. A crucial assumption is that school staff are especially likely to use a study’s results because they have a better ongoing relationship with qualitative researchers than they would have with quantitative ones. Of course, the use in question is highly local, often specific to a single school, while the usual aspiration for experiments is to guide policy changes that will affect large numbers of districts and schools.

The downside of case studies is the question of whether this process reduces enough uncertainty about causation to be useful. With qualitative methods it is difficult to know just how the group under study would have changed had the reform not been in place. The rationale for preferring an experiment over an intensive case study has to be the value of a clear causal inference, of not being wrong with the claim that a reform is effective or not. Of course, one can have one’s cake and eat it too, for there are no compelling reasons why case study methods cannot be used within an experiment to extend its reach. While black-box experiments that generate no knowledge of process may be common, they are not particularly desirable. Nor are they the only kinds of experiments possible.

Quasi-experiments. Quasi-experiments are like randomized experiments in purpose and in most of their structural details. The defining difference is the absence of random assignment and hence of a demonstrably valid causal counterfactual. The essence of quasi-experimentation is the search, more through design than statistical adjustment, to create the best possible approximation of this missing counterfactual. However, quasi-experiments are second best to randomized experiments in the clarity of causal conclusions. In some quarters, quasi-experiment has come to connote any study that is not an experiment or any study that includes some type of nonequivalent control group or pretreatment observation. Indeed, many of the studies calling themselves quasi-experiments in educational evaluation are of types that theorists of quasi-experimentation reject as usually inadequate. To judge by the quality of the educational evaluation work I know best–on school desegregation, Comer's School Development Program, and bilingual education–the average quasi-experiment in these fields inspires little confidence in its conclusions about effectiveness. Recent advances in the design and analysis of quasi-experiments are not making their way into educational evaluation research.

Moving Forward

It will be difficult to persuade the current community of educational evaluators to begin doing randomized experiments solely by informing them of the advantages of this technique, by providing them with lists of successfully completed experiments, by telling them about new methods for implementing randomization, by exposing them to critiques of the alternative methods they prefer, and by having prestigious persons and institutions outside of education recommend that experiments be done. The research community concerned with evaluating education reforms is one in which all parties share at least some of the beliefs outlined above. They are convinced that anyone pursuing a scientific model of knowledge growth is an out-of-date positivist seeking to resuscitate debates that are rightly dead.

Some rapprochement might be possible. At a minimum, it would require advocates of experimentation to be explicit about the real limits of their preferred technique, to engage their critics in open dialogue about the critics’ objections to randomization, and to assert that experiments will be improved by paying greater attention to program theory, implementation specifics, quantitative and qualitative data collection, causal contingency, and the management needs of school personnel as well as of central decisionmakers.

Though it is desirable to enlist the current community of educational evaluation specialists in supporting randomized experiments, it is not necessary to do so. They are not part of the tiny flurry of controlled experiments now occurring in schools. Moreover, in several substantive areas Congress has shown its willingness to mandate carrying out controlled studies, especially in early-childhood education and job training. Therefore, end runs around the education research community are conceivable. This suggests that future experiments could be carried out by contract research firms, by university faculty members with a policy science background, or by education faculty who are now lying fallow. It would be a shame if this occurred and restricted our access to those researchers who know best about micro-level school processes, about school management, about how school reforms are actually implemented, and about how school, state, and federal officials tend to use education research. It would be counterproductive for outsiders to school-reform research to learn anew the craft knowledge insiders already enjoy. Such knowledge genuinely complements controlled experiments.

-Thomas D. Cook is a professor of sociology, psychology, education, and social policy at Northwestern University. This article is adapted from a chapter that will appear in Evidence Matters (Brookings, forthcoming).

The post Sciencephobia appeared first on Education Next.

Selective Reporting
https://www.educationnext.org/selective-reporting/ (Wed, 19 Jul 2006)

The post Selective Reporting appeared first on Education Next.

Illustration by Timothy Cook

Quality Counts 2001, A Better Balance: Standards, Tests, and the Tools to Succeed
by the editors of Education Week
Editorial Projects in Education, 2001.

In just five years, Education Week‘s high-profile annual compilation Quality Counts (QC) has emerged as perhaps the K-12 education field’s most prominent source, besides the publications of the federal government, of statistical information, particularly at the state level. The reporters and editors of Education Week, which modestly styles itself “American Education’s Newspaper of Record,” prepare QC, with generous subsidy from the Pew Charitable Trusts. Appearing each January, QC typically runs to a whopping 200 folio-size pages.

Each successive edition of QC includes some familiar measures, drops some old categories, and adds some newly developed ones, the latter tied mostly to the year’s policy theme. This year’s theme was attaining “A Better Balance” between academic standards and tests on the one hand, and what the editors term “the tools to succeed” on the other. (In 2000, the theme was teachers; in 1999, accountability.) Besides thousands of numbers, QC features dozens of interpretive essays by Education Week reporters and editors–thus raising the dual specters of selective statistics and biased journalism.

We have no reason to doubt the bona fides of the editors, researchers, and advisors who choose the numbers and pen the essays. They presumably yearn to be interesting, timely, relevant, and influential. They want to get noticed and buzzed about. They want to sell copies, please their advertisers, gratify their donors, and ensure that next year’s edition is eagerly awaited (and chockablock with ads). If their report had no message, no conclusions, and no edge, it would be less noticed.

However, Quality Counts‘s numbers and essays certainly do not get treated as neutral entries in a wholly academic sweepstakes. In today’s education policy wars, for better or worse, no choice of a fact can be deemed wholly neutral. Facts are also weapons. Which ones you select matter a great deal. If, for example, you seek to convey to readers a sense of teacher salaries, it matters whether you report beginning salaries or those at the top rung; whether the focus is on the mean or the median; whether fringe benefits as well as cash wages are included; and whether, for perspective, teacher salaries are set alongside the earnings of bus drivers or neurosurgeons. (Teacher salaries didn’t appear in this year’s QC, but in 2000 average teacher salaries, adjusted for the cost of living, were reported, though not counted as part of each state’s “grade.”)

Framing the Question

Subjectivity begins, of course, with the selection and framing of the theme itself. In choosing this year’s “Better Balance,” for example, the editors signaled that something is awry in the existing balance between the “hard” elements of standards-based reform (namely the academic standards, assessments, and interventions that make up a state’s accountability system) and such “soft” components as teacher training, instructional materials, and classroom environment.

Concern about this balance is as old as the standards movement itself. For at least a dozen years, a debate has raged in Washington and in state capitals over what the profession generally calls “opportunity to learn” standards, or OTL. This concern is often captured by the aphorism “It’s not fair to hold students accountable for learning things they’ve never been taught.” According to OTL doctrine, policymakers mustn’t attend solely to standards and results. They should also concern themselves with the education system’s ability to ensure that those being held accountable have ample opportunities and resources to attain the desired results.

Reasonable, yes? Sure–but only up to a point. It’s a fact that education policymakers cannot confine themselves to goals and results. They also need to be reasonably confident that the available resources and institutional arrangements have a fighting chance of producing the desired outcomes–that salaries are high enough to draw talented applicants, that school districts can provide students with up-to-date textbooks and technology.

However, it is easy to lose one’s focus on results while bogging down in resource arguments. That seems to be just fine with those who are nervous about accountability in the first place. OTL is the chief means by which yesterday’s fixation on school inputs and services reasserts itself in today’s era of results-based education. Opportunity to learn–or what QC terms “the tools to succeed”–can become a handy, even virtuous, excuse for not holding anyone to account for actually teaching or learning anything, or at least for justifying mediocrity. There is always some inadequacy or shortcoming to be found somewhere in the vastness of the K-12 delivery system, not to mention the varied problems the kids bring to school. Hence, as one starts down the path of “balance,” a reason can readily be found to rationalize unsatisfactory outcomes or to defer the day when results actually count.

The education profession has persuaded itself that all the inputs must be exactly right before any results should count for students, much less for those who teach them and lead their schools. Consequences for adults in the education system are politically touchy anyway, so OTL-type excuses for skirting them are particularly welcome. In statewide accountability systems, the notion of “cracking down” on the kids is widespread. But we look far and wide before finding any teachers or principals in serious jeopardy. It’s as if only the soldiers and not the officers are being held to account for winning or losing the battle. Whenever someone suggests accountability for the educators, the furor that follows combines OTL concerns (what if the teachers didn’t have enough professional development? What if there was high turnover among their pupils?) with moral indignation and invocation of seniority rules, tenure laws, and contractual rights.

Captured by the System

The editors of Education Week have succumbed to OTL-type reasoning, more vividly in 2001 than in the preceding four editions of QC. “States,” they now write, “must balance policies to reward and punish performance with the resources needed for students and schools to meet higher expectations.” The fundamental message of QC 2001 is that such “balance” is lacking and needs to be developed.

Thus Quality Counts 2001 succors those made uneasy by standards-based reform and high-stakes testing. In so doing, it partakes of the central assumptions of the education profession itself and risks sliding over the edge into being a professional trade journal for educators, like, say, Phi Delta Kappan or Educational Leadership, rather than a watchdog on behalf of the broader American public.

Consider the report’s “Executive Summary.” The reader need penetrate only to paragraph three to find the caution lights flashing about standards and tests. The first paragraph reports that states have been trying hard to raise academic standards and that the public supports this effort. The second paragraph says that slow progress is being made. Then comes the big But. Paragraph three warns that, without a “better balance,” all this progress is in jeopardy, together with the life prospects of “tens of thousands” of youngsters. Paragraph four then closes in for the policy kill:

Specifically, Quality Counts found, state tests are overshadowing the standards they were designed to measure and could be encouraging undesirable practices in schools. Some tests do not adequately reflect the standards or provide a rich enough picture of student learning. And many states may be rushing to hold students and schools accountable for results without providing the essential support.

The full report has three major sections. Part I consists of six essays by Education Week reporters and editors, based partly on surveys and polls. Part II is the annual state-by-state report card, full of charts and tables assigning grades and rankings to the fifty states on their level of student achievement, progress in adopting standards and accountability, efforts to improve teacher quality, their school climate, and the resources they devote to education. Finally, in part III, come 80 pages of individual state profiles.

The essays in part I are troubling on several counts, beginning with their main source of “data,” which is a survey of public school teachers–and no one else. Teachers’ views on education warrant careful attention, of course, but they’re certainly not the only affected parties and they’re among the most self-interested. To learn about foxes, one wouldn’t settle for polling only chickens. The results of this survey are predictable: protests about narrowing the curriculum, teaching to the test, inadequate professional development, unfairness toward disadvantaged and minority youngsters–and toward hard-working teachers themselves. To their credit, these essays also profile some states, districts, schools, and teachers that are responding constructively to standards and testing. But as one browses these pages to see whose opinions (besides teachers) are taken seriously, it becomes clear that most of the interviews and quotations come from critics and doubters within the education profession. Where are the comments from legislators, employers, or college admissions officers? The key essay on testing, for example, written by QC uber-editor Lynn Olson, quotes five teachers, ten academics, one parent, and two policymakers. The overwhelming majority of these comments are negative or skeptical toward high-stakes testing.

The trappings of objectivity and scholarly rigor are certainly present in part II, the report card: endless charts, elaborate footnotes, and long methodological explanations written in tiny type. Here reportorial selectivity yields to subtler decisions about which data to include and how to interpret them. Project research coordinator Ulrich Boser boasts in the report card’s introduction (none too subtly entitled “Pressure without Support”) that the tables are based on the “most comprehensive to date” survey of “state policies that aim to hold schools and students responsible for results and build their capacity to reach academic standards.” In their effort to be contemporary, the researchers omit all sorts of long-term trends and patterns that might be even more revealing than the “very latest” data. For example, no effort is made to show the increase in public-school spending in America during the past 30 (or 50) years, the uses to which that money has been put, the steady reduction in class size, the huge increase in numbers of school employees, and the various trends in achievement that correlate almost not at all with any of these resource trends.

The data under the heading student achievement are fine. Its six subcategories are all based on states’ National Assessment of Educational Progress (NAEP) scores in various subjects in grades 4 and 8. The key barometer throughout is what fraction of a state’s youngsters scored at or above “proficient” on the NAEP scale. In 4th grade reading in 1998, for example, scores ranged from a low of 17 percent in Hawaii to a high of 46 percent in Connecticut. Eleven states didn’t take part.

So far, so good. It’s exactly what one would want from a publication named Quality Counts: a nice, clear focus on academic results, namely student achievement, measured on the best yardstick available.

Turning to standards and accountability, we encounter three major subheadings, two of which (accounting for 70 percent of this grade) are also pretty solid. Under "standards," states get points depending on how many core subjects and levels of schooling they have "clear and specific" standards in, as judged by the American Federation of Teachers. Under "accountability," a state's score depends on how many of five different ways it holds schools (not just kids!) accountable for their performance. All are reasonable things to look for, though the most important of them–"sanctions" for failing schools–can be found in just 14 states (including jurisdictions with plans to institute sanctions at some later date).

The “assessment” subheading is more problematic. Here a state can get full marks only if it uses five different kinds of test items, including “extended response” questions and “portfolios.” A state that relied on multiple-choice questions could not possibly do well here. This partakes of the view fashionable among educators that multiple-choice testing is inherently inadequate because it cannot be used to appraise anything but the most rudimentary of skills and factual recall-type knowledge. Of course that’s not so. A well-conceived multiple-choice question can probe deeply into a student’s command of complex cognitive skills, prowess at problem solving, and sophisticated knowledge of subject matter. To be sure, multiple-choice items cannot expose a student’s ability to write lucid prose or engage in original research, but they can go a long way toward revealing the sorts of things we want youngsters to know and be able to do. Moreover, they do so with great efficiency and speed, and they are low cost, flexible (computer-adapted items), and objective (with machine-based scoring).

Larger problems loom in the report card’s three remaining areas. In the section on improving teacher quality, a state’s grade depends in part on its embrace of some of the education profession’s trendier “reforms.” Rather than probing the skills and knowledge that a teacher imparts to her students, for example, QC 2001 puts considerable weight on whether the state uses a “performance assessment” (including videotapes, portfolios, etc.) to appraise teachers. It also rewards states that give bonuses to teachers who have been certified by the National Board for Professional Teaching Standards. Unfortunately, we know from the work of economists Michael Podgursky and Dale Ballou and others that to date there is no hard evidence that being certified by the National Board translates into being an effective teacher.

QC also tacitly privileges the conventional education-school path into the classroom, though it no longer rewards states for having their new teachers emerge from “nationally accredited” institutions. QC 2001 does, however, assign points to states that require at least 12 weeks of practice teaching as part of a preparation program–not necessarily a bad thing, but limiting for states and districts that are experimenting with programs such as Teach for America and alternative pathways to certification. Indeed, QC grants no points to states with alternative-certification programs! (It did last year.)

The section on school climate has some good features. For example, a quarter of a state’s grade is based on having public-school choice and charter schools. Troubling, though, is the fact that 35 percent of the climate grade depends on having classes smaller than 25 pupils, which means that QC has taken sides in the great class-size debate, notwithstanding the rivers of doubt that Hoover Institution economist Eric Hanushek and others have poured on the notion that smaller classes are an efficient means of boosting achievement. The remaining 40 percent of a state’s climate grade addresses legitimate concerns such as classroom misbehavior, pupil tardiness, and the extent of parents’ involvement in school. Unfortunately, those indicators depend on self-reporting by 8th graders. While we shouldn’t fault QC‘s editors for the fact that these were the only such data they could find, we may wonder how reliable these numbers are.

The touchy topic of resources has two major subheads: adequacy and equity. Here is where one might most expect OTL doctrine to rule. Yet QC 2001 is even more primitive, relying instead on dollars alone. A state’s grade on resource “adequacy” turns not on some calculus of what resources are needed to furnish its youngsters with an adequate education, but simply on how rapidly the state’s education spending is rising and how much of the state’s total worth is being devoted to education. This section might be called “quantity counts,” and it yields some curious results.

West Virginia, of all places, gets the highest grade here–a straight A–as it reportedly spent $8,322 per pupil on public education in 1999 and has been boosting its outlays faster than any other state and digging deeper than all but one. Yet West Virginia is at or below the national average on all the QC achievement scores, gets a D+ for standards and accountability, a C for teacher quality and a D+ for school climate. By contrast, Connecticut, which also spent more than $8,000 per pupil and which is in first or second place among the states on four of six NAEP scores (and eighth in the remaining two) clocks in with just a B- in “resource adequacy.” Adequate for what, one wonders. Education Week‘s strange way of measuring adequacy lauds a state, like West Virginia, that has only recently begun raising its spending while punishing a state like Connecticut whose spending has been high for years. Likewise, West Virginia fares better than Connecticut because it is poorer; if both states spend exactly the same per pupil, West Virginia naturally winds up devoting more of its per-capita income to education.

The measure of resource “equity” is incomprehensible to anyone who has not specialized in school finance and earned a degree in statistics. Half of a state’s grade hinges on something called “state equalization effort”; the rest comprises still more obscure factors: the “wealth-neutrality score,” “relative inequality in spending per student among districts,” and something called the “McLoone Index.” Named for school finance analyst Eugene McLoone, it is “based on the assumption that if all the pupils in a state were lined up according to the amount their districts spend on them, perfect equity would be achieved if every district spent at least as much as was spent on the pupil smack in the middle of the distribution….The ratio between what is currently spent by districts in the bottom half and what needs to be spent to achieve equity is the McLoone Index.”
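The McLoone Index as quoted above reduces to simple arithmetic on a list of per-pupil spending figures. A minimal sketch, assuming that interpretation; the function name and dollar amounts are invented for illustration, not taken from the report:

```python
# Hypothetical illustration of the McLoone Index as described above;
# the spending figures are invented for the example.

def mcloone_index(per_pupil_spending):
    """Ratio of actual spending on pupils below the median to the
    spending needed to bring each of them up to the median."""
    s = sorted(per_pupil_spending)      # one entry per pupil
    median = s[len(s) // 2]             # the pupil "smack in the middle"
    bottom = [x for x in s if x < median]
    needed = median * len(bottom)       # cost of perfect bottom-half equity
    return sum(bottom) / needed if needed else 1.0

# Three pupils at $6,000, three at $8,000, three at $10,000: the bottom
# three receive $18,000 but would need $24,000, so the index is 0.75.
print(mcloone_index([6000] * 3 + [8000] * 3 + [10000] * 3))
```

On this reading, a state where every district spends the same amount scores a perfect 1.0, which is why Hawaii's unified statewide system "wins" regardless of its achievement results.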

The equity upshot: Hawaii naturally wins, because its unified statewide school system spends the same amount on all students. Never mind that the Aloha State’s achievement scores are among the lowest in the land.

Something closer to objectivity reappears in the final section of QC 2001, where profiles of individual states are more balanced and informative than this reviewer expected. Each profile includes a report card recapping the state’s NAEP results and its letter grades reported in earlier pages. Each gives a few basic facts about school enrollments and demographics. Then each has an essay of a page or two about what’s going on in that state. The authors are Education Week reporters who seem to have been given a fairly free hand to frame a state’s story according to what they found interesting there and with whom they talked.

Most of the essays are sober, matter-of-fact accounts of recent doings on the education reform front. The Indiana essay is a model of that kind, as are those of Louisiana, Maine, and Delaware. Some report interesting information that national observers may not have known, such as Nebraska’s abiding love of locally selected tests and its rejection of statewide assessments. Some report that heated controversies–such as the uproar surrounding Florida’s voucher program–are cooling down. Even some places where recent developments could lend themselves to a reporter’s bias against testing don’t always produce the expected “spin.” The Massachusetts account, for example, is acceptably balanced, as are those for Colorado and high-profile Texas. There are occasional slips, however. The Ohio story, for one, tends to favor the views of those who are grumping about the state’s proficiency testing program.

On balance, however, this sprawling publication displays an unmistakable, albeit uneven, set of assumptions that align with the values, preferences, and biases of the education profession itself. It thus becomes more of a report to the profession on matters that interest people within the field than a report to the public about how well that field is serving the nation.

Perhaps we shouldn’t be surprised. Most of Education Week‘s and QC‘s subscribers, after all, are educators, and most of the advertisers are firms that want to sell things to educators. This inevitably tempts reporters, editors, and publishers to view the world through the lenses of readers within the field rather than outsiders who most want to know whether the system is performing as well as it should. “Give educators what they want to see” may never have been stated in planning meetings and editorial sessions. Possibly all that happened is that the authors and their advisors and supervisors have been so close to the K-12 education system for so long that they’ve lost perspective on it and its players. They may even suffer from a touch of the Stockholm syndrome, identifying with their oppressors–their customers, in Education Week‘s case. Whatever the reason, the unhappy bottom line is that quality does not count quite as much as it should in Quality Counts.

Chester E. Finn Jr. is president of the Thomas B. Fordham Foundation, a senior fellow at the Manhattan Institute, and a visiting fellow at the Hoover Institution.

Education Next
https://www.educationnext.org/education-next/
Our name has changed, but our mission has not

Welcome to the fall 2001 issue. Our first two issues, in the spring and summer of 2001, ran under the banner Education Matters. With this issue, in order to avoid conflicts with other entities using that name, we become Education Next. Our commitment to publishing incisive commentary and research of the highest caliber remains as firm as ever.

Why Education Next? To our minds, Next nicely characterizes a journal that is forward looking, that thinks beyond the status quo. More subtly, Next suggests that education’s time has come, that this sector is ripe for major change.

The United States continues to become more open, dynamic, flexible, and responsive. Civil rights activists broke down classifications by color and nationality. Women’s organizations restructured the workplace. Transformed communications give the quick an edge over the large. Disciplined by global markets, firms now must emphasize productivity and innovation or they die. At the same time, environmental regulations have asked businesses to pay as much attention to externalities as to profits. The citizen and customer are increasingly in charge.

Education is next. In years past, school reform meant bigger schools, more “comprehensive” systems, and more tightly centralized control. Today’s reform trends are quite the opposite. Expanded school choice, whether by charter, magnet, homeschooling, voucher, or inter-district arrangements, is shifting control downward–to teachers, parents, and local administrators. Education is becoming a modern organization, accountable for its results and performance–both to its clients and to the larger society.

No one can be sure where all this will wind up. But we are certain of an intense and fascinating debate, as ideas sparkle, policies are tried, programs are evaluated. Our goal in Education Next is to keep you abreast of what is happening–and to discuss what should be happening. In this issue we explore four of the liveliest topics in this debate: accountability, choice, teacher unions, and education research.

The Houston story celebrates the marriage of a state accountability system and an urban school board and superintendent with a laser-like focus on reform. However, David Steiner in “High-Stakes Culture” and Lauren Resnick in “The Mismeasure of Learning” warn us not to draw strong conclusions prematurely. What appear to be striking gains on minimum-competency tests may leave us well short of where the nation wants and needs to be.

The impact of choice on civil society also gets serious attention. David Campbell’s research piece finds reason to praise the nation’s largest private system of schools, the Catholic schools, for their ability to graduate students with a healthy regard for our political traditions and a commitment to social action, but he also finds signs that private schools with other religious affiliations may not have such excellent records. In “Choice, Testing, and the Jigsaw” in the Forum section, Diane Ravitch and Nathan Glazer show how the very concept of a common culture has evaporated in the public schools even as Steiner worries about the testing culture that may be replacing it.

What is the next role that teacher unions will play? Opinions, not surprisingly, differ sharply. Terry Moe in “A Union by Any Other Name” reminds us that the very purpose of a union is to create and protect a monopoly. Charles Kerchner in “Deindustrialization” and Adam Urbanski in “Reform or Be Reformed” argue, however, that teachers are professionals and that today’s teacher unions have genuine potential for reform.

If education’s next steps are to be rooted in quality information, we need to strengthen our systems for creating and distributing knowledge. Thomas Cook’s examination of education research in “Sciencephobia” raises serious questions about its scientific quality. Senior editor Chester Finn in “Selective Reporting” wonders whether Education Week‘s annual effort to survey the quality of American schools has the objectivity one should expect.

All issues of the journal–by whichever name–are available on newsstands or by subscription. We also invite you to visit our new website, www.educationnext.org, where you will find unabridged versions of many articles.

— The Editors

Houston Takes Off
https://www.educationnext.org/houston-takes-off/
Will success survive the Paige promotion?

A unique blend of education-savvy business leaders, a superintendent with stamina, and a mature accountability system has made Houston into the darling of urban school reform. Will success survive Rod Paige’s exit?

The city’s test scores are rising. The school board is a model of cooperation. Rod Paige may have set a new world record for longest-lasting urban superintendent. Experiments with charters and outsourcing have yielded success stories like the KIPP Academy. Yet now the most obvious sign of Houston’s success—its superintendent’s becoming the secretary of education—has created a situation as challenging as any the district has faced. Few urban school reform efforts ever survive a change at the top. How can Houston become an exception to the rule?

Jane Hannaway and Shannon McKay of the Urban Institute plumb the data

Marci Kanstoroom asks what role the school district plays in standards-based reform

Paul T. Hill reminds Houston to stay focused as it transitions to new leadership

Military academies; do teachers matter?
https://www.educationnext.org/militaryacademies-doteachersmatter/
Readers respond

Congratulations to Oakland mayor Jerry Brown on his plan to open a military academy (see “A Few Good Schools,” Summer 2001). Chicago’s experience with military academies has been overwhelmingly positive. I hope Oakland’s is equally successful.

In 1999 Chicago opened its first public military high school, the Chicago Military Academy at Bronzeville, in a historic African-American neighborhood. Last year the city began converting Carver High School on the Far South Side into its second military school. Both schools are part of the Chicago Public Schools system, not charter schools like Mayor Brown’s academy.

We started these academies because of the success of our Junior Reserve Officers Training Corps (JROTC) program, the nation’s largest. JROTC provides students with the order and discipline that is too often lacking at home. It teaches them time management, responsibility, goal setting, and teamwork, and it builds leadership and self-confidence.

Not surprisingly, the high-school graduation rate for JROTC students in the Chicago Public Schools is 20 percent greater than the citywide average. It’s a little early to measure the success of our academies, but the first class at Bronzeville scored 40 percent better than the citywide average in reading and 30 percent better than the average in math. Perhaps a clearer sign of success is that 1,300 students applied for 110 openings in Bronzeville’s next entering class.

Though a military academy isn’t for everyone, for some it is just what they need in order to make something of their lives.

Mayor Richard M. Daley
Chicago, Illinois

Do teachers matter?

Let me respond to a few points in Michael Podgursky’s review of my report “How Teaching Matters: Bringing the Classroom Back into Discussions of Teacher Quality” (see “Flunking ETS,” Check the Facts, Summer 2001). First, the notion that the study is tilted in favor of measures of classroom practice and against measures of socioeconomic status (SES) is incorrect. Adding up the effects for all the classroom-practice variables is standard procedure. The fact that very few teachers engage in all of the effective practices does not invalidate the procedure; it merely suggests that there is room for teachers to improve. Nor does this procedure give classroom practices an advantage over socioeconomic status. The socioeconomic variable was also created by adding together six items–in this case, before the models were estimated. The socioeconomic measures are certainly not as rich as would be desirable, but neither are the measures of classroom practice.

Second, the use of data from the 8th grade National Assessment of Educational Progress (NAEP) in this study is appropriate. It is certainly true that data from just one year cannot be used to establish a causal relationship, as I note in the report. Cross-sectional data are properly used, as they were in this study, to confirm the findings of more robustly designed small-scale studies. Without such validation, it is difficult to know if the findings of small-scale studies will hold true for different students in different schools. The advantage of longitudinal over cross-sectional studies also should not be oversold. Although the outcome in a cross-sectional study is a student’s test score, and the outcome in a longitudinal study is improvement in a student’s test score, neither case demonstrates that the teacher caused the outcome. Demonstrating a causal hypothesis requires an experimental design.

Harold Wenglinsky
Educational Testing Service
Princeton, New Jersey

Michael Podgursky’s latest target in his ongoing war on the National Board for Professional Teaching Standards is a study authored by my colleagues and me at the University of North Carolina, Greensboro (see “Defrocking the National Board,” Check the Facts, Summer 2001). Here I’ll answer just a few of his concerns (an extended reply is available at www.educationnext.org).

Podgursky questions the ways in which we measured student achievement. The measures used in the study were 1) writing assignments in response to prompts designed by experienced teachers, and 2) assessments of the depth of student understanding of concepts targeted in instruction. Podgursky claims that without statistical controls for students’ background characteristics, these data are suspect.

Were standardized, multiple-choice tests used as the measure of student achievement, an adjustment for students’ socioeconomic status would certainly have been appropriate. Performance on standardized multiple-choice tests is affected by many factors not under teachers’ immediate control. In this study, however, we were interested in the students’ depth of understanding of concepts from a unit designed by each teacher for the students in her class. Whether socioeconomic differences affect student achievement under these circumstances depends primarily on the quality of the observation protocols, the quality of the training the observers and assessors received, and their skill in applying those protocols. On that score we make no apologies whatsoever.

Podgursky asserts that neither this study nor any other “has ever shown that National Board-certified teachers are better than other teachers at raising student achievement.” The basis for this assertion is Podgursky’s belief that the only way to measure the performance of teachers is to examine the performance of their students on standardized multiple-choice tests. We believe that setting worthwhile instructional goals is a crucial aspect of accomplished teaching, and the extent to which students have met the goals set by their teachers is a rational and utterly defensible measure of student achievement.

The National Board uses highly trained teachers in the relevant disciplines to evaluate the submissions of teachers seeking advanced certification. Podgursky questions the National Board’s use of teachers’ peers in the evaluation of their work, though this is common practice at the college level, and calls for evaluation by principals and parents instead. Principals can evaluate some aspects of a teacher’s performance (attendance, classroom management) but are in no position to evaluate other critical teaching attributes, such as in-depth subject-matter knowledge and the ability to present content in developmentally appropriate ways that engage students. Most parents know even less than principals about what goes on in actual classrooms or how teaching practices should be evaluated.

Podgursky asserts that we chose a particular sample of teachers in order to stack the deck in favor of positive findings. In fact, we chose a sample with teachers who were close to the certification score, as well as teachers who were clearly above and clearly below the certification score, in order to enrich our understanding of the score scale. An argument can be made for random sampling of teachers since it facilitates generalization, but it is highly unlikely that random sampling would have resulted in a materially different outcome, as the mean scores for the relevant populations of certified and noncertified teachers were quite similar to those of the study sample.

The National Board’s system for identifying accomplished teachers is the most comprehensive assessment of actual teaching practice yet devised. Teachers must provide evidence of professional accomplishments and the involvement of community resources and students’ families in the educational process. They must demonstrate their knowledge of their subject matter and their ability to select developmentally appropriate curricular materials to teach that content. These are rigorous, research-tested evaluation criteria that, as our study shows, are clearly identifying teachers of high professional caliber.

Lloyd Bond
University of North Carolina, Greensboro
Greensboro, North Carolina

Michael Podgursky replies: I agree that the cross-section correlations in Harold Wenglinsky’s study do not support causal interpretations. However, his report is replete with prescriptive statements and policy recommendations. These recommendations are based not on experimental research or a larger body of studies, but on the findings in this study.

I explained carefully why the adding-up exercise that forms the basis for his conclusion that “teachers matter most” is flawed. Wenglinsky implicitly concedes this point and now states that the result of his exercise “merely suggests that there is room for teachers to improve.” This too is a causal interpretation, but it is at least more cautious and defensible than the original.

Two basic design flaws exist in the study conducted by Lloyd Bond et al. First, as Bond concedes, the study over-sampled high-scoring certified and low-scoring noncertified teachers, thus exaggerating the measured differences in quality between the two groups. Bond states that it is “highly unlikely” that he would have obtained different results had he sampled randomly. This is entirely speculative–the data presented on mean test scores are irrelevant on this point. The authors’ claim that “board certified” teachers “significantly” outscored their noncertified counterparts on 11 of 13 dimensions of good teaching is based on a flawed statistical test.

Second, the researchers failed to control for previous test scores or socioeconomic differences between the students of the certified and noncertified teachers. As I noted in my review, the children taught by the noncertified teachers were disproportionately low income. Bond does not dispute this, but makes the extraordinary claim (without evidence) that the methods used by his trained observers make it unnecessary to control for students’ backgrounds or previous achievement. A rigorous study design would have compared certified with noncertified teachers who work with similar student populations, or it would have collected extensive control data on students’ backgrounds. The researchers did neither.

The “highly trained teachers” the National Board uses as scorers are moonlighting teachers who receive two to four days of training. They are paid $125 a day, and most have not passed the certification assessments they are grading. In fact, peer review of teaching in higher education is entirely localized and bears no resemblance to the centralized and very costly process of the National Board. There is no national “certification” of experienced college teachers.

Although several hundred million dollars have been invested in board certification and bonuses to date, no one has yet undertaken a rigorous study of whether the students of board-certified teachers learn more and whether the board-certification process is a cost-efficient way to identify superior teachers.

Taking Measure
https://www.educationnext.org/taking-measure/
A recent Council of the Great City Schools report hailed Houston for ‘beating the odds’ by generating sizable gains in student achievement.

Illustration by Joseph Daniel Fiedler

A recent Council of the Great City Schools report hailed Houston for “beating the odds” by generating sizable gains in student achievement. Much of this improvement is no doubt due to its accountability system. Even though accountability is increasingly recognized as the linchpin of education reform, only a few states have made real progress in establishing accountability systems. It appears that they have much to learn from the experiences of Texas and the Houston Independent School District (HISD).

The Texas educational accountability system has been in place since 1993. It is based mainly on student performance on the Texas Assessment of Academic Skills (TAAS), which is administered to students in grades 3 through 8 and in grade 10. The test is aligned with the state’s standards and measures performance in reading and math; writing in the 4th, 8th, and 10th grades; and science and social studies in the 8th grade only. In the past four years, the Houston school system has also been giving the Stanford 9 test to students in order to benchmark the achievement of its students against students nationwide.

Districts report the percentage of students who “pass” the TAAS at each school and for the district as a whole. The state then classifies schools and districts into four performance categories on the basis of test scores, dropout rates, and attendance rates. The categories are: “exemplary,” “recognized,” “acceptable,” and “low performing.” Over time the state has moved the bar for schools and districts steadily higher. For example, last year at least 50 percent of a district’s students needed to pass the TAAS in order for a school or district to be rated “acceptable.” This was up from 30 percent a few years earlier. It is expected that the bar will continue to rise by 5 percentage points a year until the passing standard reaches 70 percent.

The Texas system involves both financial rewards and relief from regulations for high-performing schools and assistance and sanctions for low-performing schools. Districts and schools receiving the lowest accountability rating are visited by a peer-review team and must develop an improvement plan. If the low rating continues for two years or longer, the state can intervene more directly, for example by taking over the school. Parents may also transfer their children from a low-performing school to a higher-performing public school. Houston is unique in that it provides much more targeted assistance to low-performing schools than other districts.

The test scores of all eligible students who are registered in a district in the October listing of the Public Education Information and Management System (PEIMS) are included in the state accountability system’s performance measures. While this means that some students whose test scores count in a school’s performance measure may have been in that school for only a relatively short time, it avoids the problems associated with excluding high-mobility students–typically the lowest-performing students–from the district’s overall accountability measure. HISD goes one step further and includes all eligible students in a school at the time of testing, regardless of where they were in October. The intent, as one district official noted, “is to make the school feel responsible for every student.”

Before 1999, special-education students were excluded from state accountability measures, but since then special-education students who are receiving instruction on grade level are also included. Houston went even further by including all special-education students, even those not on grade level, in its testing program, except those classified as multiply impaired, mentally retarded, emotionally disturbed, autistic, hearing impaired, or having a traumatic brain injury.

One of the special features of the Texas plan is that performance statistics must be reported for different student subgroups: African-Americans, Hispanics, whites, and economically disadvantaged students. These subgroup ratings weigh heavily in the overall performance rating for a school or district because the rating given by the state is based on the lowest performance on any single criterion (TAAS, dropout rate, attendance rate) for any subpopulation. Thus, even if the majority of the students in a school were performing well, if its economically disadvantaged students were performing poorly in math, the school would receive a “low performing” rating overall.
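A minimal sketch of this lowest-common-denominator rule: a school’s rating is whatever its weakest subgroup earns on its weakest criterion. Only the 50 percent “acceptable” bar comes from the account above; the 90 and 80 percent cut points, the function name, and the sample figures are illustrative assumptions, and the real system also weighs dropout and attendance rates.

```python
# Illustrative sketch: a school's state rating is driven by the weakest
# result of any student subgroup. Only the 50 percent "acceptable" bar is
# taken from the article; the other cut points are assumptions.

def rate_school(subgroup_pass_rates):
    """Rate a school from its subgroups' TAAS pass rates (0-100)."""
    weakest = min(subgroup_pass_rates.values())
    if weakest >= 90:        # assumed cut point
        return "exemplary"
    elif weakest >= 80:      # assumed cut point
        return "recognized"
    elif weakest >= 50:      # the bar cited in the article
        return "acceptable"
    return "low performing"

school = {
    "African-American": 72,
    "Hispanic": 68,
    "white": 94,
    "economically disadvantaged": 45,  # this subgroup drives the rating
}
print(rate_school(school))  # prints "low performing" despite strong scores elsewhere
```

The single `min()` call is the whole point of the design: no subgroup’s performance can be averaged away.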

Houston rates schools not only on their level of performance but also on their progress. Progress is judged against the amount of improvement a school is expected to make, with lower-performing schools expected to make more progress. For instance, schools with a passing rate between 60 percent and 75 percent are expected to improve by 4 percentage points; schools whose passing rate falls between 45 percent and 60 percent are expected to improve by 6 percentage points. This enables the district to recognize schools that are making good progress even if they have not moved into a higher performance level.
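Houston’s sliding progress targets can be read as a simple lookup. Only the two bands quoted above (60 to 75 percent, 4 points; 45 to 60 percent, 6 points) come from the text; how a boundary value such as exactly 60 percent is classified, and the behavior outside those bands, are assumptions made for illustration.

```python
# Sketch of Houston's progress expectations. Only the 60-75 and 45-60 bands
# (4 and 6 expected percentage points, respectively) come from the article;
# assigning a boundary value such as 60 to the higher band is an assumption.

def expected_gain(pass_rate):
    """Return the percentage-point gain expected of a school, or None if
    the article does not specify an expectation for this passing rate."""
    if 60 <= pass_rate <= 75:
        return 4
    if 45 <= pass_rate < 60:
        return 6
    return None

def met_progress_target(last_year_rate, this_year_rate):
    """True if the school improved at least as much as expected."""
    target = expected_gain(last_year_rate)
    if target is None:
        return None  # no expectation specified for this band
    return (this_year_rate - last_year_rate) >= target

print(met_progress_target(50, 57))  # a 7-point gain beats the 6-point target
```

This is how a school can be recognized for good progress without yet reaching a higher performance level: the target depends on where it started, not on the level it ends at.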

Student Performance

We examined data from the state and the district to answer several questions about Houston’s performance. First, has student performance in Houston improved over time? Here we looked at data both for the district as a whole and broken out by students’ race/ethnicity and socioeconomic status. Second, how does Houston perform relative to schools statewide and in other urban districts? Third, to what extent is Houston closing the gap between minority students and white students?

Measuring Houston against the state of Texas is holding the city to a high standard. Research from the RAND Corporation, among others, shows that, after adjusting for student background, Texas and North Carolina made greater leaps on the National Assessment of Educational Progress from 1990 to 1997 than any other states. This research also shows that Texas appears to have been particularly successful in closing the achievement gap between minority students and white students. While other research from RAND suggests that the TAAS results may overstate the amount of “real” learning gains in Texas, the issue here is not whether the TAAS captures all the areas of learning assessed, for example, by the NAEP, but the extent to which performance is improving on those areas measured by the TAAS, which is keyed to the state’s curriculum and serves as the basis of the state’s accountability system.

We did, however, examine how closely the TAAS results correlate with results from the Stanford 9, a nationally normed test. An analysis of school-level data by grade for reading and math in 1999 and 2000 showed large and highly significant correlations, suggesting that schools that perform well on the TAAS are also likely to perform well on nationally normed tests. We confine our discussion here to the TAAS results, using the passing rates set by the state, and make comparisons between Houston and the state and between Houston and other urban districts using this one measure.
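The correlation check the authors describe amounts to computing a Pearson coefficient over school-level results. A sketch with invented data follows; the figures below are illustrative and bear no relation to the actual Houston numbers.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical school-level data: TAAS pass rates and Stanford 9 national
# percentile ranks for the same six schools (invented for illustration).
taas = [52, 61, 70, 78, 85, 91]
stanford9 = [38, 44, 50, 57, 63, 70]
r = pearson(taas, stanford9)
print(round(r, 3))  # close to 1.0 for these made-up, strongly aligned data
```

A coefficient near 1 across schools is what licenses the authors’ claim that schools doing well on the TAAS also tend to do well on a nationally normed test.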

Comparing Houston with the state. As measured by the TAAS, Houston has made great strides in student achievement. From 1994 to 2000, the percentage of Houston students passing the TAAS in math increased from 49 percent to slightly more than 80 percent. Houston posted large gains in reading as well, with pass rates rising from 65 percent to 81 percent (see Figure 1). These gains exceeded the gains statewide. In math, the pass rate statewide rose from 61 percent to 87 percent, an increase of 26 percentage points (compared with a 31 percentage point increase in Houston). The statewide reading pass rate rose from 77 percent to 87 percent, an increase of 10 percentage points (compared with a 16 percentage point increase in Houston).

Figure 1

Neither the district nor the state has experienced steady gains in achievement over time. Test scores at the district and state levels increased sharply in the beginning years and started to level off in 1998. Indeed, in 1999 Houston’s reading and math scores actually dropped, and the state showed a slight dip in reading. Houston’s drop and the leveling of state performance are likely due, at least in part, to changes in the pool of students required to take the test. As noted earlier, special-education students who were receiving instruction at grade level were included in the state’s testing system for the first time in 1999, and Houston imposed an even more inclusive policy. Similarly, Houston was less likely to exempt students with limited English language skills than was the state, a practice that may also have contributed to the 1999 drop in the district’s performance. In short, the 1999 data for Houston probably include a larger fraction of lower-performing students than those included in the state’s measure. It is important to note that Houston rebounded in 2000, posting a gain of nearly 6 percentage points in reading and math.

Comparing Houston with other urban districts. Arguably, Houston experienced more improvement than the state because from the beginning its scores were so low. In 1994 Houston’s pass rates in math and reading were 49 and 65 percent, respectively; the corresponding rates for Texas that year were 61 percent in math and 77 percent in reading. The presumption is that the higher the initial scores, the more difficult it is to elicit substantial gains; in other words, the state’s scores were approaching a “ceiling.” A behavioral argument could also be made: that the state’s reform policies, its public shaming and sanctions for low-performing schools, would most strongly influence the behavior of urban districts, which tend to have a history of low performance and mismanagement. It is thus important to examine how Houston fares relative to Texas’s other major urban districts.

In 1994 Houston’s performance on the TAAS in math placed it just below the middle of the pack of six large urban districts in Texas (see Figure 2). Three of the five other urban districts outperformed Houston in 1994, though some only slightly; in 2000, one district, El Paso, equaled Houston’s performance. A similar pattern held in reading. Houston’s improvement looks even better when judged by the size of its gains in passing rates. Houston, by far the largest of the six urban districts, made greater progress in reading and math than all but one of the major urban districts, San Antonio, which showed remarkable gains.

Figure 2

Closing the Achievement Gaps

The large gap in student performance between white students and minority students is one of the most serious problems facing the United States. The nation has watched Texas not only for the improving performance of its students across the state, but also for the shrinking achievement gap between white and minority students as measured by the TAAS. Here we examine the progress that Houston is making on closing the achievement gaps. Again, we examine Houston’s progress relative to the state and relative to other urban districts.

Comparing Houston with the state. In both reading and math, while the absolute performance of white students in Houston is higher than that of Hispanics and African-Americans, it is clear that the upward trend in performance for minority students in Houston is steeper than that of whites. The math scores of African-Americans in Houston, for instance, rose by 34 percentage points; for Hispanics, by 36 percentage points; and for whites, by only 14 percentage points (see Figure 3). Houston’s results roughly mirror those of the state.

Figure 3

Note that the pass rates for white students are very high, well above 90 percent, suggesting that the skills tested by the TAAS are not too challenging and that many students are close to “topping out” on the test. However, this does not diminish the significance of the substantial increases in performance on the skills tested, especially for minority students. The gaps in the passing rates between whites and African-Americans and between whites and Hispanics in both math and reading declined from 1994 to 2000. Houston reduced the white/Hispanic gap in math from 36 percentage points in 1994 to 14 percentage points in 2000, a narrowing of 22 percentage points. The state as a whole reduced the same gap by 15 percentage points. Still, the gap statewide between whites and minorities is somewhat smaller than it is in Houston.

Comparing Houston with other urban districts. In both reading and math, all six urban districts included in our analysis reduced the gap in TAAS passing rates between whites and African-Americans and between whites and Hispanics. Houston’s performance is especially noteworthy. Houston decreased the white/African-American achievement gap in math by 21 percentage points, more than all the other districts. It also reduced the white/African-American gap in reading by more than all the other districts. Houston’s reductions in the white/Hispanic gap were equally impressive: the gap in math dropped by 22 percentage points, more than any of the other urban districts.

Whatever one thinks of the TAAS, Houston is clearly doing something right. Its progress on the TAAS, for the most part, has outstripped the gains of the state and most other urban districts. The district at least has begun to solve one of society’s most intractable problems: the achievement gap between white and minority students, at least as measured by the TAAS. Without the information provided by the Texas accountability system, few people would have noticed what was happening. What a loss that would have been!

-Jane Hannaway is the director and principal researcher of the Education Policy Center at the Urban Institute. Shannon McKay is a research associate in the Education Policy Center.

The post Taking Measure appeared first on Education Next.

Balancing Act https://www.educationnext.org/balancing-act/ Wed, 19 Jul 2006 00:00:00 +0000 http://www.educationnext.org/balancing-act/ Redefining the district's role under standards-based reform


Within the evolving standards and accountability movement, states (rather than the nation or school districts) have borne the responsibility to develop standards, tests linked to those standards, and a system of rewards and punishments for schools depending on their performance. The target of accountability has generally been individual schools and students. In most states, this neat arrangement practically ignores school districts.

There are two grand conceptions of the role local school districts should play once a statewide accountability system is in place. One says the district should essentially disappear. In this view, the point of standards-based reform is to free schools from regulations; each school would then operate much like a charter school, with its principal acting as a CEO. Proponents of decentralization ask, How can a school be held responsible for its results if the district is forever meddling in its operations?

The alternative role is for districts to do everything in their power to align curricula and teaching practices in all their schools with the state’s academic standards. In this view, school districts should support standards-based reform by identifying effective methods of instruction and ensuring that all schools are delivering the material covered by the standards using the techniques approved by the district. Proponents of such top-down management argue that many schools would simply fail if they were left to sink or swim on their own, with no assistance from the district.

Against the backdrop of Texas’s fully developed system of standards and assessments, the Houston Independent School District’s approach has been to try to find a third way, not telling schools exactly what to do, but not stepping aside either. Schools are given considerable autonomy, but the district is proactive in its efforts to give them the direction, training, and resources they need to boost student achievement.

While standards-based reform relies on the ability of incentives to motivate students and teachers, teachers must be given the opportunity to develop the knowledge and skills they need to provide more ambitious instruction, or they must be replaced by teachers who do. In other words, the reform strategy is incomplete if it doesn’t include ways of increasing the capabilities of schools. Houston’s efforts to improve instruction were focused on building teacher capacity in three areas: reading, math, and curriculum alignment.

A Balanced Approach

In 1995 one-fourth of 5th graders and one-third of 6th graders in Houston failed the Texas Assessment of Academic Skills (TAAS) in reading. This was a matter of great concern to then-superintendent Rod Paige and the local business community.

The district was in the process of decentralizing many of its operations and decisionmaking, but Paige determined that reading instruction would have to be an exception. The idea that the district needed to adopt a single, uniform approach to reading instruction grew out of one of the superintendent’s monthly meetings with the district’s teachers of the year, one elected from each campus. They told the superintendent that the high mobility rate of children within the district would prevent reading scores from improving unless there was a consistent approach to reading across the district, so that students who changed schools would not receive bits and pieces of different methods of instruction.

Paige quickly convened a task force and charged it with developing a research-based approach to reading for the district and a plan for starting the program in all schools. The approach was to deal with the “reading wars” by bringing together supporters of various methods of instruction, having them consult with outside experts, and asking them to agree on a recommendation for the entire district.

The task force produced an 85-page report called “A Balanced Approach to Reading.” Despite the word “balanced,” the new philosophy would mean a significant shift away from “whole language” instruction to phonics. “Balance does not mean mindless eclecticism,” the report warned; “balance involves a program that combines skills involving phonological awareness and decoding with language and literature-rich activities.”

Training in the new reading program began in the fall of 1996. By the summer of 2000 it had reached almost 12,000 teachers. The district trains all new teachers, elementary and secondary, in the balanced approach.

Houston has mobilized a large number of specialists to bring effective methods of reading instruction to teachers. The 12 administrative subdistricts within the Houston school district hired teacher trainers to provide continuing support, working with teachers individually (in their classrooms) and in small groups. These trainers visit schools to ensure that the principles of the balanced approach are being put into practice. Trainers work most intensively with new teachers, teachers who request extra help, and those whose principals have requested extra help on their behalf. The district has also aimed to create a cadre of people with reading expertise in the schools themselves; most elementary schools have lead reading teachers. While the district has insisted that all teachers and schools adopt the “balanced” approach to reading, schools are given considerable flexibility in matters of instruction—as long as they put instruction in phonics front and center. The district describes its approach as a philosophy, not a package that is delivered to the door of schools.

Houston attempts to address reading problems proactively instead of remedially. The district requires all elementary schools to spend 90 minutes a day on reading instruction, and schools assess students early and often to identify problems and potential problems.

While all teachers are supposed to be observed as part of the state’s teacher assessment system, the reading initiative and programs like Success for All have created an environment that makes observation and monitoring a normal part of daily life for teachers. This contrasts with a typical school district, where teachers are observed once or twice a year, usually with plenty of advance warning, which is often required by the collective-bargaining agreement.

When using a program like Success for All, teachers become accustomed to having facilitators come in and out of their classrooms to observe and to deliver model lessons. Houston’s reading initiative has brought this to schools that aren’t using a comprehensive school-reform model. This has turned out to be a very effective way of producing school-level change. By making observation and advice a part of the teacher’s normal routine, particularly through the use of reading teacher trainers, the district has made it easier to target the use of ineffective teaching practices and to help struggling teachers improve.

Algebra for the Masses

The math initiative was launched in the fall of 1995. The previous spring, only 49 percent of all Houston students had passed the TAAS in math; only 36 percent of 8th graders had passed the test. Superintendent Paige asked his staff to examine the transcripts of all middle-school math teachers in the district to determine how many math courses they had taken in college. He was troubled to learn that many teachers lacked adequate preparation in math—40 percent had taken fewer than 12 credit hours of math—and he concluded that the district would have to teach math to some of its math teachers. The district quickly introduced a series of math courses and summits for teachers and principals.

The reading initiative has created an environment that makes observation and monitoring a normal part of daily life for teachers.

At the time, Texas required all students to take algebra in order to graduate, though it was not necessary to pass the state’s end-of-course exam. That was fortunate for Houston’s students, because only 15 percent of them passed the state end-of-course exam in the spring of 1997 (in fact only 18 percent of students passed statewide).

During the 1997–98 school year, the district launched an algebra initiative as an offshoot of the math initiative. As a first step, a districtwide syllabus was developed so that the state’s expectations in algebra would be clear to students and teachers. This marked a major change for most teachers, who had been accustomed to basing their instruction on the textbook. However, the state exam was based on assumptions different from those contained in the textbook the district had been using for the past six years.

School-level planning teams were organized for all Algebra I teachers (from high schools and middle schools), and a districtwide meeting was held (at first weekly, later monthly) for a representative from each team. The goals were to increase teachers’ knowledge of the skills covered by the state’s academic standards and tested on the algebra end-of-course exam and to provide support for teachers in the use of new teaching methods.

Beginning in the second year of the algebra initiative, schools were asked to send a sample of student work to the district office each month to demonstrate their use of improved instruction. After two years, the schools were no longer required to hold weekly meetings for their planning teams; instead, unsuccessful algebra teachers (those with fewer than 15 percent of their students passing the end-of-course exam the previous year) would be pulled out of their classes for eight days of in-depth training.

After test results showed that students entering middle school were strong in computational skills but weak in problem-solving and the application of mathematical concepts, prerequisites for success in algebra, the algebra initiative was extended to middle-school teachers who weren’t teaching algebra and to 5th grade teachers.

The algebra initiative was an attempt to integrate planning and collaboration into the routine of the school. At both the middle- and high-school levels, the initiative seems to have worked well for many teachers, but a significant minority of teachers has chosen not to participate. Attendance at the meetings has been incomplete: “Some just won’t do it. Some feel it infringes on their academic freedom, some just won’t be bothered,” an administrator told me.

Since the algebra initiative was launched, Houston has seen the passage rate for high-school students on the state’s end-of-course exam increase from 15 percent to 35 percent; for middle-school students the passage rate rose from 68 to 87 percent. The total number of students passing the test increased from 1,869 in 1997 to 3,583 in 2000 (see Figure 1).

Common Standards

Within the accountability movement, teachers often complain that the states’ new academic standards are not clear or detailed enough to guide their instructional planning. Houston answered their concerns by developing more specific guidance about what students should know and be able to do at each grade level. In January 1995 Houston launched an audit to determine whether the district’s written curriculum was aligned with the state’s guidelines and with what was tested by the state. The audit also investigated whether what was actually being taught in the district’s classrooms was aligned with the standards or tests. The district found that the majority of teachers still used the textbook as the primary resource for instructional planning, rather than the district curriculum or state academic standards.

The district then examined its textbooks to determine the degree to which these were aligned with the curriculum. It found that many of the textbooks were poorly aligned with the curriculum. Not surprisingly, examination of the district’s test scores objective-by-objective showed that students did well in areas where the textbooks were strong and performed poorly in other areas.

In response, the district launched an effort to provide more detailed information to teachers about the material they should cover. When Texas adopted a new set of curriculum objectives (the Texas Essential Knowledge and Skills) in 1997, Houston launched Project CLEAR (Clarifying Learning to Enhance Achievement Results) as a way of establishing uniform standards for student learning across the district. The product is a binder containing an annotated scope and sequence for each grade level or course. For each objective teachers are given detailed information about what content should be taught to meet the objective, the level of knowledge that has been developed in earlier grades, assessment ideas that can be used to determine if the student has mastered the objective, and ways the skills covered by the objective can be linked to other objectives.

In interviews, every teacher I spoke with said that he or she found the annotated curriculum useful. New teachers said they use it frequently and are likely to try the strategies and activities suggested; older teachers use it to make sure they are aware of changes in the curriculum.

When Houston first began talking of curriculum alignment, some saw this as an attempt to cheat—that is, to teach to the test, a manager in the curriculum department told me. Today this is understood very differently. “We want every child in every school to have the opportunity to have a quality curriculum so they can be successful,” another district administrator said. The introduction of a common curriculum was explicitly aimed at equity. “The superintendent feels strongly about all kids achieving academic success; if each teacher decides what the kids can handle, this complicates things,” the administrator said.

“No educated child will fail TAAS,” said one Houston administrator. “These schools do not need to prepare children for the test. Well-educated children will do well without preparation.”

Responding to critics who say that the state standards will cause a dumbing-down of the curriculum, one administrator said, “TAAS is the floor, but we’re trying to make sure that the schools aren’t spending all of their time on the floor, that they are enriching their curriculum. Some of our best schools said that they don’t want to give up on what they are doing, but no educated child will fail TAAS. These schools do not need to prepare children for the test. Well-educated children will do well without preparation.”

Teachers seemed to have somewhat mixed feelings about the standards and accountability policies that drove the district’s effort to align the curriculum. While there is no shortage of teachers who say that there is too much emphasis on TAAS, the regime of standards and tests seems to have grown on many teachers. “You have to set standards. You have to give everyone something to strive for,” one teacher said.

As with the reading initiative, the curriculum-alignment initiative has the power to improve instruction simply by making what happens in the classroom the subject of discussion and critique. In an ordinary classroom, it may be impossible for anyone to monitor the progress an individual teacher is making through the curriculum because the curriculum itself is so flexible. The annotated curriculum provided by the district makes openness and monitoring possible, partly by making crystal clear what a teacher should be covering and partly by making explicit the links between what one teacher does and what other teachers are doing.

Transparent Classrooms

The hardest part of capacity building is not identifying effective instructional methods or putting this information in the hands of teachers, but getting teachers to change what they do every single day. Overcoming the barriers to changing what teachers do will require a transformation in the culture of schools. Paige notes that in the past curriculum and assessment were left to the discretion of each teacher in the district. Teachers came to see themselves as private practitioners, and many of them are now reluctant to relinquish what they see as their professional prerogative to decide what to teach and how to teach it, Paige observes. What Houston’s reading, math, and curriculum-alignment initiatives share is that they open the classroom door and subject the teacher’s daily practice to scrutiny and analysis. In Houston teachers are no longer private practitioners who choose what and how to teach.

While the three initiatives all make that possible, each initiative embodies a very different approach to the exercise of power by a school district. In the case of reading instruction, Houston has taken a decision out of teachers’ hands and provided an answer of its own. The district allows teachers some flexibility in how they teach, but teachers are not free to attempt to teach reading without phonics. In math the district has attempted to improve instruction not by setting a course for schools, but by creating live opportunities for teachers to improve their knowledge and skills. In curriculum alignment the district has simply made resources available to teachers and schools.

How has Houston managed to make its distinctive strategy work? One key was Rod Paige. In identifying areas of practice that may require central control, Paige’s method was to seek the best advice he could find and then follow it. He was also an inspiring leader, as any visitor to the district will attest. Also noteworthy is the district’s willingness to be guided by the evidence on matters both large and small. The district collects mountains of data and analyzes everything that moves.

Taking the middle path between making all decisions centrally and leaving everything to individual schools is not an approach with which purists are comfortable. In particular, those who envision principals as CEOs bristle at the idea of so much outside interference in classroom practice. In reading, math, and curriculum alignment, Superintendent Paige and his staff identified key issues that needed to be addressed and responded boldly. The district was willing to flex its muscles when necessary, but has thus far not yielded to the temptation to make many other decisions for schools. This approach is certainly not for the faint of heart; if a district makes bad decisions about when to intervene, it could be dangerous for good schools that are doing well on their own. But given the reality of the schools we have now and the staff we have in them, Houston’s balanced approach to the role of the district seems a good bet.

-Marci Kanstoroom is the director of research at the Thomas B. Fordham Foundation.

The post Balancing Act appeared first on Education Next.

The Mismeasure of Learning https://www.educationnext.org/the-mismeasure-of-learning/ Wed, 19 Jul 2006 00:00:00 +0000 http://www.educationnext.org/the-mismeasure-of-learning/ Poorly designed high-stakes tests may undermine the standards movement


Who would have imagined in 1990 that only ten years later a reviewer would be asked to consider a crowd of books attacking the “established” practice of standards-based education? At the time, the idea of setting public expectations for what schools ought to accomplish rather than regulating the practices of schools and teachers seemed a goal worth fighting for, but not one that was likely to be achieved very quickly. The standards movement was part of an emerging politically centrist coalition: Republicans and Democrats espoused the idea in roughly equal numbers. “Education governors,” many from southern states in need of an achievement boost, joined forces with leading business organizations to promote higher expectations and a better-educated workforce. Civil-rights advocates were initially skeptical, but many saw the potential power of a reform movement that would not brook separate and lower expectations for poor children, immigrants, or racial minorities. Educators, by contrast, were largely absent from the early coalition for standards. Most assumed that this new idea, like so many previous fads, would soon pass, and they went about teaching in traditional ways.

But the idea did not go away. To a remarkable degree, standards-based education has become national policy. All states except one have created statewide standards. Most have developed statewide tests. Some use these tests to create “high stakes” for students (preventing them from advancing to the next grade or graduating) or for educators (taking over underperforming schools, requiring the schools to accept external assistance, or simply shaming them by identifying them as poor schools). Federal funding for education is tied to requirements for standards and assessments. Citizen groups, business alliances, and nonprofit agencies of various kinds are organizing to support standards-based teaching and learning in the schools in their regions. Organizations that offer services to the schools, including government agencies and various nonprofit and for-profit companies, promote their programs as ways to improve students’ performance on the tests.

When a policy change this sweeping happens so quickly (a decade is a very short time for a change of this scope, especially in America’s highly decentralized educational environment), it should come as no surprise when critics appear on multiple fronts. The execution of the standards-based educational vision has been uneven in quality, providing plenty of grist for the critics’ mills. Few states have followed through with the curriculum and professional-support programs promised by the early standards movement. In practice, raising standards sometimes looks more like punishing teachers and students than serious educational work. The quality of standards and tests is uneven; the tests are often not aligned with the standards they claim to measure. Finally, longstanding inequities in the distribution of educational resources have not been eliminated, so it is easy to argue that the standards movement is unfair to some children.

Who Should Set Standards?

One divisive issue in the standards controversy is fundamentally political. The question is, Who should make decisions that touch people’s lives? Some want this authority to remain as local as possible, vested in caring, face-to-face communities, not in a distant national or state government or even a downtown school board or mayor’s office. Others believe that representative bodies should set expectations and then “manage by results” instead of attempting to control the processes and procedures of schooling. In Will Standards Save Public Education? Deborah Meier argues that only those who actually know a particular group of students—the teachers in a school, those students’ parents, perhaps a school-level governing board or an advisory group—have the right to set expectations and standards for them. She explicitly rejects the argument that a representative group of citizens, acting from a distance, has a legitimate right to decide what students ought to learn.

Meier’s is a radical local control argument, local down to the individual school. Local control is a popular phrase in American political discourse. It is, by and large, invoked not by the liberal end of the political spectrum (which is Meier’s natural home in other matters) but by conservatives, even the radical right. A few years ago, standards were being attacked by Christian conservatives fearful of the outcomes-based-education movement, which appeared to be succeeding in making a particular set of liberal expectations about attitudes and values the official approach. As a quick way of blocking standards they didn’t like, members of the radical right appealed to the notion of local control, set against the specter of a federally controlled curriculum. At the same time, they promoted various versions of school-by-school choice for parents. Charter schools and home schooling, along with voucher programs, have developed as the political expression of this drive for choice and variety.

No amount of getting standards “right” will make much difference when states and districts are calling for teachers to raise scores on tests that do not match the standards anyway.

Meier, committed as she is to public education, does not want to opt out. She has spent her career building alternative schools entirely within large, urban public systems. In effect, however, she wants the public system to operate as if it were clusters of charters, free to set their own standards within only the broadest definitions of public requirements. Meier seems to be well aware of the odd political company she is keeping in her fight against officially imposed standards. She admits that localism will permit “wrong” (in her view) decisions to be made. But, she says,

I think the benefits of greater autonomy and more local control outweigh its dangers when it comes to schools. My long-term fundamental faith in local democracy, and a recognition that there are sufficient incentives in place—[such as the requirements of employers and universities]—that keep school standards from straying too far afield makes me rest easier than my critics.

Susan Ohanian goes beyond Meier by wholly rejecting external standards, claiming that only a child’s individual teacher can know what is right for that child. Ohanian’s book has little of Meier’s thoughtfulness and no discussion of whether the polity—however local—needs to develop shared commitments. Her title, One Size Fits Few: The Folly of Educational Standards, tells most of what the book has to say. She is concerned especially with individual children who do not fit standard categories. Claiming to speak for her fellow teachers, she portrays them as “pursued” by the boards setting standards and claims that a love of children is the primary (only?) criterion of being a good teacher. This is a breathless polemic, filled with horror stories, from accounts of meetings of state standards committees in Illinois and California to tales of children unfairly prevented from graduating, with passing accusations about the businesses and politicians who will somehow benefit from all of this standardisto activity. It is hard to discern Ohanian’s central argument—other than “Leave us alone; we know what is best for children.” Nonetheless, the book contains some ideas worth listening to, some warnings that responsible promoters of standards-based education systems need to heed. Among these is the likelihood that, in the race to meet standards, schooling will be reduced to following schedules and scripts; that time spent with literature and rich language will be driven out by vocabulary workbooks; that learning science will become merely the memorization of a string of definitions of technical terms; in short, that American anti-intellectualism (on which I have more to say) will triumph in a haze of bean counting.

The Standards Themselves

This brings us to a second great fault line in the theory and practice of standards in education: conflicting ideas about the proper substance of American schooling. Suppose we agree that centralized standard setting is appropriate in America’s representative democracy. Then we need to face the many differences of opinion concerning what schools ought to be teaching, to whom, and even how. Here we run up against another longstanding divide in American life. Visitors from other countries, from Alexis de Tocqueville onward, soon notice the apparent anti-intellectualism of American society. As a people, we don’t seem to value highly the complexities of knowledge, nuances of language, and details of interpretation. We like quick arguments and quick decisions, not extended reasoning. We are a practical people with a taste for getting things done, not talking about them. Ideas for their own sake do not tempt many of us.

This anti-intellectualism is widely reflected in our schools, even in our colleges and universities. The American school has long focused on a basic-skills curriculum, often repeated into high school. Only a minority of top students are exposed to a program that is intellectually challenging. For many, the standards movement represents an opportunity to weave into the fabric of our schools a more intellectually demanding program. We now have scientists, mathematicians, historians, and professors of literature joined with business representatives and educators in debating what students in secondary and elementary schools should be studying. America has not seen this much attention to the content of schooling in nearly a century.

These debates highlight longstanding disagreements over the meaning of “higher intellectual demand.” Traditionalists believe that there is a core body of knowledge that all students ought to learn: mathematical and scientific concepts, historical facts and interpretations, books that are part of our shared American heritage. Traditionalists call for building the curriculum around such concepts and shared texts. Progressives are less interested in the specific knowledge or the texts to be studied; they are more interested in students’ ability to use that knowledge and to find new information when necessary. Progressives often argue that there is too much information to imagine that anyone can master all of it, and they are—by and large—less concerned that everyone share a common core of knowledge.

The execution of the standards-based educational vision has been uneven in quality, providing plenty of grist for the critics’ mills.

Each side readily cites the abuses of the other. Traditionalists can point to empty, process-oriented teaching, where there is little accountability for getting the facts right as long as students are actively engaged. There is plenty of fuel for their attacks, although none of the intellectual leaders of the progressive camp condones such ways of teaching. Meanwhile, progressives can point to classrooms in which students read badly written textbooks or spend too much time memorizing isolated bits of knowledge. Again, such classrooms exist in abundance, but they are not what thoughtful promoters of traditional forms of education have in mind. In large measure, they want to recreate the excitement they remember from their advanced-placement courses and university education.

Alfie Kohn speaks for the progressive point of view, and he halfway succeeds in making a convincing case. In The Schools Our Children Deserve: Moving Beyond Traditional Classrooms and “Tougher Standards,” he argues in favor of schools in which students are intellectually engaged and encouraged to grapple with rigorous problems: schools, in other words, in which correct answers matter, but so does reaching those answers through a complex process that may involve making errors and misunderstanding concepts along the way. Images of such schools are set against descriptions of schools in which intellectual engagement is driven out by rote memorization and motivation by grades and test scores. These are dubbed traditional schools. Kohn knows the research literature, especially on motivation to learn, and generally gets the arguments right: too much working for points and grades reduces the potential pleasure of mastering knowledge and ideas. It also creates conditions in which students view challenging problems or tough courses not as opportunities to learn but as risky endeavors that may harm their chances of graduating to the achievement ladder’s next rung. Kohn also shows that many of today’s standardized tests stress disconnected facts and skills over the more intellectually demanding materials of a genuinely progressive educational diet.

However, Kohn doesn’t grapple with the important role that knowledge plays in the development of intellectual capacity. Research on learning over the past three decades makes it abundantly clear that:

Knowledge matters. What one already knows is the foundation for new learning as well as expert performance. It pays to build a foundation of knowledge in any area of human competence that is valued. The idea that people can learn generalized skills for learning, reasoning, and gathering information and then get the facts later doesn’t survive scientific scrutiny. At the same time,

Active processing of information is the only reliable way to acquire knowledge. Attempts to ingest knowledge by repeating lists of facts or drilling on routine procedures don’t work well, not even for the basics of multiplication tables or beginning reading, not to mention historical or scientific concepts. Acquiring information involves active intellectual work, so the idea of teaching the basics first by memorizing and drilling and later moving to “higher order” processes is equally doomed.

These twin established facts about learning have guided my framing of the concept of knowledge-based constructivism or, in language friendlier to teachers and the general public, academic rigor in the thinking curriculum. The latter is one of nine principles of learning formulated by the Institute for Learning, which I direct at the University of Pittsburgh, to help school systems build the organizational and instructional practices that will enable their students to meet higher achievement standards. As every district and school we have worked with has discovered, recognizing simultaneously the need for a solid core of knowledge and the need for active processing of information is a difficult and delicate task. But our work is providing common ground for traditionalists and progressives, because we are showing them how to hold themselves and their students accountable both for rigorous knowledge and for intelligent management of that knowledge.

The standards movement represents an opportunity to weave into the fabric of our schools a more intellectually demanding program.

Tests

The heart of an accountability system lies not in the words of standards documents but in the tests and other assessments that are used to determine whether the standards have been met. In theory, tests in a standards-based system are supposed to be "aligned" with the standards; that is, they are supposed to examine the knowledge and skills that the standards specify. In practice, however, tests and standards are usually poorly matched. Evaluations of the tests and standards of several states conducted during the past two years by Achieve, an organization of state governors and business leaders working to promote high achievement, have revealed none so far in which the tests in use are well aligned with the state's articulated expectations in math and literacy. The tests often overemphasize skills and knowledge set at lower levels than the standards underlying them. This is partly because many states rely on various incarnations of traditional American standardized tests. Some states build their own tests; some contract with testing companies; and some simply adopt commercial tests. Whatever their origin, the tests tend to look alike, both because they have the appearance Americans expect (multiple choice in form, skill-oriented in content) and because they are what we know best how to create. They are also cheaper to administer than essay exams and other forms of performance assessment.

In Standardized Minds: The High Price of America’s Testing Culture and What We Can Do to Change It, Peter Sacks builds a case against such practices. He describes an accountability movement driven by standardized tests. He offers lively, personalized examples of states and school districts (and employers) using test scores to decide whom to promote, whom to graduate, and whom to hire, while often ignoring other evidence of an individual’s competence. These are familiar arguments against testing, cases of individuals who do not “test well” or who fall just below a cut-off score for qualification. But Sacks’s much more important argument concerns the fundamental invalidity of standardized tests for guiding educational decisionmaking. He shows how tests “dumb down” the curriculum by channeling teachers’ efforts and students’ time into activities that mimic the tests (for example, filling in the bubbles on practice tests). He argues, as I did a decade ago while working to launch the standards movement, that these tests help to perpetuate the “mile-wide, inch-deep” curriculum, rather than promoting teaching that pushes for solid knowledge of important topics. Sacks also challenges the supposed objectivity of standardized tests and shows how repeated administration of very similar tests produces test-score increases that may have little to do with real changes in achievement.

Nicholas Lemann, in The Big Test: The Secret History of the American Meritocracy, builds an even more challenging case against the dominance of standardized tests in education. In a compelling account, replete with biographical as well as institutional detail, Lemann traces the history of the founding of the Educational Testing Service in Princeton, New Jersey, and the eventual hegemony of its machine-scored college-entrance exams. The arguments that are now familiar (for example, that tests meant to measure inherited aptitude for higher learning are responsive to "test prep" drills) are developed here in a personalized style that makes it hard to put the book down.

The originality of Lemann's argument lies less in what he says about the tests themselves than in his account of the vision of an engineered society that the founders and supporters of ETS (James Bryant Conant, Henry Chauncey, and Clark Kerr, to name some of the best known) had in mind. The founders' vision of a well-oiled meritocracy, with individuals slotted into educational institutions appropriate to their abilities and streamed into jobs where they could best contribute to a growing economy, apparently fit the temper of the country as it emerged from World War II and entered the cold war. Today, more than a generation after the student upheavals, the civil-rights movement, and the establishment (and then challenge) of affirmative-action programs, it seems out of date: too planned, and insensitive to America's diversity. We still believe in merit over inherited position, but we mistrust technocrats and institutional decisionmakers who attempt to control individual lives.

The SAT was first used to open the doors of elite institutions to certain middle-class and poor children who wouldn’t have gotten in under the system that favored the old boy network. Lemann shows how the engineered meritocracy later clashed with the country’s expanding recognition of its diversity and the desire of its most selective institutions to represent that diversity in their student bodies. An ultimate irony may well be in the making. To maintain diversity in higher education in the face of legal challenges to the use of different test cutoff scores for different racial and ethnic groups, major public university systems are experimenting with setting aside SATs in favor of high-school grades or other evidence of student accomplishment.

Reviving Standards

Lemann doesn't like bubble tests any more than do Sacks or the other critics of standards and testing discussed here. His most important proposal entails renouncing the idea that the primary function of schools and colleges is to sort and select, asking that they focus instead on their role as institutions for educating everyone. What would matter then is not which college one attended but rather the fact that one had the opportunity for higher education and learned something important while there. Complex systems for sorting students into college would then be far less important than the quality of what was taught to everyone, both in school and in college. Tests designed to compare students' aptitude for future learning rather than to assess what they had learned in a curriculum could be allowed to fade away, much as President Richard Atkinson has recently proposed for the University of California system.

Extended to elementary and secondary schools, such a move might rescue standards. Despite all their differences, the books reviewed here, taken as a group, make it evident that tests have "hijacked" the standards movement. In the early proposals for a standards-based system, the idea was to align all the elements of the education system (assessments, textbooks, teacher training, incentives) with publicly debated expectations of what students should be learning. In practice, however, tests, not standards, have become the centerpiece of accountability. In many parts of the country, educators spend more time analyzing tests and figuring out how to prepare students for them, often by directly teaching sample items from tests, than they do studying and understanding the standards. Where the tests are well aligned to high-quality standards and where they contain enough tasks requiring deep analysis and writing by students, matching teaching to the tests may work reasonably well, at least as a first step in reform. Where the tests are poorly aligned and consist primarily of short-answer, shallow questions (as is the case in far too many states), teaching to the test is educationally dangerous. It will lower real achievement in the name of raising scores.

For all the passion over standards, arguments between traditionalists and progressives about what a rigorous curriculum ought to look like matter less today than most people think. This is because no amount of getting standards "right" will make much difference when states and districts are calling for teachers to spend their time raising scores on tests that do not match the standards anyway. In the past decade, ample evidence has accumulated indicating that it is possible to create reliable assessments that come closer to measuring the kinds of achievement now called for in the most thoughtful state standards. These usually involve a mixture of multiple-choice and open-ended "performance" tasks. Systems for public grading of samples of students' regular class work would carry the concept of standards-based assessment still further. Such assessments cost more than bubble tests, but they push teaching in the directions intended by the standards. Only a few states are using such assessments today. More need to, or the backlash against testing will almost inevitably kill the standards as well.

-Lauren B. Resnick is a professor of psychology and director of the Learning Research and Development Center at the University of Pittsburgh.

The post The Mismeasure of Learning appeared first on Education Next.
