Dear People:

This e-mail is a long explication, sent only after a great deal of time and work, of some of the major things that are wrong with the State's School Report Cards, and of how those relate to the NCLB. Please consider it in making improvements before the Report Cards for next year. (I strongly recommend that you print this e-mail out for reference.) If you have colleagues who should have received this e-mail and didn't, please forward it to them. Thank you very much.

INTRODUCTION: AN EXAMPLE SHOWING SOME OF THE PROBLEMS WITH THE STATE REPORT CARD (& NCLB)

To more effectively convey the general points made in this text, I am including as an example the State Report Card data for Phalen Lake Elementary School in St. Paul. I've given some of the highlights from that school's profile next. (However, when I last checked, about three months ago, access to the Phalen Lake data had mysteriously disappeared from the State's website.)

Phalen Lake's demographic characteristics include: 88% Free & Reduced Price Lunch, 48% Limited English Proficient, 11% Special Education. The ethnic breakdown is: 44% Asian, 21% White, 20% Black, 14% Hispanic, 1% American Indian. Almost all of the Asian students are Hmong.

According to State/NCLB proficiency calculations, Phalen Lake is failing to make "Adequate Yearly Progress" (AYP) in Reading for 5 of the 6 groups for which it is accountable, and in Math for 1 of the 6 groups for which it is accountable. Phalen Lake is obviously a pitiful school that is failing miserably.

Or is it?

If we look at Phalen Lake's Report Card for 2004, however, AND IF we assume that the 2004 Phalen Lake 5th graders, when they were in 3rd grade in 2002, scored like the 2004 3rd graders, THEN many of the 5th graders have made some substantial gains in the last two years.
I'm using only the numbers for Phalen Lake's 3rd and 5th grades from 2004 here: What would it say about the quality of education at Phalen Lake-- which is clearly among the most demographically "challenged" schools in the State-- if the 3rd graders had gone from 29% proficient in 2002 to 53% proficient in MCA Reading as 5th graders in 2004? And if the same students had gone from 43% to 59% proficient in MCA math in those two years, while many fewer scored at the lowest level in math (31% vs. 8%)?

Would those be the results of a failing school, rightly labeled as such by the NCLB and by two stars on the State's Report Card, and rightly deserving of the final NCLB-mandated consequences? Will there be a substantially increased likelihood that student performance at Phalen Lake will improve as a result of such consequences?

Or might Phalen Lake be a pretty good school, falsely branded as failing by the NCLB and rightly deserving of more than two stars in any fair report card?

I do have to say "might" because I have only projected backwards from the data for 2004 here. But I'd still have to say "might" even if I had examined the "trend" data in the back-up portions of the State Report Card. The trend data enable comparisons of data for 3rd graders for 2002 with data for 5th graders in 2004-- but the trend data are NOT reported only for continuously enrolled students at a given school. So I can't ascertain student learning gains with any certainty, especially for a school with fairly high mobility. (From District sources, the mobility level for Phalen Lake during the previous school year, 2002-03, was 29%; stability was 88%.) That implies a turnover of roughly 25% of the school's population during the two school years between 3rd and 5th. PLUS, there will have been changes in student clientele over the two intervening summers (through Sept. 30)-- changes that don't count as mobility and don't disrupt the school year much, but changes that may well bring lower-achieving students into a school serving predominantly low-income neighborhoods.

So perhaps I'm wrong in surmising that Phalen Lake might actually be a pretty good school. But if I am wrong, what about some of these other St. Paul schools (this time from trend data):

Ames: Reading 2002 3rd (24% Proficient) vs. 2004 5th (51%)?
Bruce Vento: Reading 2002 3rd (27% Proficient) vs. 2004 5th (52%)?
Como Park: Reading 2002 3rd (26% Proficient) vs. 2004 5th (51%)?
Highwood Hills: Reading 2002 3rd (36%) vs. 2004 5th (55%)?

ONE MORE INTRODUCTORY POINT

Just as is the case in surveys repeated over time-- my field is survey research-- maintaining a degree of continuity from year to year in evaluation/assessment and reporting is a worthy goal. But when the first iterations of an effort, such as the State's Report Card, are replete with severe weaknesses, it is better to fix them sooner rather than later. Waiting only postpones major changes, makes continuity worse, and renders portions of more years' data of little value to the record.

POINT #1. MINNESOTA NEEDS TO GIVE CREDIT ON STATE REPORT CARDS FOR LARGE GAINS IN PROFICIENCY

It is all the more essential to give credit for learning gains because the NCLB will be labeling ("branding"?) as failures (i.e., as not making AYP) more and more elementary schools, including some that produce large gains. In those schools the levels of achievement of one or more subgroups cannot be ratcheted high enough, because the challenges are simply too great for students to reach the mandated levels of proficiency in the time allotted (e.g., by 3rd grade). Major State initiatives for students who need pre-schooling and English-language learning before kindergarten, and more funding for educational programming after school, would help student achievement considerably in such schools.
But if schools with challenges like Phalen Lake are to be judged fairly, learning gains/value-added must be the major factor in those judgments.

POINT #2. HOW TO REPORT LEARNING GAINS ON STATE REPORT CARDS-- AN ELEMENTARY EXAMPLE

For the NEXT Report Card on schools and districts, the public needs a separate section that shows, for both MCA Reading and MCA Math, the scores for continuously enrolled students who were 3rd graders in Spring 2003 and 5th graders in Spring 2005. At minimum, the percentage who score as "proficient" should be shown, but much more information is conveyed, including the extent of growth beyond level 3 (minimum proficiency), by the current bar graphs showing all five scores.

With respect to elementary schools and these learning gains, there is no need to wait for a more perfect system to be researched/developed-- refinements made subsequently should result in a part of the report card system very close to what is proposed here. Moreover, it is unfair to wait, because damage is being done NOW to some school reputations and teacher morale, and parents are being misled as to choice of schools.

(Note: The approach suggested here is to show the percentage proficient over time for continuously enrolled students. Differences between new and old tests/standards, and methodological questions involving whether the MCA is designed to measure gains, don't really matter much in our actual context. The reason is that far higher stakes are already associated with school proficiency percentages, and if you can sanction or condemn a school based on the level of achievement indicated by those numbers, you can certainly examine and report its learning gains based on the same data. The State is responsible because it has defined "proficiency" at various grade levels, and that is the basis of NCLB judgments.)

(Further note: If State level data cannot be calculated for continuously enrolled students, that too should be fixed.
The current "trend" data can be used to provide an estimate but, as described above, trend data would likely produce under-estimates of student learning gains at schools where mobility is high, where new very low-income residents enter the neighborhood while economically more successful residents leave it.)

POINT #3. THE FAILURE TO INCLUDE LEARNING GAINS/VALUE-ADDED IN THE STATE'S REPORT CARD EXACERBATES THE INEQUITIES IN THE NCLB

Excluding proficiency-related learning gains and other forms of value-added data from the State's Report Card indicates, in effect, that the State has little or no problem with this omission in the NCLB. Indeed, by not including learning gains in its star-awarding system, the State adds to the negative branding of schools that fail to make AYP, even when students at those schools show large gains.

POINT #4. "MAKING AYP" AT HIGHER GRADE LEVELS SHOULD TRUMP "NOT MAKING AYP" IN LOWER GRADE LEVELS IN THE SAME SCHOOL (OR DISTRICT)

For example, if the 5th graders are making AYP at a school, the 3rd and 4th graders are probably en route to doing so too, and the school should not be branded as a failure. At minimum, the school should be granted safe harbor. State authorities should inquire with federal authorities whether such an adjustment is legally possible now and, if not, should urge that the NCLB be changed in this regard.

(Note: For this recommended change to work well without concealing school weaknesses by reducing the size of some demographic subgroups below the threshold of 20 students, some variations on existing criteria would need to be adopted. For example, when conducting a more in-depth examination of schools that make AYP at 5th but not at 3rd, the minimum sizes for subgroups in grades 3 and 5 would need adjustment downward, because the minimum number of students may well not be reached for all subgroups at both grade levels.
But because the data for smaller numbers of students won't be as reliable, perhaps another minimum requirement could be that there be at least a nominal upward trend in each subject tested for all subgroups numbering 10 or more students in both grades 3 and 5.)

POINT #5. TO SHOW THE VALUE-ADDED BY A SCHOOL, THE STAR-AWARDING CRITERIA FOR STATE REPORT CARDS SHOULD INCORPORATE RESULTS OF COMPARATIVE ANALYSES FOR ALL SCHOOLS, USING DATA FROM SIMILAR SCHOOLS AND POPULATIONS, OR A REGRESSION ANALYSIS

In this text, comparative performance data means comparisons are made between schools with very similar demographics, or between demographic subgroups within a school that are compared with the same subgroups in more than one other school. Probably the best way to carry out such a comparative analysis is a regression-based approach.

POINT #5A. DIFFERENCES BETWEEN ELEMENTARY & SECONDARY SCHOOLS AFFECT COMPARATIVE ANALYSES; VERY HIGH-ACHIEVING SCHOOLS MAY NEED DIFFERENT TREATMENT

(Note: In a regression analysis, "residual gain" is the extent of achievement at a school that is not explained by prior achievement or by demographic factors-- so the residual gain is the extent of achievement presumably attributable to the school. Residual gains are a measure of the extent of value-added. If a school's residual gain is positive, it has, compared to other schools of its kind, added more value to student achievement; conversely, when residual gain is negative, comparatively less value has been added by that school.)

ALSO RELEVANT IN ELEMENTARIES: A value-added analysis (i.e., not just learning gains) is relevant in the elementaries too. There is no other measure of how well elementaries do in their first three grades (prior to the first MCAs).
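
To make the "residual gain" idea in the note above concrete, here is a minimal sketch in Python. It is an illustration only, not the State's method or any established model: the school names and numbers are invented, and it uses a single hypothetical predictor (`pct_low_income`), whereas the analysis described in this letter would control for prior achievement and several demographic factors at once.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def residual_gains(schools):
    """Residual gain = actual gain minus the gain predicted from demographics."""
    xs = [s["pct_low_income"] for s in schools]
    ys = [s["gain"] for s in schools]
    slope, intercept = fit_line(xs, ys)
    return {
        s["name"]: s["gain"] - (intercept + slope * s["pct_low_income"])
        for s in schools
    }

# Hypothetical schools: "gain" is the percentage-point change in proficiency.
schools = [
    {"name": "School A", "pct_low_income": 20, "gain": 10},
    {"name": "School B", "pct_low_income": 40, "gain": 12},
    {"name": "School C", "pct_low_income": 60, "gain": 8},
    {"name": "School D", "pct_low_income": 80, "gain": 18},
]

for name, residual in residual_gains(schools).items():
    label = "more" if residual > 0 else "less"
    print(f"{name}: residual gain {residual:+.1f} ({label} value added than predicted)")
```

A positive residual means the school gained more than schools with similar demographics would be expected to gain; a negative residual means it gained less.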
A value-added (regression) analysis would look at the elementary schools' value-added (residual gains) after controlling for the relevant demographic factors, such as percentage of students in ELL status, in Special Education, and percentage of students eligible for a free or reduced-price lunch. Most, if not almost all, of the effects of mobility can be eliminated by excluding from the samples students who entered their schools after the beginning of kindergarten (or first grade?).

IMPORTANT: HOWEVER, different provisions probably need to be made for schools that have FEW educationally "challenging" clienteles, and have VERY HIGH levels of student achievement. In order not to misjudge/disadvantage these schools, sustaining high levels of initial achievement should provide an alternate way to award the stars on State Report Cards that are assigned to value-added (including learning gains).

For example, schools in the Twin Cities' wealthier suburbs, which have children mainly from higher-income families and which have high achievement levels, should be expected to sustain their relatively high(er) levels of achievement. Thus a school that has 100% proficiency in MCA Reading at grade 3 and has 97% proficiency two years later at grade 5 should be seen as having fully sustained its level of performance (because of ceiling effects and the limits of reliability). But a decline from 95% proficiency to 85% proficiency might cost a school a star. I have used proficiency levels as an example here, but a better measure might be the average scores obtained over time in Math or Reading.

POINT #6. THE SYSTEM FOR AWARDING STARS IN THE STATE'S REPORT CARD IS UPSIDE DOWN

Although the NCLB mandates public reporting of school performance by the States, the State should, insofar as legally permissible and evaluatively sound/reasonable, uncouple its "star" rating system from NCLB requirements.
Because the NCLB's approach only looks at student achievement levels (disregarding both challenge factors and student learning gains/value-added), it is a methodologically erroneous and unfair way to judge school quality, all the more so because it is tied to a system of sanctions. (Indeed, the recent federal emphasis on experimental research methods as THE way to identify educational "best practices" stands in sharp contrast to the rash presumptions involved in how the NCLB calculates AYP, makes judgments, and applies sanctions.)

AYP calculations are already reported in the State Report Cards, so why should the State reiterate the same erroneous approach, over and over again as it has been doing (as we've seen above), in its star-awarding system? Instead, the differences between the State's carefully reasoned and analyzed approach to judging school quality and the NCLB's simplistic and unfair judgments should help to highlight the weaknesses in the NCLB and increase the pressure to improve the law. Indeed, perhaps the main reason to retain the stars as part of the State's School Report Cards is to seek to show the truth about school performance.

A DUAL SCHEMA FOR AWARDING STARS ON STATE REPORT CARDS (EXAMPLE)

Although this schema will need refinement, it is as close to fair as I could make it, evaluatively speaking (while still leaning a little toward NCLB-related perspectives), so the State's final version should look a lot like this schema.

The dual track illustrated below divides schools into those making AYP and those not making AYP, but not counting Special Education students in either case. Also, within the track that is making AYP, I have not tried to illustrate criteria or "decision rules" that might pertain to demographic subgroups, but some manner of including that information in the schema should also be devised.
In the explanations below, "value-added" is meant to include both directly measured learning gains (e.g., gains in the percentage of students who are "proficient," or gains in average proficiency scores) AND the results of a regression analysis (or some other form of comparative analysis). As is the case currently, a separate grade needs to be calculated for math, reading, etc.

TRACK 1: MAKING AYP (excludes Special Education)

5 Stars: Very high to stellar level of achievement that is sustained from year to year; or
4 Stars: High level of achievement that is sustained from year to year (and value-added near
3 Stars: Medium level of achievement that is sustained from year to year (and value-added near
2 Stars: Medium level of achievement but significant decline from previous year and value-added
1 Star: Medium level of achievement but substantial decline from previous year; or

TRACK 2: NOT MAKING AYP (also excludes Special Education)

5 Stars: Stellar value-added (e.g., an average 25 percentage point gain in proficiency from grades
4 Stars: High overall value-added (e.g., 15-24 percentage point gain in proficiency from grades
3 Stars: Positive overall value-added (e.g., 5-14 percentage point gain in proficiency from grades
2 Stars: Value-added near average (e.g., -5 to +4 percentage point change in proficiency from
1 Star: Value-added significantly to substantially below average (e.g., loss of -6 or more

IMPORTANT NOTES:
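
As an illustration only, the example percentage-point bands that survive in the Track 2 list above can be written as a small decision rule. This sketch covers nothing beyond those example numbers. The function name `track2_stars` is hypothetical, and the handling of fractional values between bands (for instance, treating 4.5 points as two stars) is an assumption, since the list gives whole-number ranges only.

```python
def track2_stars(gain_points: float) -> int:
    """Map a percentage-point change in proficiency to Track 2 stars.

    Uses only the example bands listed in the schema: 25 or more -> 5 stars,
    15-24 -> 4, 5-14 -> 3, -5 to +4 -> 2, a loss of 6 or more -> 1.
    Assumption: each band's lower bound is inclusive for fractional values.
    """
    if gain_points >= 25:
        return 5
    if gain_points >= 15:
        return 4
    if gain_points >= 5:
        return 3
    if gain_points >= -5:
        return 2
    return 1

# Phalen Lake figures quoted earlier in this letter. These are projected
# backwards from 2004 data, not tracked for continuously enrolled students.
reading_gain = 53 - 29  # 24 percentage points
math_gain = 59 - 43     # 16 percentage points
print(track2_stars(reading_gain), track2_stars(math_gain))
```

Under these example bands, both of the projected Phalen Lake gains would fall in the four-star range, compared with the two stars the school currently carries.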
POINT #7. NCLB JUDGMENTS BASED ON ONE SMALL SUBGROUP ARE QUESTIONABLE

While the subgroup-oriented analyses mandated by the NCLB are extremely valuable in terms of forcing schools to pay attention to all of their subgroups, to achievement gaps, etc., perhaps there should be a safe harbor-like status for schools in which only one small subgroup fails to make AYP-- especially the Special Education subgroup (or if an ELL contingent consists of many students relatively new to the country). However, my statement here assumes that proficiency levels, or learning gains/value-added, are otherwise at least average in comparison to similar schools.

POINT #8. NEVER 100% IN ALL OF HISTORY?

And as we saw above, even some good schools will get branded as failures early on, and it won't be long now before these schools too (just like some schools that aren't so good) get reconstituted as part of NCLB's consequences-- to the detriment of their student learners. What good does that do?

Although proficiency LEVELS should remain ONE of the foci of a revised NCLB, there are more reasonable and realistic approaches-- given the extremely large differences in student demographics across the State and country, and from urban to suburban to rural areas-- and given the powerful general effects of low-income status, mobility, and having other than English as a home language.

The NCLB needs to be revised to include much more emphasis on value-added/learning gains, mainly for continuously enrolled students. (And, where achievement is high by 3rd grade, the emphasis should be on maintaining those high levels of performance over succeeding grades.) And perhaps a more realistic NCLB goal would be 80-90% proficiency by 8th grade, counting only those students who have been continuously enrolled in a district for several years.
Even among students who are continuously enrolled, there will always be some who have major personal, psychological, social, medical or family problems, who lack motivation, who rebel against authority and the regimentation of schooling, who get too involved in their social life, their work life, or with the law, who aren't very intelligent but aren't classified as Special Education, etc. Thus there will always be some students for whom achievement in one or more required subject areas will not match the State's schedule for attaining proficiency. This country has long recognized this situation in its design of social/educational institutions adapted for life-long learning.

Jeff Koon, Ph.D.

Background on Jeff Koon:
Jeff can be reached at 651-647-9199.