Finding the Flaws in Claims about School Choice: What Do We Really Know About School Choice and Student Outcomes?


School choice—a resounding success! Or is it?
Across the nation, the popular rhetoric used to describe school choice is glowing. Describing Connecticut’s choice system, newspaper headlines proclaim, “[Choice programs are] a major contributor to closing the achievement gap” 1 and “Students [in school choice programs] are improving each year!” 2. The much-discussed full-length documentary Waiting for Superman holds charter schools and parent choice up as the last hope for our urban students to succeed 3. But in reality, many of these assertions rest on a faulty comparison. The current rhetoric in the public sphere about choice schools and student performance does not account for selection bias.

Measuring the achievement impact of choice schools on students, relative to traditional public schools, is very difficult. The only true comparison would require a parallel universe: one could then compare students who attended a choice school in one universe with the very same students who simultaneously attended a traditional public school in the other. If this technique were possible, many researchers would be out of a job.

This article points out the flaws in many evaluations of choice schools and highlights several ways to mitigate them and improve school choice analysis. Additionally, using a robust data set, I provide an original analysis that accounts for some of these issues and situates the findings in a broader context.

Selection bias—the problem that plagues all school choice studies
To investigate the effect that school choice has on student outcomes, researchers leverage statistical tools to try to make the most accurate comparison possible. The issue we are most concerned about when making this comparison is selection bias. Selection bias occurs when the population of students you are looking at is not random but self-selected. In the school choice debate, we worry about selection bias because the families who choose to apply to and attend a charter school may be even slightly different from the families who keep their kids in traditional public schools. The problem arises when we try to compare these two groups: the difference we observe in test scores may really be due to dissimilarity in family characteristics rather than to the effectiveness of choice or traditional public schools. Herein lies the challenge: How do we make a true comparison of student outcomes between choice schools and traditional public schools?
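Before turning to remedies, a toy simulation makes the danger concrete. This is only an illustrative sketch, not part of the original analysis: it assumes an unobserved family trait ("involvement") that both raises the chance of choosing a charter and independently raises scores, while the true charter effect is set to exactly zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
involvement = rng.normal(size=n)                   # unobserved family trait
p_charter = 1 / (1 + np.exp(-involvement))         # involved families opt in more often
charter = rng.random(n) < p_charter
score = 50 + 5 * involvement + rng.normal(scale=10, size=n)  # zero true charter effect

naive_gap = score[charter].mean() - score[~charter].mean()
print(f"naive charter 'effect': {naive_gap:.2f} points")  # positive despite no true effect
```

The naive comparison reports a sizable "charter advantage" even though, by construction, charters do nothing; the gap is pure selection.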

Virtual twin method—one way to minimize the impact of selection bias
The CREDO team at Stanford University came up with a method called the “virtual twin” to try to make better comparisons. The CREDO reports use measurable student characteristics and prior achievement to match students in charter schools with students who attend public school in the same school district. For example, CREDO compares two students with similar prior test scores, both coming from low-income, high-parental-education families, but one now attends a charter school and the other a traditional public school. They do this with many pairs of students, or “twins,” to curb selection bias and make a better comparison between the two school types. Using this methodology in 2009, the CREDO team found that only 17% of charter schools outperformed traditional public schools, while 46% did worse and 37% showed no statistically significant difference. 4 They repeated the study on a slightly larger sample of students in 2013 and found that charter schools on average performed slightly better than in the 2009 study 5, but that at the end of the day, an average charter school is just average.
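A minimal sketch of the matching idea follows. This is in the spirit of, not identical to, CREDO's actual algorithm: pair each charter student with the nearest traditional public school student on observed covariates and compare mean outcomes. The DataFrame df and all column names here are assumptions for illustration.

```python
import pandas as pd
from sklearn.neighbors import NearestNeighbors

covariates = ["prior_score", "low_income", "parent_educ"]  # matching variables
charter = df[df["charter"] == 1]
public = df[df["charter"] == 0]

# For each charter student, find the closest public school "virtual twin"
nn = NearestNeighbors(n_neighbors=1).fit(public[covariates])
_, idx = nn.kneighbors(charter[covariates])
twins = public.iloc[idx.ravel()]

effect = charter["score"].mean() - twins["score"].mean()
print(f"matched charter-public score gap: {effect:.2f}")
```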


The virtual twin methodology is not perfect, because not all factors can be matched. There may still be unobservable differences between students who attend charter schools and their public school peers. For example, a family that takes the time and effort to apply to a charter school might be more involved in their student’s education than a family that simply sends their student to the neighborhood school, and that might be why we see choice school students performing better than traditional public school students. In other words, the result may be driven by the unobservable characteristics of the students who attend charter schools rather than by the actual effect of the charter schools themselves.

Randomization—another way to address the problem of selection bias
Using another method to mitigate the issue of selection bias, some researchers take advantage of the randomization inherent in a charter school lottery. When charter schools receive more applications than spots available, they are required to hold a randomized “lottery” to determine which students receive a spot. In a large study of charter schools, Gleason et al. (2010) compared the achievement of students who won charter lotteries and attended charter schools to students who lost charter lotteries and attended traditional public schools. Since the lotteries are random, we assume that, on average, there is no difference between the students who won and those who lost the lottery 6.
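Analytically, the lottery design is simple: because admission offers are random, a difference in mean outcomes between winners and losers estimates the effect of the offer. A sketch, with hypothetical column names:

```python
from scipy import stats

winners = df.loc[df["won_lottery"] == 1, "score"]
losers = df.loc[df["won_lottery"] == 0, "score"]

t_stat, p_value = stats.ttest_ind(winners, losers, equal_var=False)
print(f"winner-loser gap: {winners.mean() - losers.mean():.2f} (p = {p_value:.3f})")
```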


Randomized trials are the closest one can get to a perfect comparison. The methodology helps mitigate the selection issue present in the CREDO study, since both groups being compared, the winners and the losers, share the unobservable characteristics that lead a family to apply to a charter school. The Gleason study finds, on average, no statistically significant impact of charter schools on student achievement. Similar to the CREDO studies, Gleason reports positive outcomes for students from low-SES backgrounds. But even this randomized design has its limitations. For example, only schools that receive more applications than spots hold a lottery; therefore, the charter schools analyzed in this study were those that received many applications, potentially meaning they were better-than-average charter schools.

Big Data Analysis—a third method to account for selection bias
I set out to find a different method to add to the current understanding of the effect school choice has on student outcomes, taking into account the main issues involved in investigating those outcomes, including selection bias and the unobserved factors that come with it. Increasingly, researchers are collecting data about students over time in what are referred to as longitudinal studies. These studies often capture data about large numbers of students via surveys, resulting in large data sets. I decided to use one such data set, the High School Longitudinal Study of 2009 (HSLS:09), and, with a variety of its variables on student achievement, family background, and school characteristics, I wanted to see if I could shed light on the school choice debate.

The HSLS:09 data set comprises nearly 24,000 9th graders selected randomly from 944 schools. Students, parents, teachers, administrators, and counselors are all surveyed to collect a wide variety of data on both the students and their learning environment. This multi-level survey data, in concert with students’ test scores, provides a rich data set for analysis. For an extended explanation of these data, click here.

One of the main issues with survey data is that it is impossible to account for every potential factor that determines student achievement. To isolate the true effect of participating in a school choice program, it is necessary to hold constant every other potential difference between students. This is obviously an impossible task, especially considering the many unobserved and unmeasurable factors present, such as differences in student motivation or innate ability. However, there are analytical and statistical strategies that help control for these differences and isolate the true relationship between school choice and student achievement. I used a variety of student, parent, teacher, and school controls to try to measure the underlying components that affect a student’s test score.

Assumptions
I set out assuming that five factors are most important in determining student achievement. They are: 1) whether or not a student attends a choice school, 2) a student’s demographic characteristics, 3) a student’s motivation, 4) a student’s parental characteristics, and 5) a student’s teacher characteristics. If we had the data and could measure all of these underlying factors, we could make a convincing case that our estimates truly measure the effect of choice on student achievement.

Unfortunately, many of these underlying constructs are unobservable, not measured, or layered with complexity. To mitigate this issue, I used factors I could measure that get at each underlying construct and are highly correlated with the unmeasured factors. For example, when looking at student motivation, I controlled for whether students think getting good grades is important and whether students think they will graduate from college. The hope is that high student motivation, an unobservable characteristic, overlaps sufficiently with thinking good grades are important and expecting to graduate from college for those measures to serve as proxies. For the results to be reliable, these relationships need to be strong but not necessarily perfect: when working with a very large number of students, as I was, such factors will on average account for motivation. The rest of the proxies are displayed in table 2. Click on each underlying construct for a deeper examination of the variables used to measure them. Click here for a full breakdown of the model used in estimation.

Click to read more on how each construct is represented: true achievement, school choice, family background, student motivation, parent characteristics, or teacher characteristics.

The Tricky Bit—How to Account for Selection Bias
Given these data and techniques, how did I compare students in choice schools to students in traditional public schools, knowing that the difference in that enrollment decision might stem from some unobservable characteristic that obscures the true comparison between the two groups?

My hypothesis going into this study was that, when first looking at the effect of choice schools on student achievement, I would see a positive effect because of selection bias; I expected that students in choice schools would be systematically different from those in traditional public schools due to parental factors that affected their selection of a choice program. However, I expected that after explicitly controlling for parental characteristics, and thereby making a much more valid comparison between students in both types of schools, the initial positive result would not persist. This hypothesis is consistent with past studies suggesting that parents who send their students to choice schools are on average more involved in their students’ education, which in turn affects achievement (Gleason et al., 2010; Ballou et al., 2007; Betts et al., 2006; CREDO, 2009).

To control for this confounding factor, I used a set of controls capturing a wide variety of parent involvement. I considered whether or not a parent attends any meeting at the school, a parent-teacher organization meeting, or a parent-teacher conference. I took into account whether a parent volunteers at their child’s school or helps fundraise for it. I also considered parents’ expectations of how far they think their student will get in school, and whether or not they help their student with homework. My assumption was that, together, these variables overlap sufficiently with the unobservable characteristics of choice school families that affect student achievement. Although these factors do not directly measure the underlying construct, I argue that they signal and proxy for the unobserved ones.

The strength of this approach is that it addresses the issue that comes into play with the virtual twin methodology, selection bias, and it gets around some of the main issues of randomization, including looking only at over-subscribed schools. The weakness of the method is that it relies on the strength of my proxies without any way to verify that they sufficiently account for selection bias. I argue that the variables above account for enough of the underlying determinants of student achievement for the results to be unbiased.

Click here for the descriptive statistics of all the variables used in estimation.

Findings
Using data from the High School Longitudinal Study of 2009 (HSLS:09) and the methodology above, I indeed found that an initial look at the relationship between participation in a school choice program and student learning shows a positive effect for students of low socioeconomic status. This result explains some of the promise and glamour that the idea of school choice receives. However, after using more robust methods and explicitly controlling for the differences between the students and families that chose choice programs and those that did not, the once-promising result disappears.

To arrive at this conclusion I first compared the achievement of students who went to choice schools to that of students who went to traditional public schools while accounting for their race, socioeconomic status, and intrinsic motivation. I found that attending a choice school had a positive impact on students from low socioeconomic backgrounds. Results based on simple comparisons like this are constantly held up in the media as evidence of the positive impact of school choice. To address selection bias and the possibility that unobserved parent characteristics explain why choice students appeared to perform better in my first comparison, I next also accounted for the parent-related variables. Using these controls, I found that, on average, students in choice schools perform no better than students in traditional public schools. This result confirms my hypothesis and corroborates other literature indicating that, after accounting for selection bias, choice schools on the whole do not outperform traditional public schools. Lastly, when accounting for teacher quality, the results remain the same. Click here to see the full table of regression results.

In summary, looking at the simple relationship between choice schools and student achievement, I found a positive effect of choice schools, consistent with popular claims made in the headlines. However, when accounting for the observed and unobservable differences between students, these once-promising results do not persist.

The Limitations
There are limitations to this study. First, without random assignment there is no way to be sure that we fully accounted for selection bias. I can make an argument, and I hope that I have, that my methodology accounts for selection bias, but we will never know for sure. Second, beyond selection bias, we do not know whether there are other unaccounted-for factors affecting achievement that differ systematically between students in choice schools and students in traditional public schools. Researchers call this omitted variable bias, and it is always an issue when working with survey data in particular. One indicator that this study may sufficiently account for these biases is that its results are consistent with randomized studies of school choice that also find no relationship between choice and student outcomes 7 8.

Additionally, it is worth noting that this study looks at choice schools on average. This does not mean that no choice schools are outperforming traditional public schools. Rather, it means that as a whole the choice school reform movement is not outperforming the status quo of traditional public schools. Further, this paper does not distinguish between types of school choice: because of data limitations, charter schools, magnet schools, and voucher programs were grouped together.

Click for more technical limitations and solutions, such as missing values, attrition, and other data issues.

The Implications
With school choice becoming increasingly popular among reformers, it is crucial to investigate its actual effect on students. Although there is a large body of existing research, it is important to keep searching for pieces of the solution as policies shift and school systems evolve. A single assessment of the choice system will not provide enough evidence on its own, but using an abundance of data and a range of techniques, we can continue to fill in more and more of the picture.

Next time you read about a school choice success, don’t accept the result outright. Consider the comparison being made, and ask: Are these two groups equivalent? Has the study sufficiently accounted for the unobservable differences between students in choice schools and students in traditional public schools?

Notes:

  1. Ken Imperato, Ajit Gopalakrishnan, and Richard Mooney, “Choice Program Data and Emerging Research: Questioning the Common Interpretations of Publicly Reported Indicators of Choice Program Success” (Magnets in a School Choice Arena, Goodwin College, East Hartford, CT, December 12, 2013), http://www.goodwin.edu/pdfs/magnetSchools/Kenneth_Imperato.pdf.
  2. De La Torre, Vanessa. “Hartford ‘Sheff’ Students Outperform Those In City Schools,” September 12, 2013. http://articles.courant.com/2013-09-12/community/hc-hartford-sheff-scores-0913-20130912_1_open-choice-sheff-region-hartford-students.
  3. Guggenheim, Davis, Billy Kimball, Lesley Chilcott, Bill Strickland, Geoffrey Canada, Michelle Rhee, Randi Weingarten, et al. 2011. Waiting for “Superman”. Hollywood, Calif: Paramount Home Entertainment.
  4. Center for Research on Education Outcomes (CREDO). 2009. Multiple Choice: Charter School Performance in 16 States. Stanford, CA: CREDO.
  5. Center for Research on Education Outcomes (CREDO). 2013. National charter school study 2013. Stanford, CA: CREDO.
  6. Gleason, Philip, Melissa Clark, Christina Clark Tuttle, and Emily Dwoyer. The Evaluation of Charter School Impacts: Final Report. NCEE 2010-4029. National Center for Education Evaluation and Regional Assistance, 2010. http://eric.ed.gov/?id=ED510573.
  7. Bifulco, Robert, Casey D. Cobb, and Courtney Bell. “Can Interdistrict Choice Boost Student Achievement? The Case of Connecticut’s Interdistrict Magnet School Program.” Educational Evaluation and Policy Analysis 31, no. 4 (December 1, 2009): 323–45. doi:10.3102/0162373709340917.
  8. Gleason, Philip, Melissa Clark, Christina Clark Tuttle, and Emily Dwoyer. The Evaluation of Charter School Impacts: Final Report. NCEE 2010-4029. National Center for Education Evaluation and Regional Assistance, 2010. http://eric.ed.gov/?id=ED510573.

Technical Document


Data:
The data set used in this paper is the High School Longitudinal Study of 2009 (HSLS:09), a national longitudinal survey that follows individual high school freshmen through school and on to further educational pursuits and/or the work force. The sample comprised 944 schools, where administrative and support staff, over 23,000 students and their parents, and one math and one science teacher for each student were questioned. The schools were selected randomly first, then 9th graders were randomly selected within those schools. Students were first surveyed in the fall of 2009, the base year, by administering cognitive math and science tests, logging experiences, and recording aspirations. The survey followed a multilevel model, collecting information by questionnaire from multiple sources such as students, their parents, their teachers, their librarians, and their schools. Two and a half years later (11th grade), in spring 2012, the same students were re-tested and a new round of information was collected. A short round of data collection focused on transcripts and college planning happened when these students graduated in the spring of 2013; however, these data are not yet available. The fourth data collection is scheduled for 2016 and will collect information about postgraduate trajectories, earnings and employment, postsecondary education outcomes, and more. The study culminates in 2021, when these same students will be interviewed about transitions to the labor force, educational attainment, and future plans (NCES, 2009).

Model:
This project suggests, using the existing education literature, that true underlying high school math achievement can be modeled as follows. We represent the math achievement of student $i$, $A_i$, as a function of school type (choice school or not), $C_i$; family background, $F_i$; student motivation, $M_i$; parent characteristics and support, $P_i$; and teacher quality, $T_i$. The hypothesized underlying achievement function can be expressed as:

$$A_i = \beta_0 + \beta_1 C_i + \beta_2 F_i + \beta_3 M_i + \beta_4 P_i + \beta_5 T_i + \varepsilon_i \qquad (1)$$

where $\varepsilon_i$ is a mean-zero, normally distributed error term.

It is important to note that this model captures aspects of the students themselves, their school, and, arguably most importantly, their home life. In the true model these vectors would be filled with variables containing accurate data. In reality, however, many of these desired variables are unobserved or not measured. This leaves us with the option of omitting unobtainable variables, resulting in substantial bias, or measuring what we can and using proxies for the rest. If our proxies are good, we can obtain unbiased and consistent estimates. This project fills each of these vectors with variables with this goal in mind.
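A minimal sketch of how equation (1) is taken to data once each vector is filled with proxies, written with Python's statsmodels for illustration (the project itself used Stata). The variable names follow the descriptive statistics table below; the subset of controls shown is abbreviated, and the DataFrame df is an assumed HSLS:09 extract.

```python
import statsmodels.formula.api as smf

# OLS estimate of equation (1), with proxies standing in for each construct
model = smf.ols(
    "mathgain ~ choice"                          # school type
    " + ses + stucol"                            # family background
    " + studentexpect + goodgrades"              # student motivation proxies
    " + pexpect + hwhelp + anymeeting"           # parent characteristics
    " + teachexpect + experience + advdegree",   # teacher quality proxies
    data=df,
).fit()
print(model.summary())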

Dependent variable
Math achievement was tested on two occasions, first in the base year of 2009 and again two and a half years later in 2012. The test administered uses item response theory (IRT). IRT rates the difficulty of each item by comparing the likelihood of students correctly answering some items against others. After the difficulty of all the items has been set, the ability of each student is estimated even if the actual assessments are different (NCES, 2009). This allows us to compare across different forms of the test. These scores are not integers because the IRT calculation yields the probability of a correct answer on each item rather than simply counting right and wrong answers.

The dependent variable is mathgain: the sample member’s gain (or loss) in their math IRT estimated number-right score between the base year (9th grade) and the first follow-up (11th grade). This measure of test score gain is critical for controlling for innate cognitive ability. Because innate ability is time invariant, it enters both the 9th and 11th grade scores, so subtracting each student’s past score from their current score differences it out. This is a crucial step for most education analysis, because omitted variables correlated with the independent variables can cause substantial bias in our estimates. Using test score gains also allows us to move past the limitations of static cross-sectional data and take advantage of the longitudinal data structure.
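In code, the differencing step is a single line; a sketch using the base-year (mathBY) and first follow-up (mathF1) score names from the descriptive statistics:

```python
# Gain scores: time-invariant ability appears in both scores and cancels out
df["mathgain"] = df["mathF1"] - df["mathBY"]
```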

Independent Variables
-choice school
The main focus of this project is examining the effect that attending a choice school (charter, magnet, or voucher program) has on student achievement. Whether or not a student attends a choice school is measurable. choice_school is a variable equal to one if the school the student attends participates in a public school choice program. A school choice program is defined in the question as a magnet school, charter school, or school voucher program. This question is answered by a school administrator, which in most cases is the principal. This fact is important because many students, parents, and even teachers might not actually know how their school is classified in terms of choice programs.

In addition to this dummy variable, this project also includes two interaction terms to dig deeper into the relationship between choice and student achievement. The literature suggests choice schools might not affect student achievement as a whole, but they have been shown to benefit students of color and students from more disadvantaged families (Gleason et al., 2010). choice_stud_col is the interaction term between choice and minority status, to see if there is a differential effect of choice when the student identifies as Black, Hispanic, or mixed race. choice_ses is included to see if attending a choice school has a differential effect for students from the bottom two socioeconomic quartiles. Together these interactions help investigate key details from the choice literature; their construction is sketched below.
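The interaction terms themselves are just products of the corresponding dummies. A sketch, assuming 0/1 indicator columns with these (hypothetical) names:

```python
# Differential choice effects for students of color and low-SES students
df["choice_stud_col"] = df["choice_school"] * df["stud_col"]
df["choice_ses"] = df["choice_school"] * df["low_ses"]  # bottom two SES quartiles
```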

-family background
In order to control for family background this project uses a socioeconomic status construct. This composite variable comprises parents’/guardians’ education, occupation, and income. The environment outside of school is very important in modeling achievement. The seminal Coleman Report highlights this fact, finding that 80 percent of the variation in achievement was within schools and only 20 percent between schools, suggesting the majority of student outcomes are determined by differences between students rather than the schools they attend (Coleman, 1966). It continues to be the case that SES is one of the strongest predictors of student achievement.

The race of the student is included in this paper as a control. stud_col is equal to one if the student identifies as Black/African-American, Hispanic, or more than one race, and equal to zero otherwise.

Student ability is controlled for by using math gain scores as the dependent variable. Because innate ability is time invariant, differencing the 2009 test scores from the 2012 scores cleans out the effect of each student’s innate ability. For more information, see the dependent variable section.

-student motivation
In order to control for students’ intrinsic motivation, which surely has an impact on student achievement, two proxies are used. goodgrades is a dummy equal to one if the student strongly agreed with the statement that good grades matter to them, and zero if they agreed, disagreed, or strongly disagreed (coded as sketched below). Because we are looking at student motivation, only students who strongly agreed with the statement are the ones we would consider motivated. This cut allows for a little more variation and distinguishes between students more effectively.
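Coding the dummy this way keeps only the strongest response category. A one-line sketch, with the raw response column name and labels assumed for illustration:

```python
# 1 only if the student *strongly* agreed that good grades matter
df["goodgrades"] = (df["grades_matter"] == "strongly agree").astype(int)
```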

Second, this project includes a continuous variable, student_expect, that measures student expectations in terms of expected years in school. Students were asked how far they expected to go, choosing among 11 options: drop out of high school, graduate high school only, attend a 2-year college or university but not finish, graduate from a 2-year college or university, attend a 4-year college or university but not graduate, graduate from a 4-year college or university, start a master’s degree, complete a master’s degree, start a PhD, MD, or other professional degree, or complete a PhD, MD, or other professional degree. These variables are included to gauge and control for the effect of student motivation on student achievement.

-parent characteristics
In order to account for selection bias, this project attempts to control for the underlying differences between parents that might affect student achievement and also whether or not a student attends a choice school. Nine variables from the parent questionnaire are included to do this. First, six dummy variables equal to one if the parent responded yes, and zero if they responded no, to the following questions: Since the beginning of the school year, have you attended a general school meeting? Attended a parent-teacher organization or association meeting? Attended a parent-teacher conference? Gone to a museum? Served as a volunteer? Participated in a school fundraiser?

Next, this project includes a continuous variable, parent_expect, that measures parent expectations for their student in terms of expected years in school. If parents have high expectations for their child, we expect they will be more involved and will try harder to help their student do well than parents with low expectations for their child’s future educational attainment. Parents were asked how far they expected the student to go, choosing among the same 11 options listed above for student_expect. Each option was assigned a number of years in school to create a continuous variable (a sketch of this coding follows). The literature stresses the importance of labels and expectations on student performance.
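A hedged sketch of turning the categorical expectations into years of schooling. The specific year values below are illustrative assumptions chosen to be consistent with the variable's reported range (11 to 25), not the study's documented coding:

```python
# Map expectation categories to (assumed) years of schooling
YEARS_OF_SCHOOL = {
    "drop out of high school": 11,
    "graduate high school only": 12,
    "attend a 2-year college, not finish": 13,
    "graduate from a 2-year college": 14,
    "attend a 4-year college, not graduate": 15,
    "graduate from a 4-year college": 16,
    "start a master's degree": 17,
    "complete a master's degree": 18,
    "start a PhD/MD/professional degree": 20,
    "complete a PhD/MD/professional degree": 25,
}
df["parent_expect"] = df["parent_expect_category"].map(YEARS_OF_SCHOOL)
```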

The variable langnoteng, equal to one if the language spoken in the student’s home is not English and zero otherwise, was included to control for the differences that students face when English is not spoken at home.

Lastly, the dummy variable help_hw was included to control for parental involvement and help with school work. help_hw is equal to one if the parent helps their child with homework every week, and zero if they never help or help less than once a week. It is worth noting that parents are not the most trustworthy reporters of their own parenting practices; they often answer more as an ideal than a reality. In all, the above proxies do a strong job of controlling for the unobserved parent characteristics we want to account for in order to move closer to a valid comparison.

-teacher quality
A variable that controls for teacher quality, or perceived teacher quality, is added to the independent variables. Teacher quality is very difficult, if not impossible, to measure. This paper includes a continuous variable, teach_expect, a composite developed by HSLS:09 that aims to capture a teacher’s perception of their peers’ expectations of students. This composite is built from questions that get at teachers’ expectations of students’ ideas, futures, and performance. On the premise that teachers can evaluate other teachers well, this project uses this composite to control for teacher quality.

In addition, teach_expr, a continuous variable measuring the number of years a teacher has taught math, is included to control for teacher quality. Experience and quality have been shown to be correlated (Murnane & Phillips, 1981; Klecker, 2002).

A dummy variable, teach_degree, was included to further get at teacher quality. A teacher’s credentials and qualifications have been shown to be one of the biggest factors in determining teacher quality (Darling-Hammond, 2000). teach_degree is equal to one when a teacher has received an advanced degree and zero if they have not. The idea is that completing an advanced degree signals a variety of characteristics, such as perseverance, extra effort, and hard work.

Finally, several dummy variables were included to account for student perceptions of their teachers. Students often have a good idea of whether they have a “good” or “bad” teacher; these variables attempt to take advantage of that fact. The dummy variables were coded equal to one if the student agreed or strongly agreed with the statement and zero if they disagreed or strongly disagreed. Dummy variables were made for the following questions: Does your math teacher value and listen to student ideas? Treat students with respect? Think all students can be successful? Make math interesting? Make math easy to understand? These are crude controls, but this project argues that together they do a decent job of accounting for teacher quality.

Descriptive statistics
Before Multiple Imputation

| Variable      | Observations | Mean  | SD     | Minimum | Maximum |
|---------------|--------------|-------|--------|---------|---------|
| mathBY        | 21444        | 40.18 | 11.97  | 15.8527 | 69.93   |
| mathF1        | 20594        | 67.22 | 19.21  | 25.0057 | 155.1   |
| choice        | 17754        | 0.209 | 0.406  | 0       | 1       |
| ses           | 21444        | 0.054 | 0.78   | -1.9202 | 2.88807 |
| ses1          | 21444        | 0.16  | 0.366  | 0       | 1       |
| ses2          | 21444        | 0.172 | 0.378  | 0       | 1       |
| ses3          | 21444        | 0.197 | 0.398  | 0       | 1       |
| ses4          | 21444        | 0.212 | 0.408  | 0       | 1       |
| stucol        | 22409        | 0.363 | 0.481  | 0       | 1       |
| studentexpect | 16813        | 19.34 | 4.16   | 11      | 25      |
| goodgrades    | 21062        | 0.592 | 0.491  | 0       | 1       |
| anymeeting    | 15525        | 0.83  | 0.375  | 0       | 1       |
| ptomeeting    | 15492        | 0.383 | 0.486  | 0       | 1       |
| ptconf        | 15480        | 0.569 | 0.495  | 0       | 1       |
| volenteer     | 15519        | 0.305 | 0.46   | 0       | 1       |
| fundraise     | 15513        | 0.53  | 0.499  | 0       | 1       |
| museum        | 15448        | 0.534 | 0.498  | 0       | 1       |
| hwhelp        | 15711        | 0.483 | 0.499  | 0       | 1       |
| pexpect       | 21658        | 11.06 | 13.04  | 11      | 25      |
| langnoteng    | 15985        | 0.219 | 0.413  | 0       | 1       |
| teachexpect   | 1524         | 0.118 | 0.954  | -5.13   | 1.29    |
| advdegree     | 17067        | 0.505 | 0.499  | 0       | 1       |
| experience    | 17020        | 10.14 | 8.48   | 1       | 31      |
| tvalues       | 18973        | 0.855 | 0.3351 | 0       | 1       |
| ttreats       | 18964        | 0.914 | 0.28   | 0       | 1       |
| tsuccess      | 18905        | 0.922 | 0.267  | 0       | 1       |
| tlisten       | 18933        | 0.883 | 0.32   | 0       | 1       |
| tinterest     | 19936        | 0.629 | 0.482  | 0       | 1       |
| tmatheas      | 18939        | 0.74  | 0.438  | 0       | 1       |

After Multiple Imputation

| Variable      | Mean   | Standard Error |
|---------------|--------|----------------|
| mathBY        | 39.96  | 0.08           |
| mathF1        | 66.42  | 0.13           |
| choice        | 0.209  | 0.003          |
| ses           | 0.041  | 0.005          |
| ses1          | 0.159  | 0.002          |
| ses2          | 0.174  | 0.002          |
| ses3          | 0.202  | 0.002          |
| ses4          | 0.215  | 0.002          |
| stucol        | 0.364  | 0.002          |
| studentexpect | 19.11  | 0.003          |
| goodgrades    | 0.587  | 0.033          |
| anymeeting    | 0.7795 | 0.003          |
| ptomeeting    | 0.322  | 0.01           |
| ptconf        | 0.552  | 0.007          |
| volenteer     | 0.254  | 0.011          |
| fundraise     | 0.483  | 0.009          |
| museum        | 0.462  | 0.006          |
| hwhelp        | 0.501  | 0.009          |
| pexpect       | 11.09  | 0.086          |
| langnoteng    | 0.158  | 0.008          |
| teachexpect   | 0.111  | 0.007          |
| advdegree     | 0.504  | 0.003          |
| experience    | 10.09  | 0.064          |
| tvalues       | 0.851  | 0.002          |
| ttreats       | 0.911  | 0.001          |
| tsuccess      | 0.92   | 0.001          |
| tlisten       | 0.881  | 0.002          |
| tinterest     | 0.628  | 0.004          |
| tmatheas      | 0.736  | 0.003          |

Missing data
A big issue with survey data is missing values. Sample members would sometimes leave answers blank or respond that they did not know. This would be acceptable if we knew the missing values were random; however, most often there are systematic reasons why some people leave answers blank, which introduces bias into the model. One way to deal with this issue is imputation of missing values, which uses other characteristics of the respondent to estimate the missing values. The problem with simple imputation is that it tames the data, reducing outliers and reinforcing the means. In this project, multiple imputation is used to deal with missing data. Multiple imputation has the benefit of imputing missing values while, because it creates multiple plausible values for each imputation, reintroducing randomness and avoiding the over-precision caused by standard single imputation. For all variables with missing values, the project uses Stata’s implementation of a Markov chain Monte Carlo (MCMC) multiple imputation algorithm that generates five plausible values for each variable based on the non-missing values of every other variable. The random seed used was 12061992. The analyses were replicated on each of the five imputed data sets, and the final coefficients and standard errors were combined using Rubin’s rules. A sketch of this workflow appears below.
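A minimal sketch of the multiple-imputation workflow just described, written with Python's statsmodels MICE tools rather than the Stata MCMC routine the project actually used. The file name and the small regressor subset are assumptions for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.imputation.mice import MICEData

df = pd.read_csv("hsls09_extract.csv")   # hypothetical analysis file
imp = MICEData(df)                        # chained-equations imputer

M = 5                                     # five imputed data sets, as above
coefs, variances = [], []
for _ in range(M):
    imp.update_all()                      # draw a fresh imputed data set
    data = imp.data
    X = sm.add_constant(data[["choice", "ses", "stucol"]])
    fit = sm.OLS(data["mathgain"], X).fit()
    coefs.append(fit.params["choice"])
    variances.append(fit.bse["choice"] ** 2)

# Pool with Rubin's rules: total variance = within + (1 + 1/M) * between
q_bar = np.mean(coefs)                    # pooled point estimate
within = np.mean(variances)
between = np.var(coefs, ddof=1)
se_total = np.sqrt(within + (1 + 1 / M) * between)
print(f"choice: {q_bar:.3f} (pooled SE {se_total:.3f})")
```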

Attrition
If we are using student test score gains, which are measured in 11th grade, should we be concerned that some students in the sample went to choice schools at the beginning of high school but then transferred, or vice versa? If this were the case, we would expect bias in our results, because it would either under- or overestimate the true effect. Luckily, HSLS:09 collects data on which students transfer. Looking at the conditional descriptive statistics, of the 3,290 students who were enrolled in choice programs, none ended up in different schools by 11th grade. It is worth noting that 334 sample members did not respond and that for 90 respondents the question was no longer applicable. We will assume this is not an issue.

Econometric issues
To represent many of the variables in our underlying model we used plug-in proxies, because the true variables are unobservable. When a variable is unobserved, it is hard to know the strength of the proxy. If a proxy is weak, it may be that it should not even be included in the model, which is one reason we may see little significance on some of our variables. This could be particularly true for some of the variables representing student motivation, parent characteristics, and teacher quality: the variables included were indicators of the underlying process, not the process itself.

Measurement error is another way to look at the proxy strength problem. Say there is some variable we are measuring, and its measurement contains a certain amount of error. We can treat this as measurement error because, unlike with many proxies, we care about the specific estimates of these variables. Under the classical errors-in-variables (CEV) assumption, if the covariance between the true variable and the measurement error is zero, then our estimates will be biased, but the bias will be toward zero (attenuation). This is an unfortunate reality of education research: there are so many unobserved variables, and good, strong data are rare. This measurement error can be considered a data problem.

Another data problem, cut from the same cloth, is that the data at hand were not designed or collected with the specific questions of this paper in mind. For that reason the data are applied to the model, rather than collected for the model. This results in many compromises during analysis.

Perhaps most influential on our statistical significance and explanatory leverage is the general complexity of student outcomes. Educational achievement is a very stochastic process; there is a lot of natural randomness. These are a few more reasons why education data are hard to work with.

Another issue is clustering, which is common in education research. Students are nested within classrooms, schools, and communities. It is possible, and even probable, that there are unobserved factors affecting students who, for example, are all in the same classroom that has no heat, no computer, and a power plant directly outside. We cannot assume all of our observations are independent, because they share conditions at the school and classroom level. The school codes were suppressed in the public data file, so we were not able to fit a hierarchical linear model (HLM) or even cluster our standard errors at the school level to account for the nested structure of the data.
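Had school identifiers been available in the public file, clustering the standard errors would have been a one-argument change. A sketch with a hypothetical school_id column:

```python
import statsmodels.formula.api as smf

# Cluster-robust standard errors at the (hypothetical) school level
fit = smf.ols("mathgain ~ choice + ses + stucol", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["school_id"]}
)
print(fit.summary())
```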

Results Table
Students’ IRT Math Score Gains Estimated by Ordinary Least Squares

|               | Model 1        | Model 2         | Model 3          | Model 4          | Model 5          |
|---------------|----------------|-----------------|------------------|------------------|------------------|
| choice        | 0.219 [.233]   | 0.378 [.297]    | 0.362 [.298]     | 0.412 [.299]     | 0.447 [.297]     |
| choice_SES    |                | 1.045* [.425]   | 1.068** [.422]   | 0.78 [.418]      | 0.792 [.417]     |
| choice_race   |                | -0.894 [.479]   | -0.973* [.458]   | -0.842 [.452]    | -0.843 [.456]    |
| ses           |                | 4.109*** [.125] | 3.439*** [.137]  | 3.197*** [.160]  | 3.089*** [.159]  |
| stucol        |                | -1.01*** [.211] | -1.183*** [.204] | -1.32*** [.207]  | -1.299*** [.207] |
| studentexpect |                |                 | 0.328*** [.037]  | 0.261*** [.038]  | 0.253*** [.038]  |
| goodgrades    |                |                 | 1.303*** [.187]  | 1.051*** [.192]  | 0.963*** [.192]  |
| anymeeting    |                |                 |                  | 0.489 [.308]     | 0.494 [.316]     |
| ptomeeting    |                |                 |                  | -0.509* [.233]   | -0.517* [.232]   |
| ptconf        |                |                 |                  | -0.233 [.188]    | -0.254 [.195]    |
| volenteer     |                |                 |                  | 0.755* [.328]    | 0.693 [.329]     |
| fundraise     |                |                 |                  | 0.470* [.227]    | 0.438 [.228]     |
| museum        |                |                 |                  | 0.707** [.209]   | 0.693*** [.211]  |
| hwhelp        |                |                 |                  | -2.136*** [.228] | -2.05*** [.222]  |
| pexpect       |                |                 |                  | 0.044*** [.007]  | 0.042*** [.007]  |
| langnoteng    |                |                 |                  | 2.258*** [.259]  | 2.207*** [.259]  |
| teachexpect   |                |                 |                  |                  | 0.260* [.121]    |
| advdegree     |                |                 |                  |                  | 0.401* [.181]    |
| experience    |                |                 |                  |                  | 0.054*** [.013]  |
| tvalues       |                |                 |                  |                  | 0.980* [.422]    |
| ttreats       |                |                 |                  |                  | -0.039 [.377]    |
| tsuccess      |                |                 |                  |                  | -0.379 [.478]    |
| tlisten       |                |                 |                  |                  | 0.647 [.349]     |
| tinterest     |                |                 |                  |                  | -0.329 [.223]    |
| tmatheasy     |                |                 |                  |                  | 0.803*** [.244]  |
| constant      | 26.417 [.108]  | 26.586*** [.140]| 19.647*** [.716] | 20.499*** [.846] | 18.55 [.844]     |
| N             | 23415          | 23415           | 23415            | 23415            | 23415            |

Legend: * p<0.05; ** p<0.01; *** p<0.001. Note: standard errors are displayed in brackets.

Works Cited
Darling-Hammond, Linda. “Teacher Quality and Student Achievement.” Education Policy Analysis Archives 8 (2000): 1.

Coleman, James S. Equality of Educational Opportunity. Washington, DC: U.S. Department of Health, Education, and Welfare, Office of Education, 1966.

Klecker, Beverly M. “The Relationship between Teachers’ Years-of-Teaching Experience and Students’ Mathematics Achievement.” (2002).

Murnane, Richard J., and Barbara R. Phillips. “What Do Effective Teachers of Inner-City Children Have in Common?” Social Science Research 10, no. 1 (1981): 83-100.

United States. Department of Education. National Center for Education Statistics. EDAT. Web. <http://nces.ed.gov/surveys/hsls09/>.


Finding the Flaws in Claims about School Choice: What Do We Really Know About School Choice and Student Outcomes

Posted on

School choice—a resounding success! Or is it?
Across the nation the popular rhetoric used to describe school choice is glowing. Describing Connecticut’s choice system, newspaper headlines proclaim, “[Choice programs are] a major contributor to closing the achievement gap” 1 and “Students [in school choice programs] are improving each year!” 2. The much talked about full-length documentary, Waiting for Superman, holds charter schools and parent choice up as the last hope for our urban students to succeed. But in reality, many of these assertions are made based on a faulty comparison 3. The current rhetoric used in the public sphere about choice schools and student performance is not accounting for the fallacy of selection bias.

Measuring the achievement impact of choice schools compared to traditional public schools on students is very difficult. The only true comparison would be to take advantage of a parallel universe in which one could compare students who attended a choice school in one universe with the very same students who simultaneously attended a public school. If this technique was possible, many researchers would be out of the job.

This article points out the flaws in many evaluations of choice schools, and highlights several ways to mitigate and improve school choice analysis. Additionally, using a robust data set, I provide original analysis that mitigates some these issues and situate the findings in a broader context.

Selection bias—the problem that plagues all school choice studies
To investigate the effect that school choice has on student outcomes, researchers leverage statistical tools to try to make the most accurate comparison. The issue we are most concerned about when trying to make this comparison is selection bias. Selection bias occurs when the population of students you are looking at is not random but is self selected. The thing about self selection, is that it may be due to some factor that actually matters in the whole equation. In the school choice debate, we worry about selection bias when the families who chose to apply and attend a charter school are even slightly different that the families who just end up keeping their kids in traditional public schools. The problem arises when we try to compare these two groups. It may be that the difference we observe in test scores is really due to the dissimilarity in the family characteristics rather than in the effectiveness of choice or traditional public schools. Herein lies the challenge: How do we make a true comparison of student outcomes between choice schools and traditional public schools?

Virtual twin method—one way to minimize the impact of selection bias
The CREDO team at Stanford University came up with a method called “virtual twin” to try to make better comparisons. Essentially, they use measurable student characteristics and previous achievement to match students in charter schools with students who attend public school in their same school district. For example, CREDO compares two students with similar prior test scores both coming from low income and high parental education families, but one student now attends a charter school and the other attends a traditional public school. They do this with many pairs of students or “twins” to curb selection bias and make a better comparison between the two school types. Using this methodology in 2009, the CREDO team found that only 17% of charter schools outperformed traditional public schools, while 46% did worse, and 37% had no statistical difference. 4 They repeated this study on a slightly larger sample of students in 2013 and found that charter schools on average performed slightly better than in the 2009 study 5, but that at the end of the day, an average charter school is just average.

The virtual twin methodology is not perfect, because not all factors can be matched. There still may be some unobservable differences between students who attend charter schools compared with their public school peers. For example, a family that takes the time and effort to apply to a charter school, might be more involved in their student’s education than a family that just sends their student to the neighborhood school, and that might be why we see choice school students performing better than the traditional public school students.  In other words, the result may be driven by the unobservable characteristics of the students who attend charter schools, rather than the actual effect of the charter school themselves.

Randomization—another way to address the problem of selection bias
Using another method to mitigate the issue of selection bias, some researchers take advantage of the randomization inherent in a charter school lottery. When charter schools receive more applications that spots available they are required to hold a randomized “lottery” to determine which students receive a spot. In a large study of charter schools, Gleason et al, (2010), compared the achievement of students who won charter lotteries and attended charter schools to students who lost charter lotteries and attended traditional public schools. Since the lotteries are random, we assume on average, there is no difference between the people who won and lost the lottery 6.

In terms of methodology, randomized trials are the closest one can get to a perfect comparison. The methodology helps mitigate the selection issue present in the CREDO study, since the student population they are comparing, the winners and losers, both have the unobservable characteristics that lead to a family applying to a charter school. The Gleason study found, on average, that there was no statistically significant impact of charter schools on student achievement. Similar to the CREDO studies, Gleason reports positive outcomes for students with low-SES backgrounds. But even this study with randomized design has it’s limitations. For example, only schools that receive more applications than spots use a lottery, therefore the charter schools analyzed in this study were charter schools that received lots of applications, potentially meaning they were on average better charter schools.

Big Data Analysis—a third method to account for selection bias
I set out to find a different method to add to the current understanding of the effect school choice has on student outcomes, taking into account the main issues involved in investigating student outcomes, including selection bias and the unobserved factors that come with it. Increasingly, researchers are collecting data about students over time, in what are referred to as longitudinal studies. These studies often involve capturing data about large numbers of students via surveys, resulting in large data sets. I decided to use one such data set, from the High School Longitudinal Study of 2009.  By using a variety of variables focusing on student achievement, family background, and school characteristics from the High School Longitudinal Study of 2009 (HSLS:09) I wanted to see if I could shed light on the school choice debate.

The High School Longitudinal Study of 2009 was conducted by the National Center for Education Statistics. The data set is comprised of nearly 24,000 9th graders selected randomly from 944 schools. Students, parents, teachers, administrators, and counselors are all surveyed to collect a wide variety of data on both the students and their learning environment. Called multi-level surveying, this data set, in concert with students test scores provided a rich data set for analysis.For an extended explanation of these data, click here.

One of the main issues with using survey data is that it is impossible to account for every potential factor that determines student achievement. In order to isolate the true effect of participating in a school choice program, it’s necessary to hold constant every other potential difference between students. This is obviously an impossible task, especially considering the many unobserved and unmeasurable factors that are present, such as differences in student motivation or innate ability. However, there are analytic and statistical strategies that enable you to control for these differences, that allow you to better isolate the true relationship between school choice and student achievement. I used a variety of these student, parent, teacher, and school controls to try to measure the underlying components that affect a student’s test score.

My Assumptions
I set out, assuming that five factors are most important in determining student achievement.They are: 1) whether or not a student attends a choice school, 2) students’ demographic characteristics, 3) students’ motivation, 4) a student’s parental characteristics, and 5) a students teacher characteristics. If we had the data and could measure all of these underlying factors, we could make a convincing case for the accuracy of our estimates to truly measure the effect of choice on student achievement.

Unfortunately, many of these underlying constructs are unobservable, not measured, or have layers of complexity. To mitigate this issue, I used factors I could measure that get at the underlying construct and are highly correlated with these unmeasured factors. For example, when looking at student motivation, I controlled for whether students think getting good grades is important and whether students think they will graduate from college. The hope is that high student motivation, an unobservable characteristic, will overlap sufficiently with students who think getting good grades is important and expect to graduate from college to serve as a proxy. For results to be reliable, these relationships need to be highly correlated but not necessarily perfectly correlated. This is because when working with a very large number of students, as I was, one can begin to see that on average these factors will account for motivation. The rest of our proxies are displayed in table 2. Click on each underlying construct for a deeper examination of the variables used to measure them. Click here for a full breakdown of the model used in estimation.

Click to read more on how each construct is represented: true achievement, school choice, family background, student motivation, parent characteristics, or teacher characteristics.

The Tricky Bit—How to Account for Selection Bias
Now for the important question, in the context of these data and techniques, how did I compare students in choice schools to students in traditional public school knowing that that difference in decision might be because of some unobservable characteristic obscuring the true comparison between choice students and traditional public school students?

My hypotheses going in to this study is that when first looking at choice schools on student achievement I would see a positive effect because of selection bias; I expected that the students in choice schools would be systematically different from those in traditional public school due to parental factors that affected their selection of a choice program. However, I expected that after explicitly controlling for parental characteristics, and making a much more valid comparison between students in both types of schools, the initial positive result will not persist. My hypothesis is consistent with past studies that support the idea that parents who send their students to choice schools are on average more involved in their students education thus effecting achievement (Gleason et. al, 2010; Ballou et al., 2007; Betts et al., 2006, CREDO, 2009).

To control for this confounding factor, I used a variety of controls to account for a wide variety of parent involvement. I considered whether or not a parent attends any meeting at the school, a parent teacher organization meeting, or a parent teacher conference. I took into account whether a parent volunteers at their child’s school or helps fundraise for the school. I also considered parents’ expectations of how far they think their student will get in school, and whether or not they help their student with their homework. My assumption was that together all of these variables account for and overlap sufficiently with the unobservable characteristics that choice school families have that would affect student achievement. Although these factors do not directly account for the underlying construct I argue that these characteristics would signal and proxy for the unobserved ones.

The strength of this approach is that it addresses the issue that comes in to play with the Virtual Twin methodology—selection bias, and it gets around some of the main issues of randomization including only looking at over-subscribed schools. The weakness of the method I used is needing to rely on my proxy strengths without being able to actually tell if they sufficiently account for selection bias. There are some who say that data analysis is more  art than science. A statistical model is an argument and it is important to question each assumption, while at the same time stepping back to look at what the whole thing has an ability to tell. I argue that the above variables account for enough of the underlying factors of student achievement for our results to be unbiased.

Click here for the descriptive statistics of all the variables used in estimation.

My Findings
Using data from the High School Longitudinal study of 2009 (HSLS 09) and the above methodology, I indeed found that when initially looking at the relationship of participation in a school choice program and student learning, there exists a positive effect for students of low socioeconomic status. This result explains some of the promise and glamour that the idea of school choice receives. However, after using more robust methods and explicitly controlling for the difference in students and families that chose to attend choice programs, the once promising result, disappears.

To arrive at this conclusion I first compared the achievement of students who went to choice schools to that of students who went to traditional public schools while accounting for their race, socioeconomic status and intrinsic motivation. I found that attending a choice school had a positive impact on students from low socioeconomic background. Results based on simple comparisons like this are constantly held in the media as evidence of the positive impact of school choice. To account for the issue of selection bias and the potentially unobserved parent characteristics as the possible reason choice students appear to perform better in my first comparison, I next also accounted for the parent-related variables. As discussed above, these variables are used to account for the potential selection bias introduced because of the differences between the populations at choice schools compared to traditional public schools. I found that after accounting for selection bias, on average, students in choice school perform no better than students in traditional public schools. This result confirms my hypothesis and corroborates other literature indicating that after accounting for selection bias, on the whole choice schools do not outperform traditional public schools. Lastly, when accounting for teacher quality, the results remain the same. Click here to see the full table of regression results.

In summary, looking at the simple relationship between choice schools and student achievement, I found a positive effect of choice schools, consistent with popular claims made in the headlines. However, when accounting for the observed and unobservable differences in data, these once promising results do not persist.

The Limitations
As previously discussed, there are several limitations to this study. First, without random assignment there is no way to be sure that we fully accounted for selection bias. I can make an argument, and I hope that I have, that my methodology accounts for selection bias, but we will never know for sure. Second, beyond selection bias, we don't know whether other factors that affect achievement, and that we are not accounting for, differ systematically between students in choice schools and students in traditional public schools. Researchers call this omitted variable bias, and it is a persistent issue when working with survey data in particular. One indicator that this study may sufficiently account for both selection and omitted variable bias is that its results are consistent with randomized studies of school choice that also find no relationship between choice and student outcomes 7 8 9.

Additionally, it is worth noting that this study looks at choice schools on average. This does not mean that no choice schools are outperforming traditional public schools. Rather, it means that, as a whole, the choice school reform movement is not outperforming the status quo of traditional public schools. Further, this paper does not distinguish between types of school choice: because of data limitations, charter schools, magnet schools, and voucher programs were grouped together.

Click here for more technical limitations and how I addressed them, including missing values, attrition, and other data issues.

The Implications
With school choice becoming an increasingly popular reform, it is crucial to investigate its actual effect on students. Although there is a large body of existing research, it is important to keep searching for pieces of the solution that will bring better educational opportunities to students as policies shift and school systems evolve. No single assessment of the choice system will provide enough evidence on its own, but with an abundance of data and a range of techniques, we can continue to fill in more and more of the picture.

Next time you read about a school choice success, don't accept the result outright. Consider the comparison being made and ask yourself: Do you believe these two groups are equivalent? Has the study sufficiently accounted for the unobservable differences between students in choice schools and students in traditional public schools?

Notes:

  1. Imperato, Ken, Ajit Gopalakrishnan, and Richard Mooney. “Choice Program Data and Emerging Research: Questioning the Common Interpretations of Publicly Reported Indicators of Choice Program Success.” Presented at Magnets in a School Choice Arena, Goodwin College, East Hartford, CT, December 12, 2013. http://www.goodwin.edu/pdfs/magnetSchools/Kenneth_Imperato.pdf.
  2. De La Torre, Vanessa. “Hartford ‘Sheff’ Students Outperform Those In City Schools.” September 12, 2013. http://articles.courant.com/2013-09-12/community/hc-hartford-sheff-scores-0913-20130912_1_open-choice-sheff-region-hartford-students.
  3. Guggenheim, Davis, Billy Kimball, Lesley Chilcott, Bill Strickland, Geoffrey Canada, Michelle Rhee, Randi Weingarten, et al. Waiting for “Superman”. Hollywood, CA: Paramount Home Entertainment, 2011.
  4. Center for Research on Education Outcomes (CREDO). Multiple Choice: Charter School Performance in 16 States. Stanford, CA: CREDO, 2009.
  5. Center for Research on Education Outcomes (CREDO). National Charter School Study 2013. Stanford, CA: CREDO, 2013.
  6. Gleason, Philip, Melissa Clark, Christina Clark Tuttle, and Emily Dwoyer. The Evaluation of Charter School Impacts: Final Report. NCEE 2010-4029. National Center for Education Evaluation and Regional Assistance, 2010. http://eric.ed.gov/?id=ED510573.
  7. Bifulco, Robert, Casey D. Cobb, and Courtney Bell. “Can Interdistrict Choice Boost Student Achievement? The Case of Connecticut’s Interdistrict Magnet School Program.” Educational Evaluation and Policy Analysis 31, no. 4 (December 1, 2009): 323–45. doi:10.3102/0162373709340917.
  8. Betts, Julian R. Does School Choice Work? Effects on Student Integration and Achievement. Public Policy Institute of California, 2006.
  9. Gleason, Philip, Melissa Clark, Christina Clark Tuttle, and Emily Dwoyer. The Evaluation of Charter School Impacts: Final Report. NCEE 2010-4029. National Center for Education Evaluation and Regional Assistance, 2010. http://eric.ed.gov/?id=ED510573.

Charter Schools and the Issue of Scalability: An Unexpected Conclusion

Posted on

A fundamental tension throughout A Smarter Charter, and the broader charter debate, is whether charters should be mere laboratories whose sole purpose is improving public education for all, or whether charter schools should offer an increasing segment of the population an alternative to traditional public schools. The ability of charter schools' methods to scale up to the rest of public education is the central consideration in this tension. It may seem that the lack of scalability charter schools face would support proponents of limiting charter growth. However, when you reframe the question, the exact opposite conclusion comes to light: universal charters are the solution to the issue of scalability.

Throughout A Smarter Charter, scaling up proved to be challenging. Numerous teachers at the Cesar Chavez school in D.C. experienced a significant drop-off in support and in the positive work environment when the charter network expanded from teaching 60 students to nearly 1,500 (25). It was not until they unionized that some of that empowering environment was restored. Even then, much of the teacher-administration collaboration remains delicate, largely hinging on a “more communicative and responsive administration” that could change at any time (44). Kahlenberg and Potter point out that stand-alone schools are more conducive to promoting teacher voice (96). There are ways to mitigate these growing pains, but it is not easy to scale all of these reforms; co-op teacher models, unions, slim contracts, and teacher voice in particular have trouble scaling up (104, 109, 117). This inability to scale is evident not only in the very successful schools in the book but in the charter school community as a whole: on the whole, charter schools are not showing gains in student outcomes (68).

Many of the successful practices at the schools featured in this book are examples of charter schools moving in the right direction. However, we saw these policies lose their edge when applied to larger charter networks. How do we interpret this lack of scalability? Does it support or refute the idea that charter schools should provide education to an increasing number of students? One response to these findings would be to curb the number of charter schools allowed, as Al Shanker originally suggested. Charter schools would then serve as innovative labs to inform public education rather than as a universal alternative.

Alternatively, the inability to scale successful methods could be interpreted as the exact reason why a universal charter system is needed. A large network of charter schools, like the even larger network of traditional public schools, will never be able to support the teacher voice, student integration, small community, and site-specific flexibility necessary to best address the needs of students. Perhaps we can preserve the benefits of the smaller charter school by replicating the model rather than expanding it. In other words, keep charter schools small and independent, each with democratic, participatory governance, yet associated with other charter schools for support and shared knowledge.

Additional questions for Kahlenberg and Potter:

We think of no-excuses schools as paternalistic (20), yet these schools rarely practice the progressive education and other techniques used in schools that serve the elite. What do we make of this irony?

One of the positive impacts highlighted about integrated schools (64-65) is the increased involvement of middle- and high-income parents. However, are these parents serving the interests of all students at their child's school or only the interests of their own child? If the latter, I could see a situation where middle- and high-income parents bend policy to benefit their own children, for example by advocating that extra funding go toward enrichment rather than toward extra support, which actually harms low-income students.

Using portfolios to evaluate teachers and determine teacher pay is an intriguing proposition. How would this affect the current incentive structure for teacher evaluation, and what unintended consequences would come of this policy shift?

In Hartford, the Sheff case has forced minimum levels of integration in all magnet schools, and schools use weighted lotteries to ensure this balance. These weighted lotteries receive pushback from some community leaders, who argue that the lotteries take away spots from Hartford students who would otherwise attend struggling schools and give those spots to suburban students who would attend high-achieving schools no matter what. By using a weighted lottery, are we sacrificing equity for integration? Are we okay with this?