How to Lie with Statistics

Different types of graphs, with different scales, can portray the same set of data in many different ways.  These three images show how one can take the same information but use it in different ways in order to convince people there has been significant change, or almost no change at all.

Source: Dougherty, Jack, Jesse Wanzer, and Christina Ramsay. “Sheff V. O’Neill: Weak Desegregation Remedies and Strong Disincentives in Connecticut, 1996-2008.” Papers and Publications (January 1, 2009). http://digitalrepository.trincoll.edu/cssp_papers/3.

 

The two graphs that I made below both use the data from the bar graph above, but appear to convey very different messages. The first shows very minor changes, which one would use to show how little progress has been made.  The second has a very steep curve, which one would use to display an extreme change in the percentage of minority students in reduced-isolation settings.

 

See how this graph has a scale on the Y axis that goes up to 100. This leads to a nearly horizontal line, one that looks as if almost no change has happened.  Using a bigger scale is one of the ways people can “lie with statistics” in order to prove the point that they want.

 

This graph, on the other hand, starts at 8 instead of zero and ends at just 18 on the Y axis. This leads to a much steeper line, so it looks like there has been a very large increase in the percentage of students in reduced-isolation settings.  One would use a small scale on the Y axis to give the false impression that there has been a lot of change.
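The effect of the two scales can be checked with a little arithmetic. The sketch below is only an illustration: the function name `visual_rise` and the start and end percentages are made-up placeholders, not the values from the source chart. It estimates how much of the plot's height the line climbs under each Y axis range.

```python
def visual_rise(start, end, y_min, y_max):
    """Fraction of the plot's height covered by the change from start to end."""
    return (end - start) / (y_max - y_min)


# Hypothetical values for illustration only, NOT the figures from the source chart.
start_pct, end_pct = 11.0, 17.0

# Same data, two different Y axis ranges (0-100 and 8-18, as in the two graphs).
wide = visual_rise(start_pct, end_pct, y_min=0, y_max=100)
narrow = visual_rise(start_pct, end_pct, y_min=8, y_max=18)

print(f"0-100 axis: line climbs {wide:.0%} of the plot height")
print(f"8-18 axis:  line climbs {narrow:.0%} of the plot height")
```

With these placeholder numbers, the same six-point change fills about 6% of the plot on the 0-100 scale and about 60% on the 8-18 scale, which is why the second line looks so much steeper.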

 

I was surprised how drastic the difference between the graphs was just from changing the axis. In conclusion, when looking at graphs, one should always check the numbers on the axes and think about the scale the author is using before coming to conclusions.

 

Statistics: Two Truths and a Lie – Part 1

Anyone who has ever played the game Two Truths and a Lie knows that telling another person three things about yourself and deliberately lying about one of them is easy, and fun, to do. The same goes for statistics. Those who have taken a statistics course know that you must be cautious about how you interpret any kind of data that is presented to you. Sometimes, the data tells the truth. But, like people, data can also tell a lie. How you represent data can make all the difference.

For this exercise, my classmates and I were each asked to create a table using the data from one of our class readings. We were then instructed to create two very different graphs using only the data in our table. The purpose of this post is to present my two graphs and explain how I created them.

Here is the table that I created using Microsoft Excel:

Actual and Legal Process toward Sheff I Goal, 2003-2007 Chart – Data Source: Dougherty et al. “Sheff v O’Neill: Weak Desegregation Remedies,” Figure 5.1, p. 111

 

Next, I inserted the data from the table into a line graph (Line Graph #1). I gave the graph a title and labeled both the x- and y-axes. I also changed the minimum and maximum of the y-axis to range from 0-100%, with 100% meaning that all of Hartford’s minority students were attending magnet and Project Choice schools. This range of minimum and maximum values makes the data in Line Graph #1 look like there was minor progress made toward the Sheff I goal.

Line graph of the percentage of Hartford minority students in magnet and Project Choice schools

Line Graph #1 – Minor Progress

 

Finally, I created a second line graph (Line Graph #2). This graph has the same data as Line Graph #1, as well as the same title and axis labels. Thus, you would expect the graphs to be identical. But, as you can see, this is not the case. In Line Graph #2, I changed the minimum and maximum of the y-axis to range from 10-35%. This new, smaller range of y-axis values makes the data look like there was significant progress made toward the Sheff I goal. Making the range between the minimum and maximum y-axis values smaller resulted in zooming in on the line in the graph. When looking at a zoomed-in graph, or any graph, it is important to understand what the axes represent. Pay close attention to the numbers and do not be misled by them.

Line graph 2 of percentage of Hartford minority students in magnet and Project Choice schools

Line Graph #2 – Significant Progress
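The "zooming in" described above can be put into a rough number. This is a minimal sketch, assuming both graphs are drawn at the same physical height; the function name `zoom_factor` is my own label, not something from Excel. It compares the two y-axis ranges used for Line Graph #1 and Line Graph #2.

```python
def zoom_factor(wide_limits, narrow_limits):
    """How many times steeper the same line looks on the narrower y-axis.

    Assumes both graphs are drawn at the same height on the page.
    """
    wide_span = wide_limits[1] - wide_limits[0]
    narrow_span = narrow_limits[1] - narrow_limits[0]
    return wide_span / narrow_span


graph_1_limits = (0, 100)  # y-axis range of Line Graph #1, in percent
graph_2_limits = (10, 35)  # y-axis range of Line Graph #2, in percent

factor = zoom_factor(graph_1_limits, graph_2_limits)
print(f"The line in Line Graph #2 looks about {factor:.0f} times steeper")
```

Under that assumption, shrinking the y-axis range from 100 points to 25 points stretches every change in the data to about four times its height in Line Graph #1.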

If we had played Two Truths and a Lie with these statistics, could you have recognized the lie?

 

Source:

Jack Dougherty, Jesse Wanzer ’08, and Christina Ramsay ’09. “Sheff v. O’Neill: Weak Desegregation Remedies and Strong Disincentives in Connecticut, 1996-2008.” In From the Courtroom to the Classroom: The Shifting Landscape of School Desegregation, edited by Claire Smrekar and Ellen Goldring, 103–127. Cambridge, MA: Harvard Education Press, 2009. http://digitalrepository.trincoll.edu/cssp_papers/3/.

Lying with Statistics

This post is about how statistics can be manipulated through the way they are presented. In this case, the data was drawn from:

Jack Dougherty, Jesse Wanzer ’08, and Christina Ramsay ’09. “Sheff v. O’Neill: Weak Desegregation Remedies and Strong Disincentives in Connecticut, 1996-2008.” In From the Courtroom to the Classroom: The Shifting Landscape of School Desegregation, edited by Claire Smrekar and Ellen Goldring, 103–127. Cambridge, MA: Harvard Education Press, 2009. http://digitalrepository.trincoll.edu/cssp_papers/3/.

Using the same data, the following line charts were created. The data and x-axis were kept constant to show the importance of the y-axis values, specifically the minimum and maximum, though the increments of the y-axis are also important.

 

Percent of Hartford minorities in reduced-isolation settings

The first graph lists percentages up to 100% (complete integration) in increments of 10. In this chart, minimal progress is revealed (though the goal of Sheff II is only a modest 30%).

 

 

This graph shows leaps of progress with a deceptive minimum of 10% and maximum of 18%, in increments of 2. In this way, the line seems to increase dramatically, though the graph above illustrates this isn’t so.

 

Question what is being measured and reported when “consuming” statistics. Do test scores show ‘education’ rate? Is there something to be noted about whether suburban residents are coming into urban schools or vice versa? The starting point, the target goal, and the actual possibilities of racial integration all need to be taken into account.

Should these charts show the goal as the y-axis maximum, or mark it to give context? What increments would be most illustrative of the progress of the Sheff I and Sheff II remedies?

The goal of this post is to warn the public, as consumers of knowledge, to be skeptical when presented with statistics in the media and in research. Look out for what exactly is being reported (progress from a starting point versus progress from zero) and how certain variables are being defined and presented.
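To make the difference between "progress from a starting point" and "progress from zero" concrete, here is a small sketch. The starting and current percentages are hypothetical placeholders, not the values from the source figure; only the 30% goal comes from the post above. The function name `progress_measures` is also just an illustrative label.

```python
def progress_measures(start, current, goal):
    """Describe one change in several ways; all inputs are percentages."""
    return {
        # Progress from the starting point, in percentage points.
        "point_change": current - start,
        # The same change relative to the starting value.
        "relative_change": (current - start) / start,
        # Progress measured from zero: how much of the goal has been reached.
        "share_of_goal": current / goal,
        # Progress measured from the start: how much of the gap has been closed.
        "share_of_gap_closed": (current - start) / (goal - start),
    }


# Hypothetical start and current values for illustration only;
# the 30% goal is the figure mentioned in the post.
measures = progress_measures(start=10.0, current=16.0, goal=30.0)

print(f"Change: {measures['point_change']:.0f} percentage points")
print(f"Relative increase: {measures['relative_change']:.0%}")
print(f"Share of goal reached (from zero): {measures['share_of_goal']:.0%}")
print(f"Share of gap closed (from start): {measures['share_of_gap_closed']:.0%}")
```

With these placeholder numbers, one and the same change can be reported as 6 percentage points, a 60% increase, 53% of the goal, or 30% of the gap closed, depending on which framing the author picks.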
