Let’s Abuse Government Data!

Data…the cold hard facts and information that form the cornerstone for modern-day statistical analysis, make possible amazingly advanced leaps in genetic understanding, and help Groupon sell you that fantastic facial/spa day deal that you otherwise wouldn’t have thought twice about. Data is everywhere, and it’s been increasingly creeping into our classrooms. Not to imply that the “creep” is a bad thing, but the abundance of data (and in many cases overabundance) in the classroom is quickly overwhelming many teachers’ abilities to properly digest, analyze, and synthesize all of the information.

It’s not that educators don’t want to process all of the data they collect on their students. It’s just difficult to find time to approach the data with a scientific eye for determining what’s going on in the classroom when you have data from DIBELS, RTI tools, practice assessments, benchmark assessments, state tests, national tests, and a host of other assessments. We seem to be teaching and learning in a world where collecting data for data’s sake is perfectly acceptable, without a critical eye for how it’s being used. Which is why I thought it might be fun to show just how easily you can abuse data when you’re completely surrounded by it and don’t have the opportunity to think straight. Oh, and it makes for a really great lesson for students in a stats class, to point out how data can be manipulated when it’s being used ineffectively.

For example, the United States Department of Education has a rather interesting dashboard providing all sorts of educational data related to the President’s 2020 College Attainment Goal. Various criteria have been measured, and the data shared in detail, as a chart, and with a nice colorful map of the United States to easily compare one state to another. Data is further separated by race and ethnicity, and includes helpful little arrows to visually show either growth, no change, or decline.

To illustrate an abuse of this data we’ll look at 8th grade proficiency rates on the NAEP in mathematics. According to data collected in 2009, you can see the percentage of 8th graders that are proficient in math in each state. The dark green means more than 36.6 percent of all 8th graders in the state are proficient, the light green represents those states where 29 to 36 percent of all 8th graders are proficient, and the tan color indicates states where less than 29 percent of the 8th graders are proficient in math on national standardized testing:

Nothing too shocking yet, other than to note that schools in the Northwest, Northeast, and the Midwest appear to be producing 8th graders that are more proficient in mathematics than schools in the Southeast and Southwest, at least on average at the state level. Here’s where it’s easy for any untrained statistician (a layman, if you will) to start cherry-picking data and making correlations that probably should never be made, given the hundreds of variables affecting student performance in a school district.

The following map displays an interesting piece of data: whether or not a state is using student achievement data as a part of teacher evaluations. At first glance you could very easily surmise that despite using student achievement data as a reflection of teacher performance (a trend likely to continue to spread in the U.S.), only 4 of the states using such a teacher evaluation model are receiving any type of benefit from it. In fact, 4 of the states that reported they’re using student achievement to evaluate teachers are actually among the group of states with the lowest performing 8th graders in mathematics. I don’t know about you, but a 50% success rate is nothing to get very excited about.

But of course, there’s much more going on here than the rather simplified, and most likely inaccurate, correlation I’ve just made. To start with, all of the data represented in these graphs and found on the website represent only 2 years’ worth of collection, 2007 and 2009; not much of a longitudinal look at the data yet. Only 8 states reported that they were using student achievement to impact teacher evaluations in 2009, a sample that does not fairly represent all 50 states demographically, culturally, or socioeconomically (over half of the states reporting this evaluation method are in the Southeast, a much different educational and economic landscape than the Northeast or Northwest).
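To put a rough number on how little 8 states can tell us, here is a back-of-the-envelope sketch. It leans on two assumptions that are mine and not the dashboard’s: that the lowest color band on the map holds about a third of all states, and that the 8 reporting states can be treated as independent random draws (a simplification, since they are really drawn from the same 50). Under those assumptions, it asks how often 4 or more of the 8 would land in the lowest band by chance alone, with no help or harm from the evaluation method at all.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumption (not from the dashboard): a randomly chosen state has roughly
# a 1-in-3 chance of falling in the lowest color band on the map.
p_bottom = 1 / 3

n_states = 8      # states reporting the evaluation method in 2009
n_in_bottom = 4   # of those, how many sit in the lowest band

chance = prob_at_least(n_in_bottom, n_states, p_bottom)
print(f"{chance:.1%}")  # prints 25.9%
```

In other words, under these assumptions you would see a split at least this lopsided about one time in four from luck alone, which is hardly the stuff of headlines.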

However, one of the most glaring problems with both providing and analyzing data in this manner is that you’re taking a highly complex and intricate portrait of what education is and does in this country, reducing it to one small “snapshot” of two single pieces of data, and trying to make them fit with one another, whether the fit is real or not. And that’s where the abuse comes in, because people do it every day. News media, educators, politicians, parents, students…we’re so inundated with data in our lives right now, and simply don’t have the time to properly analyze it (assuming we even know how to do so effectively). I recently finished a 15-month Master’s program in Education, and I still have trouble taking large sets of data and comparing, contrasting, and making sense of it all. So as human beings, we try to pick out small sets of data that we can make sense of, and then try to make them fit, whether they should or not. Toss in colorful graphics, and our brains are further subdued into a sense of relief. Everyone can understand a picture, right? Just compare the two pictures to see what’s different, taking complex data that’s being affected by hundreds of variables and turning it into one of the “find what’s different” puzzles in the Sunday paper.

I’d love to see how other people abuse this data, because most likely it already is being abused by politicians, policy makers, eduwonks, and others. But like everything else about our country right now, transparency is king, and the mountain of data will keep growing thanks to websites like these from the Department of Education. Am I wrong to look at the data this way? Perhaps, but I sense that the reality I’ve painted is more likely true than not in many schools and state departments of education right now.

Special Bonus: The person who points out the most ways that I misused and abused this data will earn themselves a custom-made video story problem.


  1. The assumption in your post (which you recognize as questionable) is that only 4 states are benefiting, because only 4 states are not in the bottom third.

    What’s more likely is that the states in the bottom third are willing to try a different evaluation method in an attempt to improve their situation. Well, I think it’s more likely, but ultimately it’s still an assumption.

    Half the time when I see posts and comments with assumptions like this paraded as the ultimate truth, I wish that next to the “Like” button there was some sort of “Well, no, not exactly…” button.

    1. You make a great point, Dave. Considering the pushback from a lot of educators on the issue of student achievement affecting teacher evaluation, I would wager your assumption has some merit. Many states most likely are trying to adopt some sort of evaluation before going over to such a radical shift in teacher evaluation.

      And your thought about wishing there was some sort of “well, no, not exactly” button is one I share, which is why I wrote this post. I wanted to write about the overabundance of data and the shortage of educators properly trained in dissecting and analyzing it, and then write up the kind of mistake that many others often make, especially big media (left and right of center). I was hoping that I was clear that my assumption is not just questionable but downright dangerous, to illustrate how easily data can be mishandled and abused.

      I wish you had put your e-mail in when you responded, as I would have loved to have talked further about this with you…it sounds like we have something in common on this issue. Data should never be heralded as the “ultimate truth”, especially data like this, that has dozens or hundreds of other variables to consider besides the ones brought immediately to the surface.

  2. Ya, I understand completely what you are saying. Stats can be rearranged and screwed up all you want. Perfect example is when politicians during election years say, well, “X% of Americans believe blah blah blah” and they don’t cite the source or the demographic of the survey taken. They just fit it into the debate. It’s funny there is no “standard” for what a survey should be.

    Cool site by the way…just discovered it!
