By Sierra Rayne, July 15, 2013
“Voluntary participation, however, conflicts with the methodological principle of representative sampling. Given the choice, certain types of people (e.g. those with lower levels of education, from non-English-speaking backgrounds) are more likely than others to decline to participate in surveys and can result in biased samples. However, compulsory participation is not the solution. Although compulsion might minimize bias it will undermine the quality of the responses.”

This is a very real problem in statistics. For verifiable variables, we can effectively sample the population using voluntary and mandatory surveys and then compare the results against fully audited investigations. That is feasible for details such as income, age, and sex. But what about very personal details of people’s lives? How will we ever truly know how much housework a certain segment of the population does in a particular region, or whether its members have difficulty dressing or bathing? Without an in-depth monitoring study, we may never know, and such an audit would be prohibitively time-consuming and expensive. Imagine what a government audit of your claim to have difficulty dressing and bathing might look like. Would the auditors demand a dressing and bathing demonstration to verify the census reply? And what penalties would follow for lying about having difficulty dressing and bathing on a mandatory long-form census?

Consequently, many (i.e., almost all) social scientists fail to address the elephant in the room: much of the data collected by census agencies and their analogs in other government departments is unreliable for the simple reason that a substantial portion of it is entirely unverifiable. Regardless of whether a survey is mandatory or voluntary, if you ask someone a question whose answer you cannot reasonably verify, then you have no idea as to the accuracy of the data.
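The nonresponse-bias problem the quoted passage describes can be made concrete with a minimal simulation. All the numbers below (group sizes, true rates, response rates) are invented purely for illustration; the point is only the mechanism: when the groups that decline to respond differ from those that do, a voluntary survey's estimate drifts away from the true population value no matter how many responses are collected.

```python
import random

random.seed(1)

# Hypothetical population: 40% in a "low" group, 60% in a "high" group,
# with different true rates of some attribute (e.g., difficulty bathing).
# All figures are assumptions for illustration only.
population = (
    [("low", random.random() < 0.30) for _ in range(40_000)] +
    [("high", random.random() < 0.10) for _ in range(60_000)]
)
true_rate = sum(flag for _, flag in population) / len(population)

# Voluntary survey: the "low" group responds only 30% of the time,
# the "high" group 70% of the time (again, assumed figures).
respond = {"low": 0.3, "high": 0.7}
sample = [flag for group, flag in population if random.random() < respond[group]]
survey_rate = sum(sample) / len(sample)

print(f"true rate:   {true_rate:.3f}")
print(f"survey rate: {survey_rate:.3f}")  # biased: overrepresents the "high" group
```

With these made-up response rates, the survey systematically understates the true rate because the higher-prevalence group is underrepresented among responders; a larger voluntary sample only makes the biased answer more precise, not more accurate.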
In some cases, it would take exceedingly large amounts of time and money to verify or audit the data acquired; in other cases (e.g., subjective questions), verification is impossible. Thus, we have absolutely no idea whether the resulting dataset is accurate. People lie (because they simply want to, because they see personal benefits from artificially tilting a survey’s results, and so on); people often make answers up because they are too lazy, or unwilling to think about the question long enough, to provide an accurate response; and, of course, people often do not know how to characterize the answer accurately. Nobody appears to be rigorously accounting for these issues when using much of the government’s data, or when the government decides to spend taxpayer money acquiring these types of flawed data. It is a fantasyland in which the underlying data is assumed valid when it may not be.

So when we ask individuals, even in a mandatory survey, what the status of their plumbing is, or how many hours per week they play with their children in a park, or whether they experience food insecurity (whatever that really means in the West), we have no idea whether the dataset we have acquired is reliable, or which parts of it are unreliable. It is essentially junk data, and census agencies and other government departments, along with almost all researchers in the social sciences and humanities, have been generating it for decades. So when researchers speak of "information-rich surveys" by census agencies, we must not be confused: there is a lot of data, but often little information, because the data is unreliable.

Now is also the right time to address this type of comment that proponents of the long-form census often make:
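The claim that misreporting cannot be cured by compulsion or sample size can also be sketched numerically. In the toy model below, a fixed fraction of respondents without some unverifiable condition report having it anyway; the prevalence figure and the over-reporting rate are both invented for illustration. The reported rate converges, as the sample grows, to a number that is simply wrong, and nothing in the data itself reveals the error.

```python
import random

random.seed(2)

TRUE_RATE = 0.20   # assumed true prevalence of some unverifiable condition
OVERREPORT = 0.15  # assumed chance a respondent without it claims it anyway

def survey(n):
    """Simulate n answers to an unverifiable yes/no question."""
    yes = 0
    for _ in range(n):
        truth = random.random() < TRUE_RATE
        # Misreporting: some respondents without the condition answer "yes".
        if truth or random.random() < OVERREPORT:
            yes += 1
    return yes / n

for n in (1_000, 100_000):
    print(f"n={n:>7}: reported rate {survey(n):.3f} (true rate {TRUE_RATE})")
```

Under these assumptions the reported rate settles near 0.32 rather than 0.20: increasing n, or making the survey mandatory, only pins down the distorted figure more tightly. Without an independent audit of the answers, the dataset gives no hint that it is biased.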
“Evidence-based policy-making requires just that -- evidence -- standard, reliable metrics whose quantification and legitimacy is widely agreed upon. In their absence, policy-making at all levels and in every sector will be as expensive as it is hopeful, while policy actors are forced to gingerly ‘guess and check’ over time. In the absence of good data, our ability to fully comprehend complex policy issues will grow anecdotal and inconsistent.”

Guess what? We have been making poke-and-pray policies for a long time using census datasets and other surveys. Why? Because this so-called evidence is unreliable; it is most often just hearsay, and as such, it should not be admissible in public policy formulation. An absence of data is better than a wealth of bad data. I am reminded of two quotes that appropriately describe how to approach a genuine lack of accurate information, and which are lessons to be heeded by long-form census proponents: “Knowledge of non-knowledge is power,” spoken by Deaner in Fubar 2, and Donald Rumsfeld’s “unknown unknowns” statement. When knowledge limitations are ignored, we end up with claims that probably also apply to studies founded upon much of the problematic long-form census data, such as Brian Fantana’s in Anchorman: The Legend of Ron Burgundy: “60% of the time it works all the time.” Thus, amid all these calls for evidence-based policy-making and claims that conservative administrations are anti-science, one actually finds the shoe is on the other foot: many of the so-called pro-science individuals are anti-science and/or pro-junk-science. It is, in fact, the Death of Evidence (or, more accurately, the Proliferation of Junk Evidence) that some nations are finally climbing out of.
Long-form census proponents are correct in stating that “[i]n the absence of good data, our ability to fully comprehend complex policy issues will grow anecdotal and inconsistent.” That has been the situation for some time (i.e., we have had bad data that many have claimed as good data), and it undercuts the proponents’ claims regarding the necessity of much of the long-form census data. Policy actors have always had to “gingerly ‘guess and check’ over time,” and unless we develop and deploy mass-scale mind-reading devices and/or a Big Brother-esque, all-knowing state whose knowledge is factually proven, this will likely always be the case. What is truly ridiculous is that we have been generating and using this junk data for so long. Coupled with its threats to privacy and liberty, the bad science behind much of the long-form census should help us put a stake through the heart of this government-mandated nonsense.
Sierra Rayne holds a Ph.D. in Chemistry and writes regularly on environment, energy, and national security topics. He can be found on Twitter at @srayne_ca