Demographic data on a survey: Why to collect it and what to do with it

December 06, 2017 | Pieta Blakely

Those of us who conduct surveys for program evaluation have probably all been told to collect some basic demographic data on every survey, but what data, and why? The most important thing you can do with your demographic data is to demonstrate that the people who responded to your survey represent, in some basic ways, the people that you want to generalize about.

For example, if you have surveyed students in a school where half the students are male and half are female, then you’ll want to demonstrate that the students who responded to your survey are roughly half male and half female. If all the respondents were female students, then you cannot draw conclusions about all students’ experiences. If you are implementing a survey in a neighborhood, you’ll want to ensure that the people who responded to the survey represent the diversity of the neighborhood. That might mean asking about the specific ethnicities and cultures that make up the community in addition to the standard racial/ethnic categories used on the census.
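As a rough illustration of that check, here is a minimal Python sketch that compares the share of each group among respondents with its share in the population. The function name, the data, and the 50/50 school are all made up for the example; in practice the population shares would come from enrollment records, census data, or a similar source.

```python
from collections import Counter


def compare_to_population(sample_values, population_shares):
    """Return {category: (sample_share, population_share, gap)}.

    sample_values: one demographic answer per respondent.
    population_shares: known share of each category in the population.
    """
    counts = Counter(sample_values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no responses to compare")
    comparison = {}
    for category, population_share in population_shares.items():
        sample_share = counts.get(category, 0) / total
        comparison[category] = (
            sample_share,
            population_share,
            sample_share - population_share,
        )
    return comparison


# Hypothetical data: 200 respondents from a school that is half female, half male.
respondents = ["female"] * 130 + ["male"] * 70
school = {"female": 0.5, "male": 0.5}

comparison = compare_to_population(respondents, school)
for category, (sample, population, gap) in comparison.items():
    print(f"{category}: sample {sample:.0%}, population {population:.0%}, gap {gap:+.0%}")
```

A gap this large (65% female respondents against 50% in the school) would be a signal to be careful about generalizing to all students. How big a gap is too big is a judgment call that this sketch does not try to settle.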

The other thing that you might want to do with your demographic data is to present the responses of different groups separately or compare them to each other. For example, do boys and girls feel differently about safety at their school? These choices should be guided by theory, previous research, or a question that you are testing. Is there a good reason to think that boys and girls perceive their safety differently? Are there previous studies or examples where that was an issue? If so, then it’s probably useful to disaggregate those responses.
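To make disaggregation concrete, here is a small Python sketch that averages one survey item separately for each group. The field names (`gender`, `feels_safe`), the 1 to 5 rating scale, and the responses themselves are hypothetical, not taken from any real survey.

```python
from collections import defaultdict
from statistics import mean


def mean_by_group(responses, group_key, value_key):
    """Return {group: (mean_value, number_of_answers)} for one survey item."""
    groups = defaultdict(list)
    for row in responses:
        value = row.get(value_key)
        if value is None:
            continue  # skipping a question is allowed, so leave blanks out
        group = row.get(group_key) or "no answer"
        groups[group].append(value)
    return {group: (mean(values), len(values)) for group, values in groups.items()}


# Hypothetical responses: "feels_safe" is rated from 1 (not at all) to 5 (very).
survey = [
    {"gender": "girl", "feels_safe": 4},
    {"gender": "girl", "feels_safe": 2},
    {"gender": "boy", "feels_safe": 5},
    {"gender": "boy", "feels_safe": 3},
    {"gender": "boy", "feels_safe": None},
]

by_gender = mean_by_group(survey, "gender", "feels_safe")
for group, (average, n) in by_gender.items():
    print(f"{group}: mean {average} (n={n})")
```

Reporting the number of answers next to each mean helps readers see when a group is too small to say much about.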

There are dozens of other pieces of demographic data that you might ask about, for example, family income, age, employment status, or area of residence. If these are important to understanding whether your sample represents your population, or important to your analysis, you should include them in your survey. But be cautious about asking for potentially sensitive information; if you don’t really need it, don’t ask.

There are several things to consider when asking for information on gender or sexual orientation. On the one hand, you want to ensure that respondents see themselves and their identities reflected in the form. Offering too few choices, or choices that are not appropriate to your audience, will alienate respondents, reduce your response rate, and reduce the quality of your data. If the number of responses in a single category turns out to be small, you can collapse categories for presentation. Similarly, you may want to ask for data in a more detailed way, and then collapse categories to report it to funders or government agencies.

On the other hand, do consider how much data you really need and the discomfort you might create for respondents in asking for it. For people who are going to check something other than “male” or “female”, the question might be sensitive or even scary. Giving too many answer choices increases the chances that there will be only one or two responses in a given category, leading to privacy concerns. If your organization has no plans to serve respondents with diverse genders differently or better, there might be no benefit to the respondents in collecting that data. A better way to ask this question might be to ask "which gender do you mostly identify with?" with response choices of: male, female, additional genders, and prefer not to say. Here is an excellent blog post on gender questions.
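One way to handle very small categories at the reporting stage is to fold them into a combined group, sketched below in Python. The threshold of five responses, the label for the combined group, and the counts are all assumptions made for illustration; your organization or funder may set a different minimum.

```python
from collections import Counter


def collapse_small_categories(values, min_count=5, combined_label="additional genders"):
    """Fold categories with fewer than min_count responses into one group.

    min_count=5 is an illustrative threshold, not a standard.
    """
    counts = Counter(values)
    collapsed = Counter()
    for category, n in counts.items():
        label = combined_label if n < min_count else category
        collapsed[label] += n
    return dict(collapsed)


# Hypothetical answers to a detailed gender question.
answers = (
    ["female"] * 40
    + ["male"] * 35
    + ["nonbinary"] * 3
    + ["agender"] * 2
    + ["genderqueer"] * 1
    + ["prefer not to say"] * 6
)

reported = collapse_small_categories(answers)
print(reported)
```

Note that the combined group can itself end up below the threshold, in which case it may be safer not to report that count at all. The detailed answers stay in the raw data; only the presentation is collapsed.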

Recently, researchers have moved to asking demographic questions at the end of the survey. This is for two main reasons. The first is fatigue. Sometimes respondents get tired of our questions and stop answering them, especially if they find the questions intrusive or offensive. Not answering should always be an option they feel comfortable using. I talk about that here. If the demographic questions are less important than other questions, you might want to ask those other questions first, rather than wearing out your respondents before they get to the important stuff.

The second reason is a concern about stereotype threat. This is a phenomenon where being reminded of their membership in a group can cause respondents to answer questions differently than they would have otherwise. This is particularly a concern for cognitive tasks like math problems, and probably less so for other kinds of questions, like ones about opinions or experiences. I prefer to be cautious and leave my demographic questions to the end unless they’re crucial for directing respondents to other questions.

In presenting your data, you may want to skip presenting the demographic data altogether. There is a tendency to present every single question on your survey with a bar chart or pie chart. But sometimes the demographic data doesn’t really tell us anything except how well the sample matched the population that you’re talking about. In that case, simply mention in your report or presentation that the sample was representative of the population, and point readers to the data in an appendix.

Tags: survey, methods