Exercise 1
Representing Groups Thoughtfully
Every decision about which variables to include, exclude, or lump together in your analysis shapes how your results can be interpreted, and how communities may be impacted.
These choices are never neutral: they can reinforce stereotypes, obscure important differences, or even lead to misuses of research findings.


Step 1: Read
Read one or two (light) readings
Step 2: Discuss
Apply discussion questions to a real or hypothetical research project
Step 3: Reflect
Use a worksheet to reflect on how this exercise applies to your work, and note key takeaways
Step 1.
Read
Read as a group.
Note: web versions include links to additional content.
Readings
Read the following readings as a group. You might take turns reading aloud or spend a few minutes reading quietly.
Step 2.
Discuss
Discuss a real or imaginary project
Discussion Prompts
Use the prompts below to guide your group's conversation.
You can focus on a real research project or make one up for this exercise.
Community Characteristics
- What are the characteristics of the group(s) or community(ies) you'll be studying? Who might be included in your study, even if they are not the main focus?
Consider the following factors:
Disease status.
- People who have a given disease
- People who may have a given disease
- People who have been treated for a given disease
- People who are more susceptible to a given disease than others
- People who are not known to have or are not known to be susceptible to a given disease ("healthy" or "resilient")
Access to healthcare.
- People with access to primary care
- People with access to both primary and specialty care
- People with access only to urgent or emergency care
- People without access to medical care
Financial barriers to accessing healthcare.
- People for whom cost of care does not significantly alter their healthcare decisions
- People for whom cost of care has some impact on their healthcare decisions
- People for whom cost of care is a primary driver of their healthcare decisions
Exploring Demographic Variables
- Which demographic variables or values will you use within your analysis? Please note that "demographic variables" is a broad term. Considering only the "usual suspects" (e.g., age, race, ethnicity, gender, sex) may exclude variables or values of particular importance to your analysis, such as access to healthcare.
- Why are you including them? For each variable, consider the following reasons. Are there any reasons you can think of that are NOT on this list?
- This variable is included in the dataset and leaving it out might weaken my analysis
- A funder, government agency, or other authority has recommended including this variable
- Other researchers have used this variable before
- I've used this variable in my own past research
- It's considered standard practice in my field to include this variable
- It relates directly to the specific characteristics I'm studying in this project
- Which demographic variables or values are you EXCLUDING? Jot them down for your own reference.
- Why are you excluding them? For each variable/value, consider the following reasons, and any reasons you can think of that are NOT on this list.
- This variable/value isn't relevant to my research question
- The sample size is too small for this variable/value
- One or more of the values for this variable have too few observations, and there isn't another value I can combine them with
- One or more of the variables have too few observations, and there isn't another variable I can combine them with
- The data for this variable/value was collected poorly or is incomplete
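Before deciding that a variable or value is "too small," it can help to look at the actual counts. The following is a minimal Python sketch of that check, not a recommended method: the function name `flag_small_values`, the `care_access` variable, the threshold of 10, and the records themselves are all hypothetical and invented for illustration. Missing values are counted under their own label so they stay visible.

```python
from collections import Counter


def flag_small_values(records, variable, min_count=10):
    """Count each value of `variable` and flag the ones seen fewer than `min_count` times.

    `min_count` is an illustrative threshold, not a field standard.
    Missing values are tallied as "(missing)" rather than silently dropped.
    """
    counts = Counter(
        rec.get(variable) if rec.get(variable) is not None else "(missing)"
        for rec in records
    )
    small = {value: n for value, n in counts.items() if n < min_count}
    return counts, small


# Hypothetical records, invented purely for illustration.
records = (
    [{"care_access": "primary"}] * 40
    + [{"care_access": "primary and specialty"}] * 25
    + [{"care_access": "urgent or emergency only"}] * 6
    + [{"care_access": "none"}] * 3
    + [{"care_access": None}] * 2
)

counts, small = flag_small_values(records, "care_access", min_count=10)
for value, n in counts.most_common():
    note = "  <- below threshold" if value in small else ""
    print(f"{value}: {n}{note}")
```

A list like this does not make the decision for you. It only shows who would be affected if a value were dropped or combined, which is a useful starting point for the discussion.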
Proxies and Stand-Ins
- Are any of your variables acting as proxies for something not in the dataset (e.g., healthcare costs as a stand-in for healthcare needs)?
- If so, why are you making this choice, and what are the risks of doing so?
Lumping and Splitting
Let's talk about how your study defines groups. These choices, whether groups are "lumped" together or "split" apart, can have real impacts on communities. Consider these prompts as you discuss.
- Think about the data you are planning to use. How might these data already have grouped people together ("lumped") or separated them ("split"), whether intentionally or unintentionally?
- Who has been lumped together or split apart?
- Do you think these choices were intentional or unintentional? Why?
- Why do you think these choices (intentional or unintentional) were made?
- What might this mean for your use of these data? Could it lead to any of the following?
- Some groups being made less visible?
- Some groups being made invisible altogether?
- Some groups being excluded from the research?
- Some groups being singled out or overused?
- Groups being lumped with people they don't identify with?
- Groups being split apart from people they do identify with?
- Could your research not be generalizable or transferable to some communities?
- Could some communities be unable to use your findings?
- Are you planning to make additional lumping decisions beyond what's in the dataset?
- If yes, which values and/or variables, and why? Consider the following reasons.
- I'm combining these values/variables because the sample size is too small otherwise, and I don't feel that the difference between these values/variables is critical to my analysis
- These groups have been combined before by a funder, government agency, or other authority
- Other researchers have combined these values/variables in the past
- Iâve combined these values/variables in my own past research
- Combining these values/variables is a standard approach recommended by a professional body
- These values/variables share characteristics that are directly relevant to my study
- Combining these groups makes the effect I'm studying easier to see in my analysis
- The dataset has combined one or more values/variables with others in a way I don't agree with, but I am unable to split them because those details are not available to me
- Do any of your demographic variables include an "other" category?
- If yes, who ends up in this category, and what does it mean for their visibility in your study?
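If you do lump small values into an "other" category, one way to keep the decision visible is to record exactly which values were combined and how many people each represented. The sketch below is one possible approach under stated assumptions: the function name `lump_small_values`, the threshold of 10, the label "Other", and the placeholder group names are all hypothetical.

```python
from collections import Counter


def lump_small_values(values, min_count=10, other_label="Other"):
    """Recode values seen fewer than `min_count` times to `other_label`.

    Returns the recoded list plus a record of which original values were
    lumped and their counts, so the choice can be reported in a methods
    section instead of disappearing into the recoded data.
    """
    counts = Counter(values)
    lumped = {value: n for value, n in counts.items() if n < min_count}
    recoded = [other_label if value in lumped else value for value in values]
    return recoded, lumped


# Placeholder group labels and counts, invented purely for illustration.
values = (
    ["Group A"] * 50
    + ["Group B"] * 30
    + ["Group C"] * 4
    + ["Group D"] * 2
)

recoded, lumped = lump_small_values(values, min_count=10)
print("After lumping:", dict(Counter(recoded)))
print("Combined into 'Other':", lumped)
```

Keeping the `lumped` record alongside the recoded data makes it easier to answer the question above: you can state who ended up in "Other" and how many people that was.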
Impacts on Communities
- How could your inclusion, exclusion, lumping, or "other" decisions affect communities?
- Some groups being made less visible?
- Some groups being made invisible altogether?
- Some groups being excluded from the research?
- Some groups being singled out or overused?
- Groups being lumped with people they don't identify with?
- Groups being split apart from people they do identify with?
- Could your research not be generalizable or transferable to some communities?
- Could some communities be unable to use your findings?
Fit for Purpose
- Overall, why do you think your dataset(s) and your choices about variables are the right and responsible way to answer your research question?
- How will you communicate these choices transparently to prevent people from misinterpreting or misusing your research?
You may find this reflection useful to adapt for materials like methods sections or a plain-language statement for community or public audiences.
Step 3.
Reflect
Document your takeaways
Note on versions:
Reflection Worksheets
Take a few minutes to reflect on this exercise using the worksheet below. Choose the version that best matches your role, or share one worksheet as a group. Jot down any insights, questions, or takeaways.
Next Steps
You've completed this exercise. Great work!




