Complete Document - Research Tools

Written by Bret Barrowman, Senior Specialist for Research and Evaluation, Evidence and Learning Practice at the International Republican Institute

 

Effective democracy, human rights, and governance programming requires practitioners to accurately assess underlying causes of information disorders and to evaluate the effectiveness of interventions to treat them. Research serves these goals at several points in the DRG program cycle: problem and context analysis, targeting, design and content development, monitoring, adaptation, and evaluation. 

Goals of Research 

Applying research in the DRG program cycle supports programs by fulfilling the scientific goals of description, explanation, and prediction. Description identifies characteristics of research subjects and general patterns or relationships. Explanation identifies cause and effect relationships. Prediction forecasts what might happen in the future. 

Research for Context Analysis and Design

Effective DRG programs to counter disinformation require the identification of a specific problem or set of problems in the information environment in a particular context. Key methods include landscape analysis, stakeholder analysis, political economy analysis, and the use of surveys or interviews to identify potential beneficiaries or particularly salient themes within a specific context. 

Sample general research questions:

  • What are the main drivers of disinformation in this context?
  • What are the incentives for key actors to perpetuate or mitigate disinformation in this context?
  • Through which medium is disinformation likely to have the greatest impact in this context?
  • What evidence suggests our proposed activity(ies) will mitigate the problem?
  • Which groups are the primary targets or consumers of disinformation in this context? 
  • Which key issues or social cleavages are most likely to be subjects of disinformation in this context?

Implementation Research

There are several research and measurement approaches available for practitioners to monitor activities related to information and disinformation, both for program accountability functions and for adaptation to changing conditions. Key methods include digital and analog media audience metrics, measurement of knowledge, attitudes, or beliefs with surveys or focus groups, media engagement metrics, network analysis, and A/B tests. Key research questions include:

  • How many people are engaging in program activities or interventions?
  • What demographic, behavioral, or geographic groups are engaging in program activities? Is the intervention reaching its intended beneficiaries? 
  • How are participants, beneficiaries, or audiences reacting to program activities or materials? How do these reactions differ across subgroups, and specifically marginalized groups?
  • Is one mode or message more effective than another in causing audiences to engage with information and/or share it with others? How do information uptake and sharing differ across subgroups? What are barriers to information or program uptake among marginalized groups?
  • What framing of content is most likely to reduce consumption of disinformation, or increase consumption of reliable information? For example, is a fact-checking message more likely to cause consumers to update their beliefs in the direction of truth, or does it cause retrenchment in belief in the original disinformation? Does this effect vary across subgroups?

Evaluation Research

DRG program and impact evaluation can identify and describe key results, assess or improve the quality of program implementation, identify lessons that might improve the implementation of similar programs, or attribute changes in key outcomes to a program intervention. Key methods include randomized evaluations and quasi- or non-experimental evaluations, including pre/post designs, difference-in-differences, statistical matching, comparative case studies, process tracing, and regression analysis. Key research questions include:

  • Are there observable outcomes associated with the program? 
  • Does a program or activity cause a result of interest? For example, did a media literacy program increase the capacity of participants to distinguish between true news and false news? Does a program cause unintended outcomes?
  • What is the size of the effect (i.e., impact) of an activity on an outcome of interest? 
  • What is the direction of the effect of an activity on an outcome of interest? For example, did a fact checking program decrease confidence in false news reports, or did it cause increased acceptance of those reports through backlash?

Recommendations

  • Specific research questions should drive the selection of research designs and data collection methods. Committing to a design or data collection method first limits the questions the researcher is able to answer.
  • Use a pilot-test-scale model for program activities or content. Using one or more of these research approaches, workshop interventions with small groups of respondents, and use pilot data to refine promising approaches before deploying to a larger set of beneficiaries.
  • Protect personally identifiable information (PII). All the data collection methods described in this section can collect information on personal characteristics, attitudes, beliefs, and willingness to engage in political action. Regardless of the method, researchers should make every attempt to secure informed consent to participate in research and should take care to secure and de-identify personal data.
  • Consider partnerships with research organizations, university labs, or individual academic researchers, who may have a comparative advantage in designing and implementing complex research designs, and who may have an interest in studying the effects of counter-disinformation programs. 

Effective democracy, human rights, and governance (DRG) programming to respond to disinformation requires practitioners to make accurate inferences about the underlying causes of information disorders and about the effects of their interventions. Programs to counter disinformation often rely on a research component to identify problems, to identify potential targets or beneficiaries of an intervention, to develop and adapt program content, to monitor implementation, and to evaluate results. This chapter will survey a broad menu of research tools and approaches for understanding disinformation and potential responses, with the goal of supporting DRG practitioners in designing, implementing, and evaluating programs based on the best available data and evidence. 

The sections that follow distinguish broadly between research approaches or designs and data collection methods.

Highlight


For the purposes of this guide, a research approach or research design refers to a method or set of methods that allow researchers or practitioners to make valid inferences about disinformation or programmatic responses. In other words, a research design is a method through which one can confidently and accurately answer specific research questions. On the other hand, data collection describes the ways in which researchers and practitioners collect the information needed to answer those research questions. For example, key informant interviews (KIIs) or in-depth interviews (IDIs) are data collection methods that may be used within several research designs.

To support DRG practitioners in developing evidence-based programs to counter disinformation, this chapter is structured according to stages in the program cycle – design, implementation, and evaluation. It provides examples of research approaches that can help answer questions for specific decisions at each stage. As a final note, the examples provided are suggestive, not exhaustive. Useful and interesting research and data collection methods, especially on information and disinformation, require thought, planning, and creativity. To develop a research approach that is most useful for a program, consider consulting or partnering early with internal experts including applied researchers and evaluators or external experts through one of many academic institutions that specialize in research on democracy and governance interventions.

Research Networks

EGAP: Evidence in Governance and Politics (EGAP) is a research, evaluation, and learning network with worldwide reach that promotes rigorous knowledge accumulation, innovation, and evidence-based policy in various governance domains, including accountability, political participation, mitigation of societal conflict, and reducing inequality. It does so by fostering academic-practitioner collaborations, developing tools and methods for analytical rigor, and training academics and practitioners alike, with an intensive focus in the Global South. Results from research are shared with policy makers and development agencies through regular policy fora, thematic and plenary meetings, academic practitioner events, and policy briefs.

J-PAL: The Abdul Latif Jameel Poverty Action Lab (J-PAL) is a global research center working to reduce poverty by ensuring that policy is informed by scientific evidence. Anchored by a network of 227 affiliated professors at universities around the world, J-PAL conducts randomized impact evaluations to answer critical questions in the fight against poverty. J-PAL translates research into action, promoting a culture of evidence-informed policymaking around the world. Their policy analysis and outreach help governments, NGOs, donors, and the private sector apply evidence from randomized evaluations to their work and contribute to public discourse around some of the most pressing questions in social policy and international development.

IPA: Innovations for Poverty Action (IPA) is a research and policy nonprofit that discovers and promotes effective solutions to global poverty problems. IPA brings together researchers and decision-makers to design, rigorously evaluate, and refine these solutions and their applications, ensuring that the evidence created is used to improve the lives of the world’s poor.

Political Violence FieldLab: The Political Violence FieldLab provides a home for basic and applied research on the causes and effects of political violence. The FieldLab provides students the opportunity to work on cutting-edge and policy-relevant questions in the study of political violence. Their projects involve close collaboration with government agencies and non-government organizations to evaluate the effects and effectiveness of interventions in contemporary conflict settings.

MIT GovLab: GovLab collaborates with civil society, funders, and governments on research that builds and tests theories about how innovative programs and interventions affect political behavior and make governments more accountable to citizens. They develop and test hypotheses about accountability and citizen engagement that contribute to theoretical knowledge and help practitioners learn in real time. Through integrated and sustained collaborations, GovLab works together with practitioners at every stage of the research, from theory building to theory testing.

DevLab@Duke: The DevLab@Duke is an applied learning environment that focuses on connecting social scientists at Duke who work in international development with the community of development practitioners to create rigorous programming, collect monitoring and evaluation data, and conduct impact evaluations of development projects. In addressing these goals, they bring together scholars and students attuned to the research frontier and with advanced capabilities in experimental and quasi-experimental impact evaluation designs, survey design and other data collection tools, and data analytics, including impact evaluation econometrics, web scraping and geospatial analysis.

Center for Effective Global Action (CEGA): CEGA is a hub for research on global development. Headquartered at the University of California, Berkeley, their large, interdisciplinary network–including a growing number of scholars from low and middle-income countries–identifies and tests innovations designed to reduce poverty and promote development. CEGA researchers use rigorous evaluations, tools from data science, and new measurement technologies to assess the impacts of large-scale social and economic development programs.

Citizens and Technology Lab: Citizens and Technology Lab does citizen science for the internet. They seek to enable anyone to engage critically with the tech tools and platforms they use, ask questions, and get answers. Working hand-in-hand with diverse communities and organizations around the world, they identify issues of shared concern (“effects”) related to digital discourse, digital rights and consumer protection. Their research methods can discover if a proposed effect is really happening, uncover the causes behind a systemic issue, and test ideas for creating change.

Stanford Internet Observatory: The Stanford Internet Observatory is a cross-disciplinary program of research, teaching, and policy engagement for the study of abuse in current information technologies, with a focus on social media. The Observatory was created to learn about the abuse of the internet in real time, to develop a novel curriculum on trust and safety that is a first in computer science, and to translate research discoveries into training and policy innovations for the public good.

Goals of Research

Description, Explanation, or Prediction? Applied research in the DRG program cycle can support programs by fulfilling one or more of the following scientific goals.

Description: Descriptive research aims to identify characteristics of research subjects at different levels of analysis (e.g., individual, group, organization, country, etc.). Descriptive research classifies or categorizes subjects or identifies general patterns or relationships. Examples of descriptive research in countering disinformation programs might include developing descriptive statistics in polling or survey data to identify key target groups, or analysis to identify key themes in media content.
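
As a minimal illustration of the first example, the Python sketch below shows how descriptive statistics by subgroup might be produced from polling data with the pandas library. The column names and survey records are entirely hypothetical.

# A minimal, hypothetical sketch: summarizing survey data by subgroup with pandas.
# Column names (age_group, region, shares_unverified_news) are illustrative only.
import pandas as pd

# Illustrative survey records; in practice these would come from a poll or survey export.
survey = pd.DataFrame({
    "age_group": ["18-29", "18-29", "30-49", "50+", "30-49", "50+"],
    "region": ["urban", "rural", "urban", "rural", "rural", "urban"],
    "shares_unverified_news": [1, 1, 0, 1, 0, 0],  # 1 = self-reports sharing unverified stories
})

# Share of respondents in each age group who report sharing unverified stories.
by_age = survey.groupby("age_group")["shares_unverified_news"].mean()
print(by_age)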

Explanation: Explanatory research aims to identify cause and effect relationships; it helps answer “why?” questions. It establishes causation through sequencing (as causes must precede their effects) and/or eliminating competing explanations through comparisons. This category may also include evaluation research in the program cycle, to the extent an evaluation attempts to determine the “impact” of a program on an outcome of interest (i.e., whether a program causes a result), or to determine which of several potential program approaches is most effective.

Highlight


Predictive Research in DRG Programming

Several USAID-funded initiatives use predictive research to help DRG practitioners better anticipate and respond to changes in political context. For example, the CEPPS Democratic Space Barometer forecasts democratic opening and closing over a two-year window. The Internews-led INSPIRES Consortium uses media scraping and machine learning to forecast closing civic space on a monthly basis.

Prediction: Predictive research uses descriptive or explanatory methods to forecast what might happen in the future. At a basic level, predictive research in the DRG program cycle might involve using findings from a program evaluation to adapt approaches to the next cycle or to another context. More systematic predictive research uses qualitative or quantitative methods to assign specific probabilities to events over a designated time, as in a weather forecast. 
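
The sketch below is a simplified, hypothetical illustration of this kind of forecasting: a logistic regression (via scikit-learn) fit to synthetic monthly context indicators to assign a probability to an event in the next period. The indicators, event definition, and data are invented for illustration only.

# A minimal sketch, assuming hypothetical monthly indicators, of assigning a
# probability to an event (e.g., a spike in disinformation activity) next period.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))             # e.g., 120 months of three context indicators
y = (X[:, 0] + rng.normal(size=120) > 1)  # 1 = event occurred that month (synthetic)

model = LogisticRegression().fit(X, y)
next_month = X[-1:].copy()                # stand-in for the latest observed indicators
print("Forecast probability of event:", model.predict_proba(next_month)[0, 1])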

Data sources and collection methods for disinformation research include key informant interviews (KIIs), focus groups, public opinion polls, surveys, audience metrics (analog and digital), web and social media scraping, and administrative data analysis (data collected and stored as part of the operations of organizations like governments, nonprofits, or firms). Other methods exist, but these are some of the key approaches explored further in this text.

Practitioners must make several key decisions in the counter-disinformation program design phase. Those decisions include identifying a specific set of problems the program will address, developing a logic through which the program will address that problem, selecting between alternative activities, and deciding who will be the primary targets or beneficiaries of those activities.

Highlight


Tool spotlight: Hewlett Foundation Literature Review

“The Hewlett Foundation commissioned this report to provide an overview of the current state of the literature on the relationship between social media; political polarization; and political “disinformation,” a term used to encompass a wide range of types of information about politics found online, including “fake news,” rumors, deliberately factually incorrect information, inadvertently factually incorrect information, politically slanted information, and “hyperpartisan” news.

The review of the literature is provided in six separate sections, each of which can be read individually but that cumulatively are intended to provide an overview of what is known—and unknown—about the relationship between social media, political polarization, and disinformation. The report concludes by identifying key research gaps in our understanding of these phenomena and the data that are needed to address them.”

 

Context Analysis and Problem Statements 

Effective DRG programs to counter disinformation require the identification of a specific problem or set of problems in the information environment in a particular context.

DRG practitioners rely on several research methods to identify priority issues, context-specific drivers of information disorders, perpetrators and targets of disinformation, and incentives to perpetuate or mitigate disinformation. Landscape and stakeholder analyses are approaches to answer key descriptive research questions about the information environment, including identifying important modes of communication, key media outlets, perpetrators and target audiences for disinformation, and key political issues or personalities that might be the subjects of disinformation. Of note, women and members of other marginalized groups have been victims of political and sexualized disinformation, online hate, and harassment. As such, DRG practitioners should also account for uniquely targeted disinformation aimed at marginalized populations globally by conducting qualitative, quantitative, and gender-sensitive, inclusive research in order to understand these important dynamics.

Highlight


Sample general research questions:

  • What are the main drivers of disinformation in this context?
  • What are the incentives for key actors to perpetuate or mitigate disinformation in this context?
  • Through which medium is disinformation likely to have the greatest impact in this context?
  • What evidence suggests our proposed activity(ies) will mitigate the problem?
  • What groups are the primary targets or consumers of disinformation in this context? 
  • What key issues or social cleavages are most likely to be subjects of disinformation in this context?

These methods may also be explanatory, inasmuch as they identify key causes or drivers of specific information disorders. 

As an exploratory option, key data collection methods often include key informant interviews (KII) with respondents identified through convenience or snowball sampling. Surveys and public opinion polls can also be valuable tools for understanding the media and information landscape. Survey questionnaire items on the media landscape can inform programming by identifying how most people get news on social or political events, what outlets are most popular among specific demographic or geographic groups, or which social or political issues are particularly polarizing. Respondents for surveys or polls, if possible, should be selected via a method of sampling that eliminates potential selection biases to ensure that responses are representative of a larger population of interest. Landscape and stakeholder analyses may also rely on desk research on primary and secondary sources, such as state administrative data (e.g. census data, media ownership records, etc.), journalistic sources like news or investigative reports, academic research, or program documents from previous or ongoing programs. 

Applied Political Economy Analysis (PEA) is a contextual research approach that focuses on identifying the incentives and constraints that shape the decisions of key actors in an information environment. This approach goes beyond technical solutions to information disorders to analyze why and how key actors might perpetuate or mitigate disinformation, and subsequently, how these social, political, or cultural factors may affect the implementation, uptake, or impact of programmatic responses. Like other context analysis approaches, PEA relies on both existing research gathered and analyzed through desk review and data collection of experiences, beliefs, and perceptions of key actors.

There are several research and measurement tools available to assist practitioners in monitoring activities related to information and disinformation. At a basic level, these tools support program staff and monitoring, evaluation, and learning (MEL) staff in performing an accountability function. However, these research tools also play an important role in adapting programming to changing conditions. Beyond answering questions about whether and to what extent program activities are engaging their intended beneficiaries, these research tools can help practitioners identify how well activities or interventions are performing so that implementers can iterate, as in an adaptive management or Collaborating, Learning, and Adapting (CLA) framework.

Program Monitoring 

(assess implementation, whether content is reaching desired targets, and whether targets are engaging with content)

Key Research Questions:

  • How many people are engaging in program activities or interventions?
  • What demographic, behavioral, or geographic groups are engaging in program activities? Is the intervention reaching its intended beneficiaries?
  • How are participants, beneficiaries, or audiences reacting to program activities or materials?
  • How does engagement or reaction vary across activity types?

Several tools are available to assist DRG practitioners in monitoring the reach of program activities and the degree to which audiences and intended beneficiaries are engaging program content. These tools differ according to the media through which information and disinformation, as well as counter-programming, are distributed. For analog media outlets like television and radio, audience metrics, including size, demographic composition, and geographic reach, may be available through the outlets themselves or through state administrative records. The usefulness and detail of this information depend on the capacity of the outlets to collect this information and their willingness to share it publicly. Local marketing or advertising firms may also be good sources of audience information. In some cases, the reach of television and/or radio may be modeled using information on the broadcast infrastructure.

Digital platforms provide a more accessible suite of metrics. Social media platforms like Twitter, Facebook, and YouTube have built-in analytical tools that allow even casual users to monitor post views and engagements (including “likes,” shares, and comments). Depending on the platform Application Programming Interface (API) and terms of service, more sophisticated analytical tools may be available. For example, Twitter’s API allows users to import large volumes of both metadata and tweet content, enabling users to monitor relationships between accounts and conduct content or sentiment analysis around specific topics. Google Analytics provides a suite of tools for measuring consumer engagement with advertising material, including behavior on destination websites. For example, these tools can help practitioners understand how audiences, having reached a resource or website by clicking on digital content (e.g., links embedded in tweets, Facebook posts, or YouTube videos), are spending time on the destination resources and what resources they are viewing, downloading, or otherwise engaging. Tracking click-throughs provides potential measures of destination behavior, not just beliefs or attitudes.
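
As a rough illustration, the Python sketch below pulls recent posts and their public engagement metrics through a platform API (here, the Twitter/X v2 recent-search endpoint). API endpoints, fields, rate limits, and access tiers change frequently, so treat the URL, query, parameters, and placeholder bearer token as assumptions to verify against the platform's current documentation and terms of service.

# A hedged sketch of pulling engagement metrics from a platform API.
# The endpoint and fields reflect the Twitter/X v2 recent-search API and may
# require elevated or paid access; BEARER_TOKEN is a hypothetical credential.
import requests

BEARER_TOKEN = "YOUR_TOKEN_HERE"
url = "https://api.twitter.com/2/tweets/search/recent"
params = {
    "query": "election fact check lang:en -is:retweet",  # illustrative query
    "tweet.fields": "public_metrics,created_at",
    "max_results": 100,
}
headers = {"Authorization": f"Bearer {BEARER_TOKEN}"}

resp = requests.get(url, params=params, headers=headers)
resp.raise_for_status()
tweets = resp.json().get("data", [])

# Simple reach/engagement summary across the returned sample.
likes = sum(t["public_metrics"]["like_count"] for t in tweets)
retweets = sum(t["public_metrics"]["retweet_count"] for t in tweets)
print(f"{len(tweets)} tweets, {likes} likes, {retweets} retweets")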

 

Workshopping Content: Pilot-Test-Scale 

Determining the content of programmatic activities is a key decision point in any program cycle. With respect to counter-disinformation programs, implementers should consider how the messenger, mode, and content of an intervention are likely to influence uptake and engagement by target groups with that content, and whether the material is likely to change beliefs or behavior. With this in mind, workshopping and testing counter-disinformation content throughout the program implementation phase can help implementers identify which programmatic approaches are working, as well as how and whether to adapt content in response to changing conditions.

Key Research Questions:

  • What modes or messengers are most likely to increase content uptake in this context? For example, is one approach more effective than another in causing the audience to engage with information and/or share it with others?
  • What framing of content is most likely to reduce consumption of disinformation, or increase consumption of true information in this context? For example, is a fact-checking message more likely to cause consumers to update their beliefs in the direction of truth, or does it cause retrenchment in belief in the original disinformation?

Several data collection methods allow DRG practitioners to workshop the content of interventions with small numbers of potential beneficiaries before scaling activities to larger audiences. Focus groups (scientifically sampled, structured, small group discussions) are used regularly both in market research and DRG programs to elicit in-depth reactions to test products. This format allows researchers to observe spontaneous reactions to prompts and probe respondents for more information, as opposed to surveys, which may be more broadly representative, but rely on respondents selecting uniform and predetermined response items that do not capture as much nuance. Focus groups are useful for collecting initial impressions about a range of alternatives for potential program content before scaling activities to a broader audience.

A/B tests are a more rigorous method for determining what variations in content or activities are most likely to achieve desired results, especially when alternatives are similar and differences between them are likely to be small. A/B tests are a form of randomized evaluation in which a researcher randomly assigns members of a pool of research participants to receive different versions of content. For example, product marketing emails or campaign fundraising solicitations might randomly assign a pool of email addresses to receive the same content under one of several varying email subjects. Researchers then measure differences between each of these experimental groups on the same outcomes, which for digital content often includes engagement rates, click-throughs, likes, shares, and/or comments.
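
As a brief sketch of the analysis step, the Python snippet below compares click-through rates between two randomly assigned email variants using a two-proportion z-test from statsmodels. The counts are invented; in practice they would come from the platform's engagement metrics.

# A minimal sketch of analyzing a two-arm A/B test on click-through rates.
# The counts are illustrative; the two-proportion z-test is a standard choice here.
from statsmodels.stats.proportion import proportions_ztest

clicks = [480, 530]          # clicks for version A and version B
recipients = [10000, 10000]  # emails (or impressions) randomly assigned to each version

stat, p_value = proportions_ztest(count=clicks, nobs=recipients)
print(f"z = {stat:.2f}, p = {p_value:.3f}")  # a small p-value suggests a real difference between versions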

 

Highlight


Mode: The mechanisms through which programmatic content is delivered (e.g. in person, written materials, television, radio, social media, email, SMS, etc.)

Highlight


Because participants are randomly assigned to receive different variations, the researcher can confidently conclude that any differences in these outcomes can be attributed to the content variation.

Social media platforms have used A/B testing to optimize platform responses to misinformation. In other cases, researchers or technology companies themselves have experimented with variations of political content labels to determine whether these tags affect audience engagement. Similarly, DRG programs might use A/B testing to optimize digital counter-disinformation content, exploring, for instance, how different framings or endorsers of fact-checking messages affect audience beliefs.

 

Highlight


Tools Spotlight: Content and Message Testing Tools

Facebook: “A/B testing lets you change variables, such as your ad creative, audience, or placement to determine which strategy performs best and improve future campaigns. For example, you might hypothesize that a custom audience strategy will outperform an interest-based audience strategy for your business. An A/B test lets you quickly compare both strategies to see which one performs best.”

RIWI: “Respondents are randomly assigned to a treatment or control group to determine the impact of different concepts, videos, ads or phrases. All groups will see identical initial questions, followed by treatment group(s) receiving a developed message. After the treatment, all respondents will be asked questions to determine the resonance and engagement of the message or to measure behavioral changes (assessed post-treatment) between groups.”

GeoPoll: “GeoPoll works with leading global brands to test new concepts through video and picture surveys and mobile-based focus groups. Using GeoPoll’s research capabilities and large panel of respondents, brands can reach their target audience and gather much-needed data on what messaging is most effective, how new products should be marketed, how consumers will react to new products, and more.”

Mailchimp: “A/B testing campaigns test different versions of a single email to see how small changes can have an impact on your results. Choose what you want to test, like the subject line or content, and compare results to find out what works and what doesn't work for your audience.”

 

Evaluation Research

Evaluation of DRG programs can identify and describe key results, assess or improve the quality of program implementation, identify lessons that might improve the implementation of similar programs, or attribute changes in key outcomes to a program intervention. This section generally focuses on the last type of evaluation: impact evaluation, or determining the extent to which a program contributed to changes in outcomes of interest.

Attributing observed results to programs is perhaps the most difficult research challenge in the DRG program cycle. However, there are several evaluation research designs that can help DRG practitioners determine whether programs have an effect on an outcome of interest, whether programs cause unintended outcomes, which of several alternatives is more likely to have had an effect, whether that effect is positive or negative, and how large that effect might be. Often, these methods can be used within the program cycle to optimize activities, especially within a CLA, adaptive management, or pilot-test-scale framework. 

Programs to counter disinformation can take many forms with many possible intended results, ranging from small-scale trainings of journalists or public officials, to broader media literacy campaigns, to mass communications such as fact-checking or rating media outlets. There is no one-size-fits-all evaluation research approach that will work for every disinformation intervention. DRG program designers and implementers should consider consulting with internal staff and applied researchers, external evaluators, or academic researchers to develop an evaluation approach that answers research questions of interest to the program, accounting for practical constraints in time, labor, budget, scale, and M&E capacity. 

Key Research Questions:

  • Does a program or activity cause a measurable change in an outcome of interest? For example, did a media literacy program increase the capacity of participants to distinguish between true news and false news? Does a program cause unintended outcomes?
  • What is the size of the effect (i.e., impact) of an activity on an outcome of interest?
  • What is the direction of the effect of an activity on an outcome of interest? For example, did a fact checking program decrease confidence in false news reports, or did it cause increased acceptance of those reports through backlash?

Randomized or Experimental Approaches

Randomized evaluations (also commonly called randomized controlled trials (RCTs) or field experiments) are often referenced as the gold standard for causal inference – determining whether and how an intervention caused an outcome of interest. Where they are feasible logistically, financially, and ethically, RCTs are the best available method for causal inference because they control for confounding variables – factors other than the intervention that might have caused the observed outcome. RCTs control for these alternative explanations by randomly assigning participants to one or more “treatment” groups (in which they receive a version of the intervention in question) or a “comparison” or “control” group (in which participants receive no intervention or placebo content). Since participants are assigned randomly to treatment or control, any observed differences in outcomes between those groups can be attributed to the intervention itself. In this way, RCTs can help practitioners and researchers estimate the effectiveness of an intervention.
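
A minimal sketch of this logic, using synthetic data: because assignment is random, a simple comparison of mean outcomes (for example, media literacy quiz scores) between treatment and control estimates the program's effect. The numbers below are illustrative only.

# A minimal sketch, on synthetic data, of estimating a treatment effect in a
# randomized evaluation by comparing mean outcomes across arms.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
treatment = rng.normal(loc=7.2, scale=2.0, size=500)  # quiz scores of randomly assigned participants
control = rng.normal(loc=6.5, scale=2.0, size=500)    # quiz scores of the control group

effect = treatment.mean() - control.mean()            # estimated average treatment effect
t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"Estimated effect: {effect:.2f} points (p = {p_value:.3f})")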

The costs and logistical commitments for a randomized impact evaluation can be highly variable, depending in large part on the costs of outcome data collection. However, informational interventions, including those intended to counter disinformation, may be particularly amenable to randomized evaluations, as digital tools can support less expensive data collection than face-to-face methods like interviews or in-person surveys. Regardless of data collection methods, however, randomized evaluations require significant technical expertise and logistical planning, and will not be appropriate for every program, especially those that operate at relatively small scale, since randomized evaluations require large numbers of units of observation in order to identify statistically significant differences. Other impact evaluation methods differ in how they approximate randomization to measure the effect of interventions on observed outcomes, and may be more appropriate for certain program designs.
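
A quick power calculation can indicate whether a program is large enough for a randomized evaluation before committing to one. The sketch below uses statsmodels to solve for the sample size per arm needed to detect a small standardized effect; the effect size, significance level, and power targets are illustrative assumptions.

# A minimal power-calculation sketch: how many participants per arm are needed
# to detect a "small" standardized effect (Cohen's d = 0.2) at the conventional
# 5% significance level with 80% power.
from statsmodels.stats.power import TTestIndPower

n_per_arm = TTestIndPower().solve_power(effect_size=0.2, alpha=0.05, power=0.8)
print(f"Roughly {n_per_arm:.0f} participants needed in each arm")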

 

Highlight


For a comprehensive guide on using randomized evaluations for causal inference in development programming, see J-PAL’s Research Resources.

In 2020, RAND Corporation researchers, in partnership with IREX’s Learn2Discern program in Ukraine, conducted a randomized controlled trial to estimate the impact both of a Russian disinformation campaign and of a programmatic response that included content labeling and media literacy interventions. The experiment found that Russian propaganda produced emotional reactions and social media engagement among strong partisans, but that those effects were mitigated by labeling the source of the content and by showing recipients a short video on media literacy.

Quasi-Experimental and Non-Experimental Approaches

Researchers and evaluators may employ quasi-experimental or non-experimental approaches when random assignment to treatment and control is impractical or unethical. As the name suggests, these research designs attempt to attribute changes in outcomes to interventions by approximating random assignment to treatment and control conditions through comparisons. In most cases, this approximation involves collecting data on a population that did not participate in a program, but which is plausibly similar to program participants in other respects. Perhaps the most familiar of these methods for DRG practitioners is a pre-/post-test design, in which program participants are surveyed or tested on the same set of questions both prior to and following their participation in the program. For example, participants in a media literacy program might take a quiz that asks them to distinguish between true and false news, both before and after their participation in the program. In this case, the pre-test measures that capacity for an approximation of a “control” or “comparison” group, and the post-test measures it for a “treatment” group of participants who have received the program. Any increase in the capacity to distinguish true and false news is attributed to the program. Structured comparative case studies and process tracing are examples of non-experimental designs that control for confounding factors through across-case comparisons or through comparison within the same case over time.
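
A minimal sketch of that pre/post comparison appears below, using synthetic quiz scores for the same participants measured before and after a hypothetical media literacy program; a paired t-test checks whether the average change is distinguishable from zero. As the table below notes, attributing that change to the program rests on strong assumptions.

# A minimal pre/post sketch on synthetic data: the same 200 participants take the
# same quiz before and after the program; a paired t-test assesses the change.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pre = rng.normal(loc=5.8, scale=1.5, size=200)         # pre-test: items answered correctly
post = pre + rng.normal(loc=0.6, scale=1.0, size=200)  # post-test for the same participants

t_stat, p_value = stats.ttest_rel(post, pre)           # paired test: same people, two time points
print(f"Average change: {np.mean(post - pre):.2f} items (p = {p_value:.3f})")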

There are a variety of quasi-experimental and observational research methods available for program impact evaluation. The choice of these tools to evaluate the impact of a program depends on available data (or capacity to collect necessary data) and the assumptions that are required to identify reliable estimates of program impact. This table, reproduced in its entirety with the written consent of the Abdul Latif Jameel Poverty Action Lab, provides a menu of these options with their respective data collection requirements and assumptions.

 

Randomization

Randomized Evaluation / Randomized Controlled Trial
  • Description: Measure the differences in outcomes between randomly assigned program participants and non-participants after the program took effect.
  • Assumptions (and how demanding are they?): The outcome variable is only affected by program participation itself, not by assignment to participate in the program or by participation in the randomized evaluation itself. Examples of such confounding effects could be information effects, spillovers, or experimenter effects. As with other methods, the sample size needs to be large enough so that the two groups are statistically comparable; the difference being that the sample size is chosen as part of the research design.
  • Required data: Outcome data for randomly assigned participants and non-participants (the treatment and control groups).

Basic non-experimental comparison methods

Pre-Post
  • Description: Measure the differences in outcomes for program participants before the program and after the program took effect.
  • Assumptions (and how demanding are they?): There are no other factors (including outside events, a drive to change by the participants themselves, altered economic conditions, etc.) that changed the measured outcome for participants over time besides the program. In stable, static environments and over short time horizons, the assumption might hold, but it is not possible to verify that. Generally, a diff-in-diff or RDD design is preferred (see below).
  • Required data: Data on outcomes of interest for program participants before program start and after the program took effect.

Simple Difference
  • Description: Measure the differences in outcomes between program participants after the program took effect and another group who did not participate in the program.
  • Assumptions (and how demanding are they?): There are no differences in the outcomes of participants and non-participants except for program participation, and both groups were equally likely to enter the program before it started. This is a demanding assumption. Non-participants may not fulfill the eligibility criteria, live in a different location, or simply see less value in the program (self-selection). Any such factors may be associated with differences in outcomes independent of program participation. Generally, a diff-in-diff or RDD design is preferred (see below).
  • Required data: Outcome data for program participants as well as another group of non-participants after the program took effect.

Differences in Differences
  • Description: Measure the differences in outcomes for program participants before and after the program relative to non-participants.
  • Assumptions (and how demanding are they?): Any other factors that may have affected the measured outcome over time are the same for participants and non-participants, so they would have had the same time trajectory absent the program. Over short time horizons and with reasonably similar groups, this assumption may be plausible. A "placebo test" can also compare the time trends in the two groups before the program took place. However, as with "simple difference," many factors that are associated with program participation may also be associated with outcome changes over time. For example, a person who expects a large improvement in the near future may not join the program (self-selection).
  • Required data: Data on outcomes of interest for program participants as well as another group of non-participants before program start and after the program took effect.

More non-experimental methods

Multivariate Regression/OLS
  • Description: The "simple difference" approach can be (and in practice almost always is) carried out using multivariate regression. Doing so allows accounting for other observable factors that might also affect the outcome, often called "control variables" or "covariates." The regression filters out the effects of these covariates and measures differences in outcomes between participants and non-participants while holding the effect of the covariates constant.
  • Assumptions (and how demanding are they?): Besides the effects of the control variables, there are no other differences between participants and non-participants that affect the measured outcome. This means that any unobservable or unmeasured factors that do affect the outcome must be the same for participants and non-participants. In addition, the control variables cannot in any way themselves be affected by the program. While the addition of covariates can alleviate some concerns with taking simple differences, limited available data in practice and unobservable factors mean that the method has similar issues as simple difference (e.g., self-selection).
  • Required data: Outcome data for program participants as well as another group of non-participants, as well as "control variables" for both groups.

Statistical Matching
  • Description: Exact matching: participants are matched to non-participants who are identical based on "matching variables" to measure differences in outcomes. Propensity score matching uses the control variables to predict a person's likelihood to participate and uses this predicted likelihood as the matching variable.
  • Assumptions (and how demanding are they?): Similar to multivariate regression: there are no differences between participants and non-participants with the same matching variables that affect the measured outcome. Unobservable differences are the main concern in exact matching. In propensity score matching, two individuals with the same score may be very different even along observable dimensions. Thus, the assumptions that need to hold in order to draw valid conclusions are quite demanding.
  • Required data: Outcome data for program participants as well as another group of non-participants, as well as "matching variables" for both groups.

Regression Discontinuity Design (RDD)
  • Description: In an RDD design, eligibility to participate is determined by a cutoff value in some order or ranking, such as income level. Participants on one side of the cutoff are compared to non-participants on the other side, and the eligibility criterion is included as a control variable (see above).
  • Assumptions (and how demanding are they?): Any difference between individuals below and above the cutoff (participants and non-participants) vanishes closer and closer to the cutoff point. A carefully considered regression discontinuity design can be effective. The design uses the "random" element that is introduced when two individuals who are similar to each other according to their ordering end up on different sides of the cutoff point. The design accounts for the continual differences between them using control variables. The assumption that these individuals are similar to each other can be tested with observables in the data. However, the design limits the comparability of participants further away from the cutoff.
  • Required data: Outcome data for program participants and non-participants, as well as the "ordering variable" (also called "forcing variable").

Instrumental Variables
  • Description: The design uses an "instrumental variable" that is a predictor for program participation. The method then compares individuals according to their predicted participation, rather than actual participation.
  • Assumptions (and how demanding are they?): The instrumental variable has no direct effect on the outcome variable; its only effect is through an individual's participation in the program. A valid instrumental variable design requires an instrument that has no relationship with the outcome variable. The challenge is that most factors that affect participation in a program for otherwise similar individuals are also in some way directly related to the outcome variable. With more than one instrument, the assumption can be tested.
  • Required data: Outcome data for program participants and non-participants, as well as an "instrumental variable."

Note: From Sautmann, Anja, and Abdul Latif Jameel Poverty Action Lab (J-PAL). 2019. “Impact Evaluation Methods.” J-PAL Publication. Last modified 2020.
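
To make one row of the table concrete, the sketch below estimates a difference-in-differences effect on synthetic panel data with an ordinary least squares regression in statsmodels; the coefficient on the participant-by-post interaction is the estimated program effect. The data and the built-in effect of roughly 0.5 are invented for illustration.

# A minimal difference-in-differences sketch on synthetic data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 400
df = pd.DataFrame({
    "participant": np.tile([0, 1], n // 2),  # 1 = in the program, 0 = comparison group
    "post": np.repeat([0, 1], n // 2),       # 0 = before the program, 1 = after
})
# Outcome with a common time trend plus a program effect of roughly 0.5 for participants after the program.
df["outcome"] = (0.3 * df["post"] + 0.5 * df["participant"] * df["post"]
                 + rng.normal(scale=1.0, size=n))

model = smf.ols("outcome ~ participant * post", data=df).fit()
print(model.params["participant:post"])  # difference-in-differences estimate of the program effect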

Media Monitoring and Content Analysis

Media monitoring and content analysis approaches generally aim to answer research questions about whether, how, or why interventions change audience engagement with information or the nature or quality of the information itself. For example, a fact-checking program might hypothesize that correcting disinformation should result in less audience engagement with outlets for disinformation on social media, as measured by views, likes, shares, or comments.

Several tools are available to help DRG practitioners and researchers identify changes in media content. Content analysis is a qualitative research approach through which researchers can identify key themes in written, audio, or video material, and whether those themes change over time. Similarly, sentiment analysis can help identify the nature of attitudes or beliefs around a theme. 

Both content and sentiment analysis can rely on human or machine-assisted coding and should be conducted at multiple points in the program cycle, in conjunction with other evaluation research designs, to support project impact evaluation.
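
As one example of machine-assisted coding, the sketch below scores a few invented post texts with NLTK's VADER sentiment lexicon. VADER is English-only and tuned to social media text; other languages or more nuanced themes typically require different models or human coders.

# A minimal machine-assisted sentiment analysis sketch using NLTK's VADER lexicon.
import nltk
nltk.download("vader_lexicon", quiet=True)
from nltk.sentiment import SentimentIntensityAnalyzer

posts = [  # invented example posts
    "The fact-checkers exposed this fake story quickly.",
    "This outlet keeps spreading lies and hate.",
]
analyzer = SentimentIntensityAnalyzer()
for post in posts:
    score = analyzer.polarity_scores(post)["compound"]  # ranges from -1 (negative) to +1 (positive)
    print(f"{score:+.2f}  {post}")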

 

Highlight


Research Spotlight: IREX Learn2Discern Quasi-Experimental Impact Evaluation

From October 2015 to March 2016, IREX implemented Learn2Discern, a large-scale media literacy program in Ukraine, in collaboration with the Academy of Ukrainian Press and StopFake. As part of the program, IREX conducted a quasi-experimental impact evaluation using statistical matching to compare program participants to non-participants. The study found that program participants were:

  • 28% more likely to demonstrate sophisticated knowledge of the news media industry
  • 25% more likely to self-report checking multiple news sources
  • 13% more likely to correctly identify and critically analyze a fake news story
  • 4% more likely to express a sense of agency over what news sources they can access.

Donors and partners implementing countering disinformation programs should consider these quasi-experimental methods to evaluate the direction and magnitude of program impacts on outcomes of interest, particularly where random assignment to treatment and control is not feasible.

 

Highlight


Project Spotlight: IRI Beacon

The Beacon Project’s interventions are informed through rigorous public opinion and media monitoring research, which is used to equip members of the Beacon Network with the tools and data to conduct in-depth analysis of malign narratives and disinformation campaigns. In 2015, the Beacon Project developed >versus<, a media monitoring tool used by in-house experts and media monitors across Europe to track malign narratives and disinformation campaigns in the online media space and analyze their dynamics and how they are discussed online.

Network Analysis

Network analysis is a method for understanding how and why the structure of relationships between actors affects an outcome of interest. Network analysis is a particularly useful research method for countering disinformation programs because it allows analysts to visualize and understand how information is disseminated through online networks, including social media platforms, discussion boards, and other digital communities. By synthesizing information on the number of actors, the frequency of interactions between actors, the quality or intensity of interactions, and the structure of relationships, network analysis can help researchers and practitioners identify key channels for the propagation of disinformation, the direction of transmission of information or disinformation, clusters denoting distinct informational ecosystems, and whether engagement or amplification is genuine or artificial. In turn, network metrics can help inform the design, content, and targeting of program activities. To the extent analysts can collect network data over time, network analysis can also inform program monitoring and evaluation.

Data collection tools for network analysis depend on the nature of the network generally, and the network platform specifically. Network analysis can be conducted on offline networks where researchers have the capacity to collect data using standard face-to-face, telephone, computer-assisted, or SMS survey techniques. In these cases, researchers have mapped offline community networks using survey instruments that ask respondents to list individuals or organizations that are particularly influential, or whom they might approach for a particular task. Researchers can then map networks by aggregating and coding responses from all community respondents. In this way, researchers might determine which influential individuals in a community might be nodes for the dissemination of information, particularly in contexts where people rely largely on family and friends for news or information. 

However, depending on APIs and terms of service, digital platforms such as social media can reduce the costs of network data collection. With dedicated tools, including social network analysis software, researchers can analyze and visualize relationships between users, including content engagement, following relationships, and liking or sharing. These tools can provide practitioners with an understanding of the structure of online networks, and in conjunction with content analysis tools, how network structure interacts with particular kinds of content.
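
A minimal sketch of this kind of analysis with the networkx library appears below, using a tiny invented amplification network; in practice the edge list would be built from platform data on who retweets, mentions, or shares whom.

# A minimal network analysis sketch on a tiny, made-up amplification network.
import networkx as nx

edges = [  # (source account, account it amplifies)
    ("acct_a", "outlet_1"), ("acct_b", "outlet_1"), ("acct_c", "outlet_1"),
    ("acct_c", "outlet_2"), ("acct_d", "outlet_2"),
]
g = nx.DiGraph(edges)

# In-degree centrality flags accounts whose content is most widely amplified.
central = nx.in_degree_centrality(g)
print(sorted(central.items(), key=lambda kv: kv[1], reverse=True)[:3])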

Highlight


Tool Spotlight: IFES/NDI VAWIE-Online Social Media Analysis Tool

Information and Communications Technologies (ICTs) have created new vehicles for violence against women in elections (VAWIE), which are compounded by the anonymity and scale that online media platforms provide. A new tool from the United States Agency for International Development (USAID), International Foundation for Electoral Systems (IFES), and National Democratic Institute (NDI) offers an adaptable method to measure the gendered aspects of online abuse and understand the drivers of this violence. The VAWIE-Online Social Media Analysis Tool can be used by actors from across a range of professions who are concerned by hateful and violent speech online and are motivated to end it.

 

Highlight


Program/Tool Spotlight: NDI Data Analytics for Social Media Monitoring

NDI seeks to empower partners to leverage technology to strengthen democracy. This means harnessing technology’s potential to promote information integrity and help build inclusive democracies, while also mitigating the harm posed by disinformation, online influence campaigns, hate speech, harassment, and violence.

For that reason, NDI developed “Data Analytics for Social Media Monitoring,” a guide for democracy activists and researchers.

This new guide is designed to help democracy practitioners better understand social media trends, content, data, and networks. By sharing lessons learned and best practices from across our global network, we hope to empower our partners to make democracy work online by helping them:

  • Collaborate with local, national, or international partners;
  • Understand different methods of data collection;
  • Make the best use of mapping and data visualization;
  • Analyze the online ecosystem;
  • Detect malicious or manipulated content and its source;
  • Understand available tools for all aspects of social media monitoring; and
  • Know how to respond with data, methods, research, and more through social media.

 

Highlight


Program Spotlight: Detecting Digital Fingerprints: Tracing Chinese Disinformation in Taiwan.

In June 2019, with the 2018 local elections as a point of reference, Graphika, Institute for the Future’s (IFTF) Digital Intelligence Lab, and the International Republican Institute (IRI) embarked on a research project to comprehensively study the online information environment in the lead-up to, during, and in the aftermath of Taiwan’s January 2020 elections, with an awareness of the 2018 precedents and an eye for potential similar incidents throughout this election cycle. Graphika and DigIntel monitored and collected data from Facebook and Twitter, and investigated leads on several other social media platforms, including Instagram, LINE, PTT, and YouTube. IRI supported several Taiwanese organizations who archived and analyzed data from content farms and the island’s most popular social media platforms. The research team visited Taiwan regularly, including during the election, to speak with civil society leaders, academics, journalists, technology companies, government officials, legislators, the Central Election Commission, and political parties. The goal was to understand the online disinformation tactics, vectors, and narratives used during a political event of critical importance to Beijing’s strategic interests. By investing in the organizations investigating and combating Chinese-language disinformation and CCP influence operations, they hoped to increase the capacity of the global disinformation research community to track and expose this emerging threat to information and democratic integrity.

Recommendations

  • Develop research questions first, research designs second, and data collection methods and instruments third. Research designs and data collection methods should be selected to answer the questions that are most relevant for the context, the program, and its measurement needs. Committing to a research method or data collection method before scoping your research question will limit what can be answered.
  • In the implementation phase, consider a pilot-test-scale model for program activities. Using one or more of the outlined research approaches, workshop content with small groups of respondents, and use pilot data to refine more promising content before deploying activities to a larger set of beneficiaries.
  • Protect personally identifiable information (PII). All of the data collection methods described in this section, from interviews, to surveys, to network data and social media analytics, can collect information on intimate and private personal characteristics, including demographic data, attitudes, beliefs, and willingness to engage in political action. Regardless of the selected methodology, researchers should make every attempt to secure informed consent to participate in research, and should take care to secure and de-identify personal data.
  • Consider partnerships with research organizations, university labs, or individual academic researchers, who may have a comparative advantage in designing and implementing complex research designs, and who may have an interest in studying the effects of counter-disinformation programs.