External Validity | Types, Threats & Examples

External validity is the extent to which you can generalise the findings of a study to other situations, people, settings, and measures. In other words, can you apply the findings of your study to a broader context?

The aim of scientific research is to produce generalisable knowledge about the real world. Without high external validity, you cannot apply results from the laboratory to other people or the real world.

In qualitative studies, external validity is referred to as transferability.

Types of external validity

There are two main types of external validity: population validity and ecological validity.

Population validity

Population validity refers to whether you can reasonably generalise the findings from your sample to a larger group of people (the population).

Population validity depends on the choice of population and on the extent to which the study sample mirrors that population. Non-probability sampling methods are often used for convenience. With this type of sampling, the generalisability of results is limited to populations that share similar characteristics with the sample.

Example: Low population validity
You want to test the hypothesis that people tend to perceive themselves as more intelligent than others in terms of academic abilities. Your target population is the 10,000 undergraduate students at your university.

You recruit over 200 participants. They are science and engineering students; most of them are British, male, 18–20 years old, and from a high socioeconomic background. In a laboratory setting, you administer a mathematics and science test and then ask them to rate how well they think they performed. You find that the average participant believes they are smarter than 66% of their peers.

Can you conclude that most people believe themselves to be much better than others at maths and science?

Here, your sample is not representative of the whole population of students at your university. The findings can only reasonably be generalised to populations that share characteristics with the participants (e.g., university-educated men studying STEM subjects).

For higher population validity, your sample would need to include people with different characteristics (e.g., women, nonbinary people, and students from different fields, countries, and socioeconomic backgrounds).
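One way to build such a sample is proportionate stratified random sampling: split the sampling frame into groups and draw randomly within each group in proportion to its size. The Python sketch below is illustrative only. The function name `stratified_sample`, the three fields of study, and their sizes are assumptions invented for this example, not details of the study described above.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(population, stratum_of, n, seed=0):
    """Draw a proportionate stratified random sample of roughly n units.

    Each stratum contributes units in proportion to its share of the
    population. Because allocations are rounded, the total can differ
    slightly from n when the shares do not divide evenly.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)

    total = len(population)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / total)
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical sampling frame of 10,000 students (field sizes are made up).
fields = ["STEM"] * 3000 + ["Humanities"] * 4000 + ["Social sciences"] * 3000
students = [{"id": i, "field": field} for i, field in enumerate(fields)]

sample = stratified_sample(students, lambda s: s["field"], n=200)
field_counts = Counter(s["field"] for s in sample)
```

With these made-up numbers, the 200 sampled students split 60/80/60 across the three fields, mirroring the 30/40/30 split in the frame. In practice you would usually stratify on several characteristics at once (e.g., gender and socioeconomic background).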

Samples like this one, from Western, educated, industrialised, rich, and democratic (WEIRD) countries, are used in an estimated 96% of psychology studies, even though they represent only 12% of the world’s population. As outliers in terms of visual perception, moral reasoning, and categorisation (among many other topics), WEIRD samples limit broad population validity in the social sciences.

Ecological validity

Ecological validity refers to whether you can reasonably generalise the findings of a study to other situations and settings in the ‘real world’.

Example: Low ecological validity
You want to test the hypothesis that driving reaction times become slower when people pay attention to others talking.

In a laboratory setting, you set up a simple computer-based task to measure reaction times. Participants are told to imagine themselves driving around a racetrack and to double-click the mouse whenever they see an orange cat on the screen. For one round, participants listen to a podcast. In the other round, they do not listen to anything.

After assessing the results, you find that reaction times are much slower when listening to the podcast. Can you conclude that driving reaction times are slower when people listen to others talking?

In the example above, it is difficult to generalise the findings to real-life driving conditions. A computer-based task using a mouse does not resemble real-life driving conditions with a steering wheel. Additionally, a static image of an orange cat may not represent common real-life hurdles when driving.

To improve ecological validity in a lab setting, you could use an immersive driving simulator with a steering wheel and foot pedal instead of a computer and mouse. This increases psychological realism by more closely mirroring the experience of driving in the real world.

Alternatively, for higher ecological validity, you could conduct the experiment using a real driving course.

Trade-off between external and internal validity

Internal validity is the extent to which you can be confident that the causal relationship established in your experiment cannot be explained by other factors.

There is an inherent trade-off between external and internal validity; the more applicable you make your study to a broader context, the less you can control extraneous factors in your study.

Example: Internal vs external validity
In the driving reaction times study, you are able to control the conditions of the experiment and ensure that there are no extraneous factors that could explain the outcome. Because the experiment has high internal validity, you can confidently conclude that listening to the podcast causes slower reaction times.

Moving the experiment to a real-life driving course significantly increases external validity at the expense of internal validity. That’s because you risk introducing extraneous and confounding factors (e.g., weather or visibility conditions) that affect the outcome.

Threats to external validity and how to counter them

To design a robust study, you need to recognise and counter threats to external validity.

Example: Research project
A researcher wants to test the hypothesis that people with clinical diagnoses of mental disorders can benefit from practising mindfulness daily in just two months’ time. They recruit people who have been diagnosed with depression for at least a year, are aged 20–29, and live locally.

Participants are given a pretest and a post-test measuring how often they experienced anxiety in the past week. During the study, all participants are given individual mindfulness training and asked to practise mindfulness daily for 15 minutes in the morning.

Since the levels of anxiety decreased between the pre- and post-test, the researcher concludes that all clinical populations can benefit from mindfulness.

Threats to external validity
| Threat | Meaning | Example |
| --- | --- | --- |
| Sampling bias | The sample is not representative of the population. | The sample includes only people with depression. They have characteristics (e.g., negative thought patterns) that may make them very different from other clinical populations, like people with personality disorders or schizophrenia. |
| History | An unrelated event influences the outcomes. | Right before the pretest, a natural disaster takes place in a neighbouring state. As a result, pretest anxiety scores are higher than they might be otherwise. |
| Experimenter effect | The characteristics or behaviours of the experimenter(s) unintentionally influence the outcomes. | The trainer of the mindfulness sessions unintentionally stresses the importance of this study for the research department’s funding. Participants work extra hard to reduce their anxiety levels during the study as a result. |
| Hawthorne effect | The tendency for participants to change their behaviours simply because they know they are being studied. | The participants actively avoid anxiety-inducing situations for the period of the study because they are conscious of their participation in the research. |
| Testing effect | The administration of a pretest or post-test affects the outcomes. | Because participants become familiar with the pretest format and questions, they are less anxious during the post-test and report less anxiety then. |
| Aptitude-treatment interaction | Interactions between characteristics of the group and individual variables together influence the dependent variable. | Interactions between certain characteristics of the participants with depression (e.g., negative thought patterns) and the mindfulness exercises (e.g., focus on the present) improve anxiety levels. The findings are not replicated with people with personality disorders or schizophrenia. |
| Situation effect | Factors like the setting, time of day, location, and researchers’ characteristics limit generalisability of the findings. | The study is repeated with one change: the participants practise mindfulness at night rather than in the morning. The outcomes do not show any improvement this time. |

How to counter threats to external validity

There are several ways to counter threats to external validity:

  • Replications counter almost all threats by enhancing generalisability to other settings, populations and conditions.
  • Field experiments counter testing and situation effects by using natural contexts.
  • Probability sampling counters sampling bias by making sure everyone in a population has a known, non-zero chance of being selected for a study sample (an equal chance, in the case of simple random sampling).
  • Recalibration or reprocessing also counters sampling bias by using algorithms to reweight factors (e.g., age) within study samples so that they better match the population.
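A simple form of reweighting is post-stratification: each participant receives a weight equal to their group's share of the population divided by that group's share of the sample. The Python sketch below is a minimal illustration under stated assumptions. The function names, the two age groups, the scores, and the 50/50 population shares are all hypothetical, and the approach only works when every population group appears in the sample at least once.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per participant: population share / sample share."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Mean of values where each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical sample that over-represents 18-20 year olds (80% of the
# sample) relative to an assumed population share of 50%.
age_groups = ["18-20"] * 8 + ["21+"] * 2
scores = [70] * 8 + [50] * 2
population_shares = {"18-20": 0.5, "21+": 0.5}

weights = poststratification_weights(age_groups, population_shares)
unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)
```

In this made-up case the unweighted mean is 66, while the weighted mean is 60, because the under-represented group is counted more heavily. Weighting can only correct for characteristics you have measured; it does not fix differences on unmeasured ones.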

Frequently asked questions about external validity

What is external validity?

The external validity of a study is the extent to which you can generalise your findings to different groups of people, situations, and measures.

What are the two types of external validity?

The two types of external validity are population validity (whether you can generalise to other groups of people) and ecological validity (whether you can generalise to other situations and settings).

What are threats to external validity?

There are seven threats to external validity: sampling bias, history, experimenter effect, Hawthorne effect, testing effect, aptitude-treatment interaction, and situation effect.

How does attrition threaten external validity?

Attrition bias can skew your sample so that your final sample differs significantly from your original sample. Your sample is biased because some groups from your population are underrepresented.

With a biased final sample, you may not be able to generalise your findings to the original population that you sampled from, so your external validity is compromised.
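A quick way to check for this is to compare how each group's share of the sample changes between enrolment and the end of the study. The Python sketch below is only an illustration: the function names, the two employment groups, and the dropout counts are assumptions invented for the example.

```python
from collections import Counter


def group_shares(groups):
    """Return each group's proportion of the given sample."""
    counts = Counter(groups)
    n = len(groups)
    return {g: c / n for g, c in counts.items()}


def share_shift(original, final):
    """Change in each group's share from the original to the final sample."""
    before = group_shares(original)
    after = group_shares(final)
    return {g: after.get(g, 0.0) - before[g] for g in before}


# Hypothetical study: 100 participants enrol, 60 complete it, and dropout
# is much higher in one group than in the other.
original = ["employed"] * 50 + ["unemployed"] * 50
final = ["employed"] * 45 + ["unemployed"] * 15

shift = share_shift(original, final)
```

Here the unemployed group falls from 50% of the original sample to 25% of the final sample, a shift of 25 percentage points. A shift of that size suggests the final sample no longer mirrors the population it was drawn from.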

Pritha Bhandari

Pritha has an academic background in English, psychology and cognitive neuroscience. As an interdisciplinary researcher, she enjoys writing articles explaining tricky research concepts for students and academics.
