Nonparametric vs. Parametric Analysis
Nonparametric tests do not require that your data be normally distributed, and under certain circumstances they even offer advantages over their parametric counterparts. To help you choose the approach that best suits your statistical hypothesis test, I briefly outline a guideline for choosing between a
Parametric analysis to assess population means.
Nonparametric analysis to examine population medians.
Below, you can find a list of parametric tests and their nonparametric counterparts:
Parametric Tests - Advantages
Advantage 1 - Reliable results: even with distributions that are skewed and nonnormal
Although this is often overlooked, parametric analyses can yield reliable results even when your (continuous) data is not normally distributed. Thanks to the Central Limit Theorem, practitioners only need to make sure that their sample size satisfies the requirements listed in the table below for each analysis. In fact, most of these sample size requirements were identified through simulation studies.
Advantage 2 - Reliable results: even when groups have different variances
Nonparametric tests do not require data to be normally distributed, but they come with a condition of their own that may be hard to satisfy: the groups in a nonparametric analysis typically must all have approximately the same dispersion. Nonparametric analyses may therefore fail to provide accurate results when groups have distinctly different variances. Parametric analyses, in contrast, can accommodate unequal variances: Welch's versions of the 2-sample t-test and One-Way ANOVA are designed precisely for groups whose variances differ.
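As a minimal sketch of this, SciPy's `ttest_ind` performs Welch's t-test when `equal_var=False` (the two groups below are simulated, with the same mean but very different spreads):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Two simulated groups with the same mean but very different variances.
group_a = rng.normal(loc=10.0, scale=1.0, size=50)
group_b = rng.normal(loc=10.0, scale=5.0, size=50)

# Welch's t-test (equal_var=False) does not assume equal variances,
# so it remains valid even when the group spreads differ substantially.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"Welch t = {t_stat:.3f}, p = {p_value:.3f}")
```

Because the true means are equal here, the test should (usually) not reject the null hypothesis despite the unequal variances.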
Advantage 3 - Greater statistical power
Finally, parametric tests usually have more statistical power than their nonparametric counterparts. In other words: if an effect truly exists, a parametric test is more likely to detect it.
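This power advantage can be illustrated with a small simulation (a sketch with made-up parameters, assuming normally distributed data where the t-test's assumptions hold):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, n, shift = 500, 30, 0.5  # made-up simulation parameters

t_rejections = mw_rejections = 0
for _ in range(n_sims):
    x = rng.normal(0.0, 1.0, n)
    y = rng.normal(shift, 1.0, n)  # true effect: means differ by `shift`
    # Count how often each test detects the effect at the 5% level.
    if stats.ttest_ind(x, y).pvalue < 0.05:
        t_rejections += 1
    if stats.mannwhitneyu(x, y).pvalue < 0.05:
        mw_rejections += 1

print(f"t-test power:       {t_rejections / n_sims:.2f}")
print(f"Mann-Whitney power: {mw_rejections / n_sims:.2f}")
```

For normal data the two rejection rates are close, with the t-test typically slightly ahead; the gap in favor of the parametric test is modest here precisely because the Mann-Whitney test is itself quite efficient.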
Nonparametric Tests - Advantages
Advantage 1 - Assessment of the median (instead of the mean) is sometimes more beneficial
For some datasets, it is more informative to examine the median rather than the mean, and nonparametric tests are well suited for this. The sample mean is not always the best measure of central tendency: even when a valid parametric analysis is possible, it is not necessarily the most reasonable way to proceed.
To illustrate, consider a right-skewed (a.k.a. positively skewed) distribution, such as salaries. Most salaries cluster around the median, the point in the data where half of the observations lie below and half above. Because salaries are right-skewed, however, the distribution has a long right tail: a few people earn substantially more than the rest. This skewness pulls the mean away from the median, toward the tail.
Now suppose that we intend to compare the distributions of salaries in two subsamples. Both subsamples may have approximately the same median while their means differ. Moreover, when new high-income observations enter one subsample, its mean increases substantially even though the salaries of all other individuals in the dataset are unchanged; the observations remain most densely clustered around the median.
Under such conditions, parametric and nonparametric analyses can lead to different conclusions, and the results diverge more the more skewed the distribution is. For a sufficiently large sample size, the difference between the subsample means will be statistically significant, whereas the difference between the subsample medians may not be.
In short, when distributions are skewed, changes in the tail affect the mean far more than the median. A parametric test will detect this change in means, while a nonparametric analysis will show that the relatively unaffected median has not changed significantly. To choose between a parametric and a nonparametric analysis, practitioners therefore need to decide which measure, mean or median, is more reasonable to assess.
Two right-skewed distributions with the same median but different means.
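This situation can be reproduced numerically. The sketch below draws two lognormal "salary" samples that share the same underlying median but have tails of different heaviness, so their means differ (all numbers are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two right-skewed "salary" distributions: lognormals with the same
# underlying median (exp(mu) = 50,000) but different tail heaviness
# (sigma), which makes the means exp(mu + sigma^2 / 2) differ.
salaries_a = rng.lognormal(mean=np.log(50_000), sigma=0.3, size=100_000)
salaries_b = rng.lognormal(mean=np.log(50_000), sigma=0.9, size=100_000)

print(f"Medians: {np.median(salaries_a):,.0f} vs {np.median(salaries_b):,.0f}")
print(f"Means:   {np.mean(salaries_a):,.0f} vs {np.mean(salaries_b):,.0f}")
```

The sample medians land close together near 50,000, while the mean of the heavier-tailed sample is pulled noticeably higher, exactly the pattern described above.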
Advantage 2 - Analyze ordinal data, ranked data, and outliers
Parametric tests can only analyze continuous data, and outliers can limit their statistical inference. Nonparametric tests, in contrast, can also analyze ordinal and ranked data, and they are less affected by outliers. Removing outliers can be justified in certain cases (e.g. when they represent unusual conditions), but outliers can also be a genuine part of the distribution, especially for highly dispersed data. Dropping such observations may simplify statistical inference, but it can also alter the dataset substantially. When using nonparametric tests, keep in mind that different tests cope with outliers in different ways.
Advantage 3 - Validity: even when sample size is small and data potentially non-normal
With small sample sizes, normality tests have too little power to tell you reliably whether your data is normally distributed, so a parametric analysis would rest on an assumption you cannot verify. Nonparametric tests remain valid in this situation because they do not require normality. Bear in mind, however, that nonparametric tests tend to have lower power, and small sample sizes amplify this problem.
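The weakness of normality tests in small samples can be demonstrated with a quick simulation (a sketch using the Shapiro-Wilk test on clearly non-normal, exponentially distributed data; sample size and simulation count are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# The data is clearly non-normal (exponential), but with only 10
# observations the Shapiro-Wilk normality test often fails to flag it.
n_sims = 500
detected = 0
for _ in range(n_sims):
    small_sample = rng.exponential(scale=1.0, size=10)
    if stats.shapiro(small_sample).pvalue < 0.05:
        detected += 1

print(f"Non-normality detected in {detected / n_sims:.0%} of small samples")
```

In well under 100% of runs does the test flag the non-normality, so with n = 10 you simply cannot count on a normality test to validate a parametric analysis.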
Comparison between Parametric and Nonparametric Tests
Beyond asking yourself what distribution your data follows, there are other checkpoints that need to be met when deciding between a parametric and a nonparametric analysis:
Parametric analyses can analyze non-normal distributions for many datasets.
Nonparametric analyses come with strong assumptions of their own (which can sometimes be harder to fulfill).
The most appropriate choice often depends on whether the mean or median is a better measure of central tendency for the distribution of your data.
If the mean is a better measure and you have a sufficiently large sample size, parametric tests usually are the more powerful choice.
If the median is a better measure, consider a nonparametric test regardless of your sample size.
Last but not least, a very small sample size may push practitioners toward a nonparametric test, since nonparametric tests usually require smaller sample sizes. Keep in mind, however, that they also have less power, so with a small sample the chance of detecting a true effect is lower still.
Nonparametric tests: cheat sheet
Mann Whitney U Test (Wilcoxon Rank Sum Test): Compare a continuous outcome in two independent samples
Null Hypothesis H0: The two populations are equal
Test statistic U: the smaller of U_1 and U_2, where U_j = n_1 n_2 + n_j(n_j + 1)/2 − R_j, with n_1, n_2 the two sample sizes and R_j = sum of the ranks in the jth sample
Decision Rule: Reject H0 if U ≤ critical value from table
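A minimal usage sketch with SciPy (the two samples are hypothetical):

```python
from scipy import stats

# Hypothetical example: compare a continuous outcome in two
# independent samples with the Mann-Whitney U test.
placebo = [7, 5, 6, 4, 12, 9, 8]
treatment = [3, 6, 4, 2, 1, 5, 7]

result = stats.mannwhitneyu(placebo, treatment, alternative="two-sided")
print(f"U = {result.statistic:.1f}, p = {result.pvalue:.3f}")
```

Note that `scipy.stats.mannwhitneyu` reports the U statistic of the first sample, which is not necessarily the smaller of U_1 and U_2; the p-value is unaffected by this convention.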
Sign Test: Compare a continuous outcome in two matched or paired samples
Null Hypothesis H0: Median difference is zero
Test Statistic: The smaller of the number of positive signs and the number of negative signs
Decision Rule: Reject H0 if this test statistic ≤ critical value from table
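SciPy has no dedicated sign test function, but the test reduces to a binomial test on the signs of the paired differences; a sketch with hypothetical paired data:

```python
from scipy import stats

# Hypothetical paired data: an outcome measured before and after
# some intervention on the same subjects.
before = [140, 135, 150, 145, 160, 155, 148, 152]
after = [132, 136, 141, 138, 155, 149, 140, 151]

# Sign test: keep the nonzero paired differences, count the positive
# ones, and compare against a Binomial(n, 0.5) distribution.
diffs = [b - a for b, a in zip(before, after) if b != a]
n_pos = sum(d > 0 for d in diffs)
result = stats.binomtest(n_pos, n=len(diffs), p=0.5)
print(f"positive signs = {n_pos}/{len(diffs)}, p = {result.pvalue:.3f}")
```

Under H0 (median difference zero), positive and negative signs are equally likely, which is exactly what the binomial test with p = 0.5 checks.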
Wilcoxon Signed Rank Test: Compare a continuous outcome in two matched or paired samples
Null Hypothesis H0: Median difference is zero
Test Statistic: The test statistic is W, defined as the smaller of W+ and W- which are the sums of the positive and negative ranks of the difference scores, respectively
Decision Rule: Reject H0 if W ≤ critical value from table
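A sketch with the same kind of hypothetical paired data, this time using the Wilcoxon signed rank test, which exploits the magnitudes of the differences rather than just their signs:

```python
from scipy import stats

# Hypothetical paired data: an outcome measured before and after
# some intervention on the same subjects.
before = [140, 135, 150, 145, 160, 155, 148, 152]
after = [132, 136, 141, 138, 155, 149, 140, 151]

# For the (default) two-sided test, SciPy reports W as the smaller
# of the positive and negative rank sums of the difference scores.
result = stats.wilcoxon(before, after)
print(f"W = {result.statistic:.1f}, p = {result.pvalue:.3f}")
```

Compared with the sign test on the same data, the signed rank test also uses how large each difference is, which generally gives it more power.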
Kruskal Wallis Test: Compare a continuous outcome in more than two independent samples
Null Hypothesis H0: k population medians are equal
Test Statistic: The test statistic is H = 12/(N(N + 1)) × Σ (R_j²/n_j) − 3(N + 1), where the sum runs over j = 1, …, k
where k = the number of comparison groups, N = the total sample size, n_j = sample size in the jth group and R_j = sum of the ranks in the jth group
Decision Rule: Reject H0 if H ≥ critical value from table
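A usage sketch with three hypothetical, well-separated groups:

```python
from scipy import stats

# Hypothetical example: compare a continuous outcome across three
# independent groups with the Kruskal-Wallis test.
low = [2.1, 3.4, 2.8, 3.0, 2.5]
medium = [3.6, 4.1, 3.9, 4.4, 3.8]
high = [5.0, 5.6, 4.9, 5.3, 5.8]

h_stat, p_value = stats.kruskal(low, medium, high)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

Because the three groups do not overlap at all here, their rank sums are as far apart as possible and H comes out large, so the test rejects the hypothesis of equal population medians.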
! Important note !
Keep in mind that nonparametric tests are subject to the same types of errors as parametric tests. A Type I error occurs when a test incorrectly rejects a true null hypothesis. A Type II error occurs when a test fails to reject a false H0. Power is the probability that a test correctly rejects a false H0. Nonparametric tests can suffer from low power, mainly due to small sample sizes. It is therefore important to consider the possibility of a Type II error when a nonparametric test fails to reject H0: there may be a true effect or difference that the test is simply underpowered to detect. For more details, see Conover (1998) and Siegel and Castellan (1988).