Non-Parametric Test with Example
Non-parametric statistics can be thought of as a set of statistical methods that try to minimize two things:
- The number of assumptions.
- The strength of these assumptions.
For example, parametric tests like the t-test rely on assumptions such as:
- The data are normally distributed, or
- There is enough data for the Central Limit Theorem to make the sample mean approximately normal.
The results of parametric tests are accurate only if these assumptions are met. Non-parametric tests, on the other hand, have relaxed assumptions, making them more robust in certain scenarios.
Basis of Null Hypothesis Significance Testing
Before diving into non-parametric tests, it is important to understand the components of the hypothesis-testing framework:
- Test assumptions.
- Parameters of interest.
- Null hypothesis.
- Test statistic.
- Null distribution.
Example: One-Sample Non-Parametric Test
Imagine we have data on the lying time of cows in a day. The typical lying time is around 12 hours. For 10 cows, the data is:
[12, 11, 8, 7, 10, 12, 13, 5, 12, 16]
Some cows have lying times as low as 5 hours, which look like outliers. However, we cannot remove them, as they are important for our analysis. We want to test the null hypothesis:
H₀: The typical lying time is around 12 hours.
The traditional t-test is not suitable here because:
- The data do not appear to be normally distributed.
- The sample size is small, so we cannot rely on the Central Limit Theorem.
Wilcoxon’s Signed Rank Test
The Wilcoxon Signed Rank Test is a non-parametric alternative to the one-sample t-test. It tests whether the center of a distribution differs from a specified value.
Test Assumptions
- The sample of size \(n\) is drawn independently from a distribution with cumulative distribution function (CDF) \(F\): \(X_1, \dots, X_n \sim F\)
- The distribution is continuous and symmetric.
- The parameter of interest is the center of the distribution, \(\theta\).
Each observation can be written as: \(X_i = \theta + \epsilon_i\) where \(\theta\) is the center of the distribution and \(\epsilon_i\) represents noise that is symmetric about 0. The noise does not have to be normal; a heavy-tailed symmetric distribution such as the t-distribution also fits this model.
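To make the location model concrete, here is a minimal R sketch that simulates observations of the form \(X_i = \theta + \epsilon_i\). The t-distributed noise with 3 degrees of freedom, the seed, the sample size, and the variable names are illustrative assumptions; none of them come from the cow data.

```r
# Simulate X_i = theta + eps_i with symmetric, heavy-tailed noise.
# theta, df = 3, n, and the seed are illustrative assumptions.
set.seed(1)
theta <- 12
n <- 1e5
eps <- rt(n, df = 3)     # symmetric about 0, heavier tails than the normal
x_sim <- theta + eps

# With symmetric noise, the sample median should sit near theta,
# and about half of the differences x_sim - theta should be positive.
median(x_sim)
mean(x_sim - theta > 0)
```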
Null Hypothesis
The null hypothesis is: \(H_0: \theta = \theta_0\) This is analogous to the t-test, where we test: \(H_0: \mu = \mu_0\)
Test Statistic
The test statistic is calculated as: \(T = \sum_{i=1}^n \text{sign}(Y_i) \cdot R(|Y_i|)\) where:
- \(Y_i = X_i - \theta_0\)
- \(R(|Y_i|)\) is the rank of \(|Y_i|\) among \(|Y_1|, \dots, |Y_n|\).
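The formula can be traced by hand on the cow data. The sketch below uses base R only. Dropping zero differences and giving tied absolute values their average rank are assumed conventions (they are also what `wilcox.test` does), and the name `t_stat` is ours.

```r
x <- c(12, 11, 8, 7, 10, 12, 13, 5, 12, 16)  # lying times from the example
theta0 <- 12                                  # hypothesized center

y <- x - theta0       # differences Y_i
y <- y[y != 0]        # assumption: drop zero differences (three cows at exactly 12)
r <- rank(abs(y))     # ranks of |Y_i|; tied values get the average rank

t_stat <- sum(sign(y) * r)   # signed-rank statistic T
t_stat                       # -16
```

The negative value reflects that five of the seven non-zero differences lie below 12 hours.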
Under the null hypothesis:
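- \(E[Y_i] = 0\)
- \(E[X_i] = \theta_0\)
(Both hold whenever the mean exists, because the distribution is symmetric about \(\theta_0\).)
If the null hypothesis is true, positive and negative differences tend to balance, so \(T\) should be close to 0. If not, \(T\) will take a large positive or negative value.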
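How far from 0 is "far"? Under the null hypothesis and the symmetry assumption, each non-zero difference is equally likely to be positive or negative, so the null distribution of \(T\) can be enumerated by trying every sign pattern on the observed ranks. This is a minimal sketch in base R, assuming zero differences are dropped and ties receive average ranks; the variable names are illustrative.

```r
x <- c(12, 11, 8, 7, 10, 12, 13, 5, 12, 16)
theta0 <- 12

y <- x - theta0
y <- y[y != 0]                  # assumption: drop zero differences
r <- rank(abs(y))               # average ranks for ties
t_obs <- sum(sign(y) * r)       # observed signed-rank statistic

# Every possible assignment of +1 / -1 to the non-zero differences
signs <- as.matrix(expand.grid(rep(list(c(-1, 1)), length(y))))
t_null <- as.vector(signs %*% r)    # T for each of the 2^7 = 128 sign patterns

# Two-sided p-value: share of sign patterns at least as extreme as observed
p_two_sided <- mean(abs(t_null) >= abs(t_obs))
p_two_sided                         # 30 / 128, about 0.23
```

This enumeration conditions on the observed ranks, so with ties and zeros present it need not match the normal-approximation p-value reported by `wilcox.test`.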
Implementation in R
We can perform this test in R using the wilcox.test function:
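# Lying time (hours per day) for 10 cows
data <- c(12, 11, 8, 7, 10, 12, 13, 5, 12, 16)
# Wilcoxon Signed Rank Test of H0: center = 12 hours.
# Three observations equal 12 (zero differences) and some absolute differences are tied,
# so an exact p-value cannot be computed; exact = FALSE requests the normal approximation.
wilcox.test(data, mu = 12, exact = FALSE)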
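The returned object can also be stored and inspected. Note that `wilcox.test` reports the statistic `V`, the sum of the ranks of the positive differences, rather than the signed sum \(T\) defined earlier; with \(m\) non-zero differences the two are related by \(T = 2V - m(m+1)/2\). A minimal sketch, where `x`, `res`, `v`, `m`, and `t_from_v` are names we introduce:

```r
x <- c(12, 11, 8, 7, 10, 12, 13, 5, 12, 16)
res <- wilcox.test(x, mu = 12, exact = FALSE)

v <- unname(res$statistic)            # V: sum of ranks of the positive differences
m <- sum(x != 12)                     # number of non-zero differences
t_from_v <- 2 * v - m * (m + 1) / 2   # convert V to the signed-rank sum T

res$p.value                           # p-value from the normal approximation
```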

