Understanding P-Values: A Simple and Easy Guide With the Example of Tomato Growth

Dooinn KIm
4 min read · Aug 16, 2023


Photo by Josephine Baran on Unsplash

What is a P-value?

Imagine you’ve just planted two types of tomatoes in your garden, and you want to know if one type grows faster than the other. You could use statistics to test this, and that’s where the p-value comes in!

The p-value is a number between 0 and 1 that helps you judge whether a result could plausibly have happened by chance alone. In scientific terms, it’s a measure used to assess the strength of the evidence against a null hypothesis.

The Null Hypothesis

Before diving into p-values, we need to understand what a null hypothesis is. In our tomato example, the null hypothesis would be that both types of tomatoes grow at the same rate. There’s no difference between them.
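Written a little more formally, the same idea might be sketched as below. The symbols \mu_A and \mu_B stand for the average growth of each tomato type; they are introduced here for illustration and are not part of the original example. The alternative shown is two-sided, meaning it only claims the averages differ, without saying which is larger.

```latex
H_0:\ \mu_A = \mu_B \qquad \text{versus} \qquad H_1:\ \mu_A \neq \mu_B
```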

How Do We Calculate the P-value?

To calculate the p-value, we need to perform a statistical test, like a t-test. Here’s a step-by-step guide using our tomato example:

  1. Collect Data: Measure the growth of both types of tomatoes over time.
  2. Perform a Statistical Test: This will compare the growth rates to see if there’s a significant difference.
  3. Find the P-value: The test will give you the p-value, which tells you the probability that you’d see the observed difference (or more extreme) if the null hypothesis were true.
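To show what happens inside steps 2 and 3, here is a minimal sketch of a two-sample t-test worked out by hand. It assumes both groups have roughly the same variance (the classic pooled, or Student's, t-test). The function name `pooled_t_test` and the sample measurements are illustrative, not a definitive implementation.

```python
import numpy as np
from scipy import stats


def pooled_t_test(a, b):
    """Two-sided Student's t-test, assuming equal variances in both groups."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)

    # Pooled sample variance: a weighted average of the two sample variances
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)

    # Standard error of the difference between the two means
    std_err = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    # t-statistic: how many standard errors apart the two means are
    t_stat = (a.mean() - b.mean()) / std_err

    # Two-sided p-value from the t-distribution with n_a + n_b - 2 degrees of freedom
    df = n_a + n_b - 2
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, p_value


# Illustrative growth measurements in cm for two tomato types
t_stat, p_value = pooled_t_test([5, 6, 5, 4, 5], [7, 8, 7, 6, 7])
print(t_stat, p_value)
```

In practice you would usually call a library function such as `scipy.stats.ttest_ind` instead; with its default settings it makes the same equal-variance assumption.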

Interpreting the P-value

Now comes the fun part: interpreting the p-value!

  • If the p-value is less than 0.05: This is usually considered evidence that there’s something going on (i.e., one type of tomato is growing faster). You can reject the null hypothesis.
  • If the p-value is greater than or equal to 0.05: This means there’s not enough evidence to conclude that there’s a difference. You fail to reject the null hypothesis.
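The decision rule above can be sketched as a tiny helper. The function name `interpret_p_value` and the example p-values are hypothetical, and the 0.05 threshold is only the common convention, so it is left as a parameter (`alpha`) that you can change.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Apply the usual decision rule for a given significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Hypothetical p-values, just to show both outcomes
print(interpret_p_value(0.01))
print(interpret_p_value(0.30))
```

Note that the second outcome is worded as "fail to reject" rather than "accept": a large p-value means the data did not provide enough evidence, not that the two tomato types were shown to be identical.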

Let’s Visualize!

To make this more concrete, let’s create an imaginary scenario where we’ve measured the growth of two types of tomatoes and then perform a t-test to find the p-value.

Suppose we have the following data for two types of tomatoes:

  • Tomato Type A: Growth in cm after 10 days: 5, 6, 5, 4, 5
  • Tomato Type B: Growth in cm after 10 days: 7, 8, 7, 6, 7

Let’s calculate the p-value and visualize the data!

# Import the libraries we need
import numpy as np
from scipy.stats import ttest_ind
import matplotlib.pyplot as plt

# Growth in cm after 10 days for the two types of tomatoes
tomato_type_a = np.array([5, 6, 5, 4, 5])
tomato_type_b = np.array([7, 8, 7, 6, 7])

# Perform an independent two-sample t-test to get the p-value
t_stat, p_value = ttest_ind(tomato_type_a, tomato_type_b)

# Plot the mean growth of each type, with the standard deviation as error bars
plt.figure(figsize=(10, 6))
plt.bar(['Tomato Type A', 'Tomato Type B'], [tomato_type_a.mean(), tomato_type_b.mean()], yerr=[tomato_type_a.std(), tomato_type_b.std()])
plt.ylabel('Growth in cm after 10 days')
plt.title('Comparison of Growth Between Two Types of Tomatoes')
plt.grid(True, axis='y', linestyle='--')
plt.show()

# Print the p-value (a bare `p_value` line only displays in a notebook)
print(p_value)

# P-value Result: 0.0020773377112267804

Here’s the comparison between the two types of tomatoes:

As we can see, Tomato Type B seems to grow faster than Tomato Type A.

Now, let’s look at the p-value we calculated: 0.0021.

Since this value is less than 0.05, we can conclude that there’s strong evidence that Tomato Type B does indeed grow faster than Tomato Type A. We can reject our null hypothesis that both types of tomatoes grow at the same rate.
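One way to build intuition for what "just by chance" means is a permutation test, which is a different technique from the t-test used in this post. The sketch below assumes that, if the null hypothesis were true, the "Type A" and "Type B" labels would be interchangeable. It tries every possible way of relabeling the ten measurements and counts how often the gap between the group means is at least as large as the one we observed. The variable names are illustrative.

```python
from itertools import combinations

import numpy as np

tomato_type_a = np.array([5, 6, 5, 4, 5])
tomato_type_b = np.array([7, 8, 7, 6, 7])

# Pool all measurements and record the observed gap between the group means
pooled = np.concatenate([tomato_type_a, tomato_type_b])
observed = abs(tomato_type_a.mean() - tomato_type_b.mean())

n_a = len(tomato_type_a)
extreme = 0
total = 0

# Enumerate every way to pick which measurements get the "Type A" label
for group_a_idx in combinations(range(len(pooled)), n_a):
    mask = np.zeros(len(pooled), dtype=bool)
    mask[list(group_a_idx)] = True
    diff = abs(pooled[mask].mean() - pooled[~mask].mean())
    # Small tolerance guards against floating-point rounding in the comparison
    if diff >= observed - 1e-12:
        extreme += 1
    total += 1

# Share of relabelings that look at least as extreme as the real data
permutation_p = extreme / total
print(f"{extreme} of {total} relabelings are at least as extreme: p = {permutation_p:.4f}")
```

The number this produces will not match the t-test's p-value exactly, because the two methods rest on different assumptions, but it answers the same question: how often would chance alone produce a difference this large?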

Summary

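In simple terms, the p-value measures how surprising your data would be if the null hypothesis were true. It does not tell you the probability that the null hypothesis itself is true; it tells you how easily chance alone could have produced a result like yours.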

  • A small p-value (usually less than 0.05) suggests that something interesting is going on, and you can reject the null hypothesis.
  • A large p-value (usually greater than or equal to 0.05) means there’s not enough evidence to draw a conclusion, so you fail to reject the null hypothesis. This is not the same as proving the null hypothesis true.

It’s like planting two types of flowers and watching them grow. The p-value helps you figure out if one is truly blooming faster or if it’s just your imagination!

Remember, while the p-value is a powerful tool, it’s not the only thing to consider when making conclusions. It’s always good to look at the bigger picture, like the context of the study, the sample size, and other relevant factors.

I hope this friendly explanation helps you understand the concept of p-values. Happy gardening, and happy data analyzing! Feel free to ask if you have any more questions.
