What is the median?
The median is the middle value of a data set once its values are sorted. It splits the data into two halves: at least half of the values are less than or equal to the median, and at least half are greater than or equal to it.
Unlike the mean (average), the median is resistant to outliers, which often makes it a better measure of central tendency for skewed distributions.
How to find the median
For an odd number of values
- Sort the numbers from smallest to largest
- The median is the middle number
Example: For 3, 7, 9, 12, 15
The middle value is 9.
For an even number of values
- Sort the numbers from smallest to largest
- Find the two middle numbers
- Calculate their average
Example: For 2, 5, 8, 11
The two middle values are 5 and 8.
Median = (5 + 8) ÷ 2 = 6.5
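The two procedures above can be sketched as a single Python function. This is a minimal illustration rather than a reference implementation; the name `median_of` is an assumption made for this example.

```python
def median_of(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)          # step 1: sort smallest to largest
    if not ordered:
        raise ValueError("median_of() requires at least one value")
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([3, 7, 9, 12, 15]))  # 9
print(median_of([2, 5, 8, 11]))      # 6.5
```

Sorting inside the function means the input does not need to be ordered in advance.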
The formula
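For a sorted list of n values $x_1 \le x_2 \le \dots \le x_n$:
Odd n:
$$\text{Median} = x_{\frac{n+1}{2}}$$
Even n:
$$\text{Median} = \frac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2}$$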
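The formula counts positions from 1, while Python lists count from 0, which is an easy source of off-by-one errors. The sketch below keeps the formula's 1-based indexing through a small helper and cross-checks the result against the standard library's `statistics.median`. The names `median_by_formula` and `x` are assumptions chosen for this illustration.

```python
import statistics


def median_by_formula(sorted_values):
    """Apply the odd/even formula to an already sorted, non-empty list."""
    n = len(sorted_values)

    def x(i):
        # x(i) is the i-th smallest value, counting from 1 as in the formula
        return sorted_values[i - 1]

    if n % 2 == 1:
        return x((n + 1) // 2)
    return (x(n // 2) + x(n // 2 + 1)) / 2


# Cross-check against the standard library on the two earlier examples.
for data in ([3, 7, 9, 12, 15], [2, 5, 8, 11]):
    assert median_by_formula(data) == statistics.median(data)
```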
Median vs. mean
| Measure | Median | Mean |
|---|---|---|
| Calculation | Middle value | Sum ÷ count |
| Affected by outliers | No | Yes |
| Best for | Skewed data | Symmetric data |
| Example use | Income data | Test scores |
When to use median
Use the median when:
- Your data has significant outliers
- The distribution is skewed (not symmetric)
- You want a typical value that isn't pulled by extremes
Example of outlier effect
Consider salaries: $40k, $45k, $50k, $55k, $500k
- Mean: $138,000 (pulled up by the outlier)
- Median: $50,000 (represents the typical salary better)
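The salary figures above can be checked with Python's standard `statistics` module. The second list, `typical`, is a hypothetical variation added here to show what happens when the outlier is replaced by an ordinary salary.

```python
import statistics

salaries = [40_000, 45_000, 50_000, 55_000, 500_000]

print(statistics.mean(salaries))    # 138000
print(statistics.median(salaries))  # 50000

# Hypothetical variation: replace the outlier with a more typical value.
# The mean drops sharply, but the median stays where it was.
typical = [40_000, 45_000, 50_000, 55_000, 60_000]
print(statistics.mean(typical))     # 50000
print(statistics.median(typical))   # 50000
```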
Related statistics
Mode
The most frequently occurring value. A dataset can have:
- No mode (all values appear once)
- One mode (unimodal)
- Multiple modes (bimodal, multimodal)
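The three cases above can be told apart by counting occurrences. The sketch below is one possible approach; the function name `modes` is an assumption, and it follows this article's convention that a data set with no repeated value has no mode. The standard library's `statistics.multimode` makes a different choice and returns every value in that situation.

```python
from collections import Counter


def modes(values):
    """Return the list of modes, or [] when no value appears more than once."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # convention used here: no repeated value means no mode
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 3, 4]))     # []      (no mode)
print(modes([1, 2, 2, 3]))     # [2]     (unimodal)
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (bimodal)
```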
Range
The difference between the maximum and minimum values:
$$\text{Range} = \text{Max} - \text{Min}$$
Quartiles
Quartiles divide sorted data into four parts containing equal numbers of values:
- Q1 (25th percentile): Median of the lower half of the data
- Q2 (50th percentile): The median
- Q3 (75th percentile): Median of the upper half of the data
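The median-of-halves description above can be sketched as follows. The function name `quartiles` is an assumption for this example, and so is the handling of an odd count: here the middle value is left out of both halves. Other conventions exist, and library routines such as `statistics.quantiles` interpolate differently, so their results may not match these.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: when the count is odd, the middle value is excluded
    from both the lower and the upper half.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))


print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # (2.5, 4.5, 6.5)
print(quartiles([3, 7, 9, 12, 15]))         # (5.0, 9, 13.5)
```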
Practical applications
Real estate
Home prices are often reported as medians because a few luxury homes would dramatically skew the mean upward.
Income statistics
Median household income is preferred over mean income because a small number of very high earners pulls the mean above what a typical household earns.
Performance metrics
Website load times often use median values because occasional timeouts would distort averages.
Scientific research
Medians are used when data doesn't follow a normal distribution or when outliers are expected.
Properties of the median
- Minimizes absolute deviations: The sum of |x - c| over all data points is smallest when c is the median
- Positional measure: Depends only on which value(s) fall in the middle of the sorted order, not on the magnitudes of the other values
- Unique for continuous data: For a continuous distribution with a strictly increasing CDF, exactly one value has half of the probability on each side
- Scale equivariant: Multiplying all values by a constant multiplies the median by that constant
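Some of the properties listed above can be checked numerically. The snippet below is a spot check on one small, made-up data set, not a proof; the name `total_abs_dev` and the grid of candidate centers are assumptions for this illustration.

```python
import statistics

data = [2, 5, 8, 11, 40]          # made-up example values
m = statistics.median(data)       # 8


def total_abs_dev(center):
    """Sum of absolute deviations of `data` from `center`."""
    return sum(abs(x - center) for x in data)


# Minimizes absolute deviations: no candidate on the grid does better.
candidates = [c / 2 for c in range(0, 101)]   # 0.0, 0.5, ..., 50.0
assert all(total_abs_dev(m) <= total_abs_dev(c) for c in candidates)

# Positional: changing the largest value (while it stays the largest)
# does not move the median.
assert statistics.median([2, 5, 8, 11, 4000]) == m

# Scaling: multiplying every value by 3 multiplies the median by 3.
assert statistics.median([3 * x for x in data]) == 3 * m
```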
Limitations
- Uses only the order of the data: the magnitudes of values away from the middle are ignored
- Less statistically efficient than the mean for normally distributed data (its estimates vary more from sample to sample)
- Can be affected by how ties are handled in discrete data
- Lacks convenient algebraic properties: in general, the median of sums ≠ the sum of medians
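The last limitation is easiest to see with a small counterexample. The lists `a` and `b` below are made-up values chosen so that the element-wise sums have a different median than the two medians added together; the mean is shown alongside for contrast.

```python
import statistics

a = [1, 2, 9]
b = [9, 2, 1]
sums = [x + y for x, y in zip(a, b)]   # [10, 4, 10]

print(statistics.median(a) + statistics.median(b))  # 4
print(statistics.median(sums))                      # 10

# The mean, by contrast, is additive.
print(statistics.mean(a) + statistics.mean(b))      # 8
print(statistics.mean(sums))                        # 8
```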