Interpret Data in Context | Lesson 2 of 2

Outliers: Effect and the Keep-or-Remove Choice

In this lesson:

  • See how an outlier distorts a conclusion
  • Decide whether to keep or remove an outlier
Grade 9 Statistics | HSS.ID.A.3

Learning Objectives for This Unit

By the end of this unit, you should be able to:

  1. Describe shape, center, and spread in context
  2. Name the direction of skew correctly
  3. Interpret two-group differences in context
  4. Explain how an outlier affects each statistic
  5. Decide whether to keep or remove an outlier

Should You Take This Job?

A job posting brags: the average employee earns $90,000.

  • Sounds great — should you take the job on that number?
  • Be suspicious of any "average" without the data's shape

That $90k figure can be true and still mislead. Let's see why.


The Salary Data Behind the Average

[Figure: salary dot plot. Nine points cluster between $40k and $60k; one lone point at $500k is labeled "owner". Mean and median are marked.]

  • Nine employees earn $40k–$60k (a tight cluster)
  • One owner earns $500k (a lone outlier)

Predict the mean and the median before advancing.


Mean Inflated, but Median Stays Honest

For the salary data:

  • Mean $90k — dragged up by the owner
  • Median $50k — where the nine employees sit

Which describes a typical employee? The median.

The $90k average reflects the owner, not a typical worker.
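To check the gap between the two centers yourself, here is a minimal Python sketch. The ten salaries are assumed values chosen to match the dot plot's description (nine between $40k and $60k, one at $500k); they are not the lesson's actual data, so the mean lands near, not exactly on, the posting's $90k.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars (assumed, not the lesson's data):
# nine employees between 40 and 60, plus one owner at 500.
salaries = [40, 42, 45, 48, 50, 50, 52, 55, 60, 500]

salary_mean = mean(salaries)      # pulled upward by the single 500
salary_median = median(salaries)  # middle of the sorted list

print(f"mean:   ${salary_mean:.1f}k")
print(f"median: ${salary_median:.1f}k")
```

With these assumed values the mean comes out above $90k while the median is $50k, the same pattern the slide describes.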


An Outlier Doesn't Move All Statistics

Recompute all four for the salary data:

  • Mean and SD — jump dramatically (non-resistant)
  • Median and IQR — barely move (resistant)

"The outlier messes up the data" is too crude — it wrecks only mean and SD.
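One way to see the resistant versus non-resistant split is to compute all four statistics twice, once with the owner and once without. The sketch below again uses assumed salary values, and the helper name `summarize` is hypothetical. Quartile conventions differ between textbooks and software, so the IQR here may not match a hand calculation exactly.

```python
from statistics import mean, median, stdev, quantiles

def summarize(values):
    """Return the four summary statistics discussed on this slide."""
    # quantiles() uses Python's default quartile method; other conventions exist.
    q1, _, q3 = quantiles(values, n=4)
    return {
        "mean": mean(values),
        "sd": stdev(values),      # sample standard deviation
        "median": median(values),
        "iqr": q3 - q1,
    }

# Hypothetical salaries in thousands of dollars (assumed values).
workers = [40, 42, 45, 48, 50, 50, 52, 55, 60]
without_owner = summarize(workers)
with_owner = summarize(workers + [500])

for name in ("mean", "sd", "median", "iqr"):
    print(f"{name:>6}: {without_owner[name]:7.1f} -> {with_owner[name]:7.1f}")
```

Reading the two columns side by side shows which statistics jump and which barely move when the owner is added.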


The Standard Deviation Misleads Too

The owner's $500k also inflates the SD.

  • The owner's salary sits far from the mean, and every other salary now sits below the inflated mean → SD balloons
  • A spread claim based on SD would also mislead

Among the nine workers, salaries are actually quite consistent.
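A rough way to see why the SD balloons is to ask how much of the total squared distance from the mean comes from the owner alone. This sketch assumes the same hypothetical salaries as before; the variable names are illustrative only.

```python
from statistics import mean

# Hypothetical salaries in thousands of dollars (assumed values); owner is last.
salaries = [40, 42, 45, 48, 50, 50, 52, 55, 60, 500]

center = mean(salaries)
# The SD is built from squared gaps between each value and the mean.
squared_gaps = [(s - center) ** 2 for s in salaries]

owner_share = squared_gaps[-1] / sum(squared_gaps)
print(f"owner's share of the total squared gap: {owner_share:.0%}")
```

Under these assumptions a single value supplies most of the squared gap, which is why the SD says more about the owner than about how consistent the nine workers are.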


Misleading Versus Honest: Two Conclusions

Same salary data, two stories:

  • Misleading (mean): "The average salary is $90,000."
  • Honest (median): "A typical employee earns about $50,000."

The outlier's danger is the false story a non-resistant statistic tells.


Your Turn: Correct the Interpretation

Clinic waits: most 10–20 minutes, one emergency at 180 minutes. A report says "average wait is 35 minutes."

Explain why that misleads, then write the corrected interpretation.

Use a resistant statistic, then advance.

Answer: The 180-minute outlier inflated the mean. A typical patient waits about 15 minutes.
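To check the answer numerically, here is a small sketch with made-up wait times. The eight values are assumptions picked so that the typical waits fall in the 10 to 20 minute range and the mean works out to the reported 35 minutes; the real clinic data is not given.

```python
from statistics import mean, median

# Hypothetical wait times in minutes (assumed values):
# seven typical waits plus one 180-minute emergency.
waits = [10, 12, 13, 14, 15, 17, 19, 180]

wait_mean = mean(waits)      # 280 / 8 = 35
wait_median = median(waits)  # halfway between 14 and 15

print(f"mean wait:   {wait_mean} minutes")
print(f"median wait: {wait_median} minutes")
```

The report's 35 minutes is an accurate mean, yet no patient in this assumed list waited anywhere near 35 minutes.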


So Should You Just Delete It?

An outlier distorts the story — so delete it?

  • That instinct is wrong — and avoiding it is the key judgment
  • Some outliers are genuine, valuable information

Don't delete first. Investigate first — find out where it came from.


Outliers Are Not Automatically Errors

An outlier is just an unusual value — and unusual values come in two kinds:

  • A genuine error (typo, broken sensor)
  • A real, rare event that actually happened

You can't tell which without investigating. Erasing a real value falsifies data.


Two Cases, Two Different Calls

[Figure: two-panel contrast. Left panel: a 7-inch adult height, labeled "error". Right panel: a 108-degree summer day, labeled "real".]

  • 7-inch adult height → almost certainly a typo → error: correct or remove it
  • 108° summer day → a real, rare heat wave → keep it

Same flag, opposite decisions — because the source differs.


The Four-Step Outlier Decision Protocol

When you find an outlier:

  1. Investigate — where did the value come from?
  2. Decide — error or legitimate rare value?
  3. Document — record the choice and why
  4. Report — state its effect on your conclusion

This replaces impulsive deletion with a defensible, transparent choice.
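The four steps can be sketched as a small record plus a comparison. This is only one possible way to write it down, not a standard tool. The names `OutlierDecision` and `report_effect` are hypothetical, and the heights are assumed values built around the 7-inch example.

```python
from dataclasses import dataclass
from statistics import mean, median

@dataclass
class OutlierDecision:
    value: float
    source: str         # step 1: what the investigation found
    is_error: bool      # step 2: error, or legitimate rare value?
    justification: str  # step 3: the documented reason

def report_effect(data, decision):
    """Step 4: show how the flagged value changes the statistics."""
    rest = list(data)
    rest.remove(decision.value)  # the data without the flagged value
    return {
        "action": "removed" if decision.is_error else "kept and flagged",
        "mean_with": mean(data), "mean_without": mean(rest),
        "median_with": median(data), "median_without": median(rest),
    }

# Hypothetical adult heights in inches (assumed values), with one suspicious entry.
heights = [64, 66, 67, 68, 70, 71, 7]
decision = OutlierDecision(
    value=7,
    source="data-entry typo suspected",
    is_error=True,
    justification="no adult is 7 inches tall",
)
effect = report_effect(heights, decision)
print(decision)
print(effect)
```

Writing the decision down as a record means the reason travels with the data, so a reader can see both the choice and its effect on the conclusion.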


How to Report a Kept Outlier

If you keep a legitimate outlier, report it carefully:

  • Use resistant statistics (median, IQR) for the typical case
  • Report the outlier separately as a flagged observation

Typical highs are around 90°, with one record day at 108°.

The typical case stays honest; the rare event stays visible.
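A report like the one above could be produced as follows. The daily highs are assumed values built around the lesson's 90° and 108° figures, and the quartiles use Python's default method, which may differ slightly from a hand calculation.

```python
from statistics import median, quantiles

# Hypothetical daily highs in degrees (assumed values), including the record day.
highs = [88, 89, 90, 90, 91, 92, 108]

flagged = max(highs)                # the legitimate outlier, reported separately
q1, _, q3 = quantiles(highs, n=4)   # quartiles for the IQR
typical = median(highs)             # resistant center for the typical case

summary = (
    f"Typical highs are around {typical:.0f}° (IQR {q3 - q1:.0f}°), "
    f"with one record day at {flagged}°."
)
print(summary)
```

The outlier stays in the data set; it is described in its own clause instead of being folded into an average.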


Your Turn: Keep or Remove?

Reaction times: mostly 200–400 ms, one entry reads 9000 ms (nine seconds).

Decide: keep, remove, or investigate first? Justify your choice in one sentence.

Name what you'd check. Decide, then advance.

Answer: Investigate — 9000 ms is implausibly long; check whether the subject was distracted or the timer glitched before keeping or removing.


Full Task: Run the Whole Protocol

Given a data set with one outlier in context:

  1. Investigate its likely source
  2. Decide error or legitimate
  3. Document with a justification
  4. Report its effect on your conclusion

Do the whole protocol unaided.

Answer: A documented keep-or-remove decision, with resistant statistics and a flag if kept.


Key Takeaways From This Lesson

✓ An outlier shifts mean and SD sharply; median and IQR barely move

✓ The wrong statistic tells a misleading story

✓ Investigate, decide, document, report

⚠️ Outliers can be errors or real values

⚠️ Keep a real outlier, but flag it separately

Next: fitting data to a normal model.
