The Question: Out of Whom?
Before every computation, ask: out of whom?
- Out of everyone? → divide by the grand total
- Out of a subgroup? → divide by that subgroup's total
Name the denominator before you compute — every time.
Flip the Condition: Different Question
Among students who prefer online, what fraction are 9th graders?
- This conditions on online-preferrers — uses the column total
- The earlier question, "Among 9th graders, who prefers online?", conditions on grade and uses the row total
Different starting subgroups, different questions — the direction matters.
Row-Conditional Is Not the Column-Conditional
These are different questions with different denominators:
- Row: fraction of 9th graders who prefer online
- Column: fraction of online-preferrers who are 9th graders
Like "A given B" versus "B given A" — generally not the same number.
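Written as fractions, the two questions share a numerator and differ only in the denominator. This is a sketch in assumed notation: n(...) stands for a count read from the table and is not notation the lesson itself uses.

```latex
\[
\text{Row-conditional: } \frac{n(\text{9th and online})}{n(\text{9th})}
\qquad
\text{Column-conditional: } \frac{n(\text{9th and online})}{n(\text{online})}
\]
```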
Build a Full Row-Conditional Distribution
The 9th-grade row (40 students), as percentages of 9th graders:
- 45% online, 55% in-person → sums to 100%
Each row becomes a distribution of preference within that grade.
Compare rows next — that's how we find association.
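The row computation above can be sketched in a few lines of Python. The 18 online count is the cell used in this lesson; the 22 in-person count is derived as 40 - 18, and the variable names are illustrative only.

```python
# Counts for the 9th-grade row: 18 online is the cell from the lesson;
# 22 in-person is derived as 40 - 18.
ninth_grade = {"online": 18, "in-person": 22}

row_total = sum(ninth_grade.values())  # 40 students in the row

# Divide each cell by the row total to get the row-conditional distribution.
row_conditional = {
    pref: count / row_total for pref, count in ninth_grade.items()
}

for pref, share in row_conditional.items():
    print(f"{pref}: {share:.0%}")  # online: 45%, in-person: 55%
```

Because every cell is divided by the same row total, the shares in a row always sum to 100%.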
Quick Check: Three from One Cell
For the cell 18, compute all three:
- Joint: 18 ÷ grand total (of all)
- Row-conditional: 18 ÷ 40 (of 9th graders)
- Column-conditional: 18 ÷ online column total (of online-preferrers)
Same numerator, three denominators. Identify each, then advance.
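One way to check the three answers is a small helper that takes a cell and returns all three relative frequencies. This is a sketch under stated assumptions: the 9th-grade row follows the lesson, but the 10th-grade counts are hypothetical, chosen only to match the 70% online figure, so the joint and column results below depend on that made-up row.

```python
# Two-way table: rows are grades, columns are preferences.
# The 9th-grade row (18 online, 40 total) follows the lesson; the 10th-grade
# counts are HYPOTHETICAL, chosen only to match the 70% online figure.
table = {
    "9th": {"online": 18, "in-person": 22},
    "10th": {"online": 42, "in-person": 18},
}


def three_frequencies(table, row, col):
    """Return (joint, row-conditional, column-conditional) for one cell."""
    cell = table[row][col]
    grand_total = sum(sum(r.values()) for r in table.values())
    row_total = sum(table[row].values())
    col_total = sum(r[col] for r in table.values())
    return cell / grand_total, cell / row_total, cell / col_total


joint, row_cond, col_cond = three_frequencies(table, "9th", "online")
print(f"joint: {joint:.0%}, row: {row_cond:.0%}, column: {col_cond:.0%}")
# With the hypothetical 10th-grade row: joint: 18%, row: 45%, column: 30%
```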
Compare Rows to Find Association
We have a preference distribution for each grade. To find association:
- Compare the conditional distributions across grades
- 9th: 45% online vs 10th: 70% online → they differ
A substantial difference signals a possible association.
Association Versus No Association, Side by Side
- Differ (45% vs 70%) → possible association
- Match (50% vs 50%) → little or no association
When Conditionals Match: No Link
Matching conditional distributions mean no association:
- 50% of 9th graders and 50% of 10th graders prefer online
- Knowing the grade tells you nothing extra about preference
The variables show no association in this table: conditioning on grade doesn't shift the preference distribution.
Compare Conditionals, Not Raw Counts
Detect association from conditional distributions, never raw counts.
- Subgroups can be different sizes
- Raw counts reflect group size, not the relationship
Convert to percentages within each group, then compare.
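A short sketch of why raw counts can mislead. All counts here are hypothetical: two grades of very different sizes with the same online share.

```python
# HYPOTHETICAL counts: two grades of very different sizes.
counts = {
    "9th": {"online": 100, "in-person": 100},  # 200 students
    "10th": {"online": 25, "in-person": 25},   # 50 students
}

# Raw counts of online-preferrers reflect group size.
online_counts = {grade: row["online"] for grade, row in counts.items()}

# Within-group shares remove the effect of group size.
online_rates = {
    grade: row["online"] / sum(row.values()) for grade, row in counts.items()
}

print(online_counts)  # {'9th': 100, '10th': 25}
print(online_rates)   # {'9th': 0.5, '10th': 0.5}
```

The raw counts differ by a factor of four, yet the conditional shares match, so this made-up table shows no association.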
Association Is Not the Same as Causation
A difference in conditional distributions shows association, not cause.
- Association = the variables tend to occur together in a pattern
- Causation requires far more — a controlled study (HSS.ID.C.9)
Keep the claim modest: a possible association, not a cause.
Your Turn: Detect a Possible Association
A crash study by seatbelt use:
- Seatbelt-wearers: 10% seriously injured
- Non-wearers: 40% seriously injured
Do the variables show a possible association?
Compare the conditionals, then advance.
Answer: Yes — the conditional distributions differ sharply (10% vs 40%), signaling a possible association.
Full Task: Conditionals and Association
A table relating study method (flashcards/rereading) to passing (pass/fail):
- Build the row-conditional pass-rates
- Compare them across methods
- Decide: possible association?
Phrase it as association, not cause. Do it all, then advance.
Answer: Compare the two pass-rates; if they differ substantially, report a possible association.
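The full task can be sketched end to end as follows. The lesson does not supply the table, so every count is hypothetical, and the 10-percentage-point cutoff for "substantial" is an assumption made for illustration, not a rule from the lesson.

```python
# HYPOTHETICAL counts; the lesson does not supply the table.
study = {
    "flashcards": {"pass": 40, "fail": 10},
    "rereading": {"pass": 30, "fail": 20},
}

# ASSUMPTION: a gap of 10 percentage points counts as "substantial".
THRESHOLD = 0.10


def pass_rates(table):
    """Row-conditional pass-rate for each study method."""
    return {
        method: row["pass"] / (row["pass"] + row["fail"])
        for method, row in table.items()
    }


rates = pass_rates(study)
gap = max(rates.values()) - min(rates.values())
verdict = (
    "possible association" if gap >= THRESHOLD else "little or no association"
)

for method, rate in rates.items():
    print(f"{method}: {rate:.0%} pass")
print(verdict)
```

The verdict is worded as an association only; nothing in the comparison supports a causal claim about study method.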
Key Takeaways From This Lesson
✓ A conditional relative frequency divides by a subgroup total: "out of whom?"
✓ Row and column conditionals are different questions
✓ Compare conditional distributions to detect association
✓ Compare conditionals, not raw counts
✓ Association is not causation (see HSS.ID.C.9)
Next: probability and independence from the same table.
Summarize categorical data in two-way frequency tables