Mandy
You should know now:
t.test() and its different kinds of usage
y ~ x in R means y dependent on x (formula syntax)

For the babies data set, the variable age contains the recorded mom's age and dage contains the dad's age for several different cases in the sample. Do a significance test of the null hypothesis of equal ages against a one-sided alternative that the dads are older in the sampled population.
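One possible way to approach this exercise is a paired, one-sided t-test, since both ages are recorded for the same case. The sketch below runs on a small made-up data frame (`toy`, hypothetical values, not the babies data) so that it is self-contained; with the real data the same call would take the `age` and `dage` columns, after dealing with any missing-value codes the data set may use.

```r
# Hypothetical toy data, NOT the babies data set: six mother/father age pairs.
toy <- data.frame(
  age  = c(27, 33, 28, 36, 23, 25),  # mom's age
  dage = c(31, 38, 32, 43, 24, 28)   # dad's age
)

# H0: mean(dage - age) = 0   vs.   H1: mean(dage - age) > 0 (dads are older).
# paired = TRUE because both ages belong to the same case.
res <- t.test(toy$dage, toy$age, paired = TRUE, alternative = "greater")

res$statistic  # t statistic of the paired differences
res$p.value    # one-sided p-value
```

A small p-value here would speak against equal ages in favour of older dads. Whether a paired or an unpaired test is intended in the exercise is an assumption of this sketch.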
A data frame with 14 observations on 2 variables.
Variable | Content |
---|---|
ozone | atmospheric ozone concentration |
garden | garden id |
Var | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ozone | 9 | 7 | 6 | 8 | 5 | 11 | 9 | 11 | 9 | 6 | 10 | 8 | 8 | 12 |
garden | a | a | a | b | a | b | b | b | b | a | b | a | a | b |
Source: M. Crawley, The R Book
read.table() command
gardens <- read.table("session5dat/gardens2.txt", header = TRUE)
head(gardens)
##   ozone garden index
## 1     6      a    10
## 2     8      a     7
## 3     5      a     1
## 4     9      a     4
## 5     7      a     5
## 6     8      a     6
\[SSY = \sum(y-\bar{y})^2\]
garden | mean |
---|---|
a | 7 |
b | 10 |
When the means are significantly different, the sum of squares computed from the individual garden means will be markedly smaller than the sum of squares computed from the overall mean.
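The comparison can be checked by hand with the 14 observations from the data table. This is a minimal sketch; the variable names `SSY` and `SSE` are chosen here for illustration, and `ave()` is used to attach each observation's own garden mean.

```r
# Ozone readings and garden ids as listed in the data table.
ozone  <- c(9, 7, 6, 8, 5, 11, 9, 11, 9, 6, 10, 8, 8, 12)
garden <- factor(c("a", "a", "a", "b", "a", "b", "b",
                   "b", "b", "a", "b", "a", "a", "b"))

# Sum of squares around the overall mean.
SSY <- sum((ozone - mean(ozone))^2)

# Sum of squares around the individual garden means.
# ave() returns, for every observation, the mean of its own group.
SSE <- sum((ozone - ave(ozone, garden))^2)

SSY        # 55.5
SSE        # 24
SSY - SSE  # 31.5, the part attributed to the difference between gardens
```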
\[ SSY = SSE + SSA \]
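where \(SSE\) is the error sum of squares, computed from the individual garden means, and \(SSA\) is the sum of squares explained by the difference between the gardens.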
Source | Sum of squares | Degrees of freedom | Mean square | F ratio |
---|---|---|---|---|
Garden | \(31.5\) | \(1\) | \(31.5\) | \(15.75\) |
Error | \(24.0\) | \(12\) | \(s^2=2.0\) | |
Total | \(55.5\) | \(13\) | | |
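Each entry of the table follows from the sums of squares by division. As a worked check, with \(n = 14\) observations and \(k = 2\) gardens (the symbols \(n\), \(k\) and \(MSA\) are shorthand introduced here, not taken from the slides):

\[ df_{total} = n - 1 = 13, \qquad df_{garden} = k - 1 = 1, \qquad df_{error} = n - k = 12 \]

\[ MSA = \frac{SSA}{k-1} = \frac{31.5}{1} = 31.5, \qquad s^2 = \frac{SSE}{n-k} = \frac{24.0}{12} = 2.0, \qquad F = \frac{MSA}{s^2} = \frac{31.5}{2.0} = 15.75 \]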
1 - pf(15.75, 1, 12)
## [1] 0.001864103
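The p-value is the upper-tail probability of the F distribution with 1 and 12 degrees of freedom. As a small sketch, the same number can be obtained directly with the `lower.tail` argument of `pf()`, and `qf()` gives the critical value to compare against (the 5% level used here is an assumption for illustration):

```r
# Upper-tail probability of F = 15.75 with 1 and 12 degrees of freedom.
p_complement <- 1 - pf(15.75, 1, 12)
p_upper      <- pf(15.75, 1, 12, lower.tail = FALSE)  # same quantity, computed directly

all.equal(p_complement, p_upper)  # TRUE

# Critical value of a 5% test; the observed F ratio of 15.75 lies well above it.
qf(0.95, 1, 12)
```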
lm() command and a ~ b
mm <- lm(ozone ~ garden, data = gardens)
mm
##
## Call:
## lm(formula = ozone ~ garden, data = gardens)
##
## Coefficients:
## (Intercept)      gardenb
##           7            3
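To read these coefficients: assuming R's default treatment contrasts, the intercept is the mean of the first factor level (garden a) and `gardenb` is the difference between the mean of garden b and that of garden a. A short sketch that rebuilds the data from the table and confirms this (the name `group_means` is introduced here):

```r
# Ozone readings and garden ids as listed in the data table.
ozone  <- c(9, 7, 6, 8, 5, 11, 9, 11, 9, 6, 10, 8, 8, 12)
garden <- factor(c("a", "a", "a", "b", "a", "b", "b",
                   "b", "b", "a", "b", "a", "a", "b"))

# Mean ozone concentration per garden.
group_means <- tapply(ozone, garden, mean)
group_means                                   # a = 7, b = 10

# Difference b - a, which is what the gardenb coefficient reports.
unname(group_means["b"] - group_means["a"])   # 3

coef(lm(ozone ~ garden))                      # (Intercept) 7, gardenb 3
```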
summary(mm)
##
## Call:
## lm(formula = ozone ~ garden, data = gardens)
##
## Residuals:
##    Min     1Q Median     3Q    Max
##     -2     -1      0      1      2
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   7.0000     0.5345  13.096 1.82e-08 ***
## gardenb       3.0000     0.7559   3.969  0.00186 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.414 on 12 degrees of freedom
## Multiple R-squared: 0.5676, Adjusted R-squared: 0.5315
## F-statistic: 15.75 on 1 and 12 DF, p-value: 0.001864
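With only two gardens, the F test in the last line and the t test on `gardenb` are the same test: the squared t value equals the F ratio, and a classical two-sample `t.test()` with pooled variance gives the same p-value. A sketch of that check, rebuilding the data from the table (the names `fit`, `t_lm` and `tt` are introduced here):

```r
# Ozone readings and garden ids as listed in the data table.
ozone  <- c(9, 7, 6, 8, 5, 11, 9, 11, 9, 6, 10, 8, 8, 12)
garden <- factor(c("a", "a", "a", "b", "a", "b", "b",
                   "b", "b", "a", "b", "a", "a", "b"))

fit  <- lm(ozone ~ garden)
t_lm <- coef(summary(fit))["gardenb", "t value"]
t_lm^2        # 15.75, the F ratio of the analysis of variance

# Two-sample t-test using the formula syntax; var.equal = TRUE pools the
# variances, which matches the assumption made by lm().
tt <- t.test(ozone ~ garden, var.equal = TRUE)
tt$p.value    # same p-value as the F test
```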
anova(mm)
## Analysis of Variance Table
##
## Response: ozone
##           Df Sum Sq Mean Sq F value   Pr(>F)
## garden     1   31.5    31.5   15.75 0.001864 **
## Residuals 12   24.0     2.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1