Call:
lm(formula = mass ~ height, data = starwars_nojabba)
Coefficients:
(Intercept)       height
   -31.2505       0.6127
How did we decide on this line?
# Add the fitted values from the regression of mass on height
starwars_nojabba <- starwars_nojabba |>
  mutate(fitted = fitted(lm(mass ~ height, data = starwars_nojabba)))

# Scatterplot with the fitted line; each blue segment is one residual
ggplot(starwars_nojabba, aes(x = height, y = mass)) +
  geom_point(color = "#86a293") +
  geom_segment(
    aes(x = height, y = mass, xend = height, yend = fitted),
    color = "blue"
  ) +
  geom_smooth(
    method = "lm",
    se = FALSE,
    formula = "y ~ x",
    color = "#86a293"
  ) +
  labs(
    title = "The relationship between mass and height for Star Wars characters",
    caption = "Data from SWAPI (swapi.dev)"
  )
# Same plot, but each residual is drawn as a square whose side length is the
# residual (mass - fitted), so its area is the squared residual
ggplot(starwars_nojabba, aes(x = height, y = mass)) +
  geom_rect(
    aes(
      xmin = height,
      xmax = height + mass - fitted,
      ymin = mass,
      ymax = fitted
    ),
    fill = "blue",
    color = "blue",
    alpha = 0.2
  ) +
  geom_smooth(
    method = "lm",
    se = FALSE,
    formula = "y ~ x",
    color = "#86a293"
  ) +
  geom_point(color = "#86a293") +
  # Equal axis scales so the rectangles are drawn as true squares
  coord_fixed() +
  labs(
    title = "The relationship between mass and height for Star Wars characters",
    caption = "Data from SWAPI (swapi.dev)"
  )
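One way to see how the line is chosen: each shaded square has an area equal to one squared residual, and the least squares line is the line that makes the total area as small as possible. The sketch below checks this numerically by comparing the fitted line with two slightly different lines. It assumes `starwars_nojabba` is `dplyr::starwars` with Jabba and any rows missing mass or height removed (the slides do not show how it was built), and `sse_for_line()` is a hypothetical helper written only for this illustration.

```r
library(dplyr)

# Assumption: the slides do not show how starwars_nojabba was created.
starwars_nojabba <- starwars |>
  filter(!is.na(mass), !is.na(height), mass < 1000)

# Hypothetical helper: total area of the squares for the line
# mass = intercept + slope * height
sse_for_line <- function(intercept, slope, data = starwars_nojabba) {
  y_hat <- intercept + slope * data$height
  sum((data$mass - y_hat)^2)
}

mod <- lm(mass ~ height, data = starwars_nojabba)
b <- coef(mod)

sse_fitted  <- sse_for_line(b[["(Intercept)"]], b[["height"]])
sse_steeper <- sse_for_line(b[["(Intercept)"]], b[["height"]] + 0.1)
sse_shifted <- sse_for_line(b[["(Intercept)"]] + 5, b[["height"]])

# The fitted line should have the smallest total of the three
c(fitted = sse_fitted, steeper = sse_steeper, shifted = sse_shifted)
```

Any other choice of intercept and slope could be passed to `sse_for_line()` in the same way; none should give a smaller total than the fitted line.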
\[\Large \sum(y-\hat{y})^2\]
\[\Large \sum_{i=1}^n(y_i - \hat{y}_i)^2\]
\[\Large e_i = y_i - \hat{y}_i\]
\[\Large e_1 = y_1 - \hat{y}_1\]
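As a small worked example of the last equation, the sketch below computes the residual for the first observation (mass 77, height 172) using the coefficient estimates reported in the `lm` output, -31.25047 and 0.61273. The variable names are chosen only for this illustration.

```r
# Coefficient estimates as reported in the lm output
b0 <- -31.25047  # intercept
b1 <- 0.61273    # slope for height

# First observation: y_1 (mass) and x_1 (height)
mass_1 <- 77
height_1 <- 172

y_hat_1 <- b0 + b1 * height_1  # predicted mass
e_1 <- mass_1 - y_hat_1        # residual: observed minus predicted

round(c(y_hat_1 = y_hat_1, e_1 = e_1), 2)
```

This gives a predicted mass of about 74.14 and a residual of about 2.86.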
Application Exercise
x and y. Drag the blue points to change the line.
Time: 3 minutes
# A tibble: 58 × 4
    mass height y_hat residual
   <dbl>  <int> <dbl>    <dbl>
 1    77    172  74.1     2.86
 2    75    167  71.1     3.92
 3    32     96  27.6     4.43
 4   136    202  92.5    43.5
 5    49    150  60.7   -11.7
 6   120    178  77.8    42.2
 7    75    165  69.9     5.15
 8    32     97  28.2     3.82
 9    84    183  80.9     3.12
10    77    182  80.3    -3.27
# ℹ 48 more rows
How could I add the residual squared to this data frame?
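One possible answer uses `mutate()`. This is a sketch, not necessarily the code behind the slides: it assumes `starwars_nojabba` is `dplyr::starwars` with Jabba and any rows missing mass or height removed, and `residuals_df` is a name made up for this example.

```r
library(dplyr)

# Assumption: the slides do not show how starwars_nojabba was created.
starwars_nojabba <- starwars |>
  filter(!is.na(mass), !is.na(height), mass < 1000)

mod <- lm(mass ~ height, data = starwars_nojabba)

residuals_df <- starwars_nojabba |>
  select(mass, height) |>
  mutate(
    y_hat = unname(fitted(mod)),  # predicted mass for each character
    residual = mass - y_hat,      # observed minus predicted
    residual_2 = residual^2       # squared residual
  )

residuals_df
```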
# A tibble: 58 × 4
    mass height y_hat residual_2
   <dbl>  <int> <dbl>      <dbl>
 1    77    172  74.1       8.18
 2    75    167  71.1      15.4
 3    32     96  27.6      19.6
 4   136    202  92.5    1890.
 5    49    150  60.7     136.
 6   120    178  77.8    1780.
 7    75    165  69.9      26.5
 8    32     97  28.2      14.6
 9    84    183  80.9       9.74
10    77    182  80.3      10.7
# ℹ 48 more rows
How can I summarize this dataset to calculate the sum of the squared residuals?
How can I add the total sample size?
How can I add the degrees of freedom \((n-p)\)?
How can I add \(\hat{\sigma}_\varepsilon = \sqrt{\frac{\textrm{SSE}}{df}}\)?
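The questions above can be answered in a single pipeline. The sketch below is one way to do it, not necessarily the slides' own code. It assumes `starwars_nojabba` is `dplyr::starwars` with Jabba and any rows missing mass or height removed, takes \(p = 2\) because the model estimates an intercept and a slope, and uses `fit_summary` as a made-up name.

```r
library(dplyr)

# Assumption: the slides do not show how starwars_nojabba was created.
starwars_nojabba <- starwars |>
  filter(!is.na(mass), !is.na(height), mass < 1000)

mod <- lm(mass ~ height, data = starwars_nojabba)

fit_summary <- starwars_nojabba |>
  mutate(residual_2 = (mass - unname(fitted(mod)))^2) |>
  summarize(
    sse = sum(residual_2),  # sum of squared residuals
    n = n()                 # total sample size
  ) |>
  mutate(
    df = n - 2,             # n - p, with p = 2 estimated coefficients
    sigma = sqrt(sse / df)  # estimated residual standard deviation
  )

fit_summary
```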
# A tibble: 1 × 4
     sse     n    df sigma
   <dbl> <int> <dbl> <dbl>
1 21276.    58    56  19.5
lm output
Call:
lm(formula = mass ~ height, data = starwars_nojabba)
Residuals:
    Min      1Q  Median      3Q     Max
-39.006  -7.804   0.508   4.007  57.901
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -31.25047   12.81488  -2.439   0.0179 *
height        0.61273    0.07202   8.508 1.14e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 19.49 on 56 degrees of freedom
Multiple R-squared: 0.5638, Adjusted R-squared: 0.556
F-statistic: 72.38 on 1 and 56 DF, p-value: 1.138e-11
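The "Residual standard error" line in this output is the same quantity as \(\hat{\sigma}_\varepsilon\) computed by hand from the SSE and the degrees of freedom. A quick check, under the same assumption about how `starwars_nojabba` was built; `sigma()`, `residuals()`, and `df.residual()` are standard functions from R's stats package.

```r
library(dplyr)

# Assumption: the slides do not show how starwars_nojabba was created.
starwars_nojabba <- starwars |>
  filter(!is.na(mass), !is.na(height), mass < 1000)

mod <- lm(mass ~ height, data = starwars_nojabba)

# By hand: square root of SSE divided by the residual degrees of freedom
sigma_by_hand <- sqrt(sum(residuals(mod)^2) / df.residual(mod))

# Extracted from the fitted model (also available as summary(mod)$sigma)
sigma_from_lm <- sigma(mod)

all.equal(sigma_by_hand, sigma_from_lm)
```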
Application Exercise
Learn about the PorschePrice data by running ?PorschePrice in your Console
Fit a linear model predicting Price from Mileage
Add a variable y_hat to the PorschePrice dataset with the predicted y values
Add a variable residual to the PorschePrice dataset with the residuals
Time: 7 minutes