RStudio projects
This project is maintained by jocelyng012
Under the supervision of instructors Andrew Dickinson and Colleen O’Briant, I completed these projects for EC 320 at the University of Oregon. All of them were written in R, a programming language for statistical computing and graphics, using the RStudio IDE.
Below is an overview of the projects:
Tibble
Example:
library(tibble)
tibble(
  sex = c("male", "female", "female"),
  study_time = c(8, 4, 4),
  grade = c(78, 74, 86)
)
| | sex | study_time | grade |
|---|---|---|---|
| 1 | male | 8 | 78 |
| 2 | female | 4 | 74 |
| 3 | female | 4 | 86 |
lm
lm() to estimate models with log transformations, categorical variables using dummies, interactions between variables, etc.
broom::tidy() and broom::glance() to summarize model output
Example with broom::tidy():
library(dplyr)      # provides the %>% pipe
library(gapminder)  # provides the gapminder data
lm(lifeExp ~ gdpPercap, data = gapminder) %>%
  broom::tidy(conf.int = TRUE)
# A tibble: 2 × 7
term estimate std.error statistic p.value conf.low conf.high
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 54.0 0.315 171. 0 53.3 54.6
2 gdpPercap 0.000765 0.0000258 29.7 3.57e-156 0.000714 0.000815
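The model types listed above (log transformations, dummies, interactions) can be sketched with R's formula syntax. This is only an illustration: it uses the built-in mtcars data rather than the course data, and the model names (m_log, m_dummy, m_inter) are made up here, not the specifications from the EC 320 assignments.

```r
library(broom)

# Log-log model: log() can be applied directly inside the formula
m_log <- lm(log(mpg) ~ log(hp), data = mtcars)

# Categorical variable: factor() makes lm() create the dummy variables
m_dummy <- lm(mpg ~ wt + factor(cyl), data = mtcars)

# Interaction: wt * am expands to wt + am + wt:am
m_inter <- lm(mpg ~ wt * am, data = mtcars)

# tidy() returns one row per coefficient
tidy(m_inter)

# glance() returns a single row of model-level statistics such as r.squared
glance(m_inter)
```

tidy() is the natural choice for reading coefficients, while glance() is convenient for comparing the overall fit of several models.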
line of best fit
Example:
students %>%
ggplot(aes(x = grade1, y = final_grade)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE)
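The line drawn by geom_smooth(method = "lm") is the fitted line from a simple regression of y on x. A minimal sketch of that equivalence follows; the students_demo data frame and its values are invented for illustration and are not the course data.

```r
library(ggplot2)

# Made-up data for illustration only
students_demo <- data.frame(
  grade1 = c(55, 62, 70, 78, 85, 91),
  final_grade = c(58, 65, 68, 80, 83, 94)
)

# Fit the same simple regression that geom_smooth(method = "lm") uses
fit <- lm(final_grade ~ grade1, data = students_demo)
b <- coef(fit)

# Draw the points, the geom_smooth() line, and the lm() line on top of it
p <- ggplot(students_demo, aes(x = grade1, y = final_grade)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  geom_abline(intercept = b[["(Intercept)"]], slope = b[["grade1"]],
              linetype = "dashed", colour = "red")
p
```

The dashed red line should sit exactly on top of the smoothed line, since both come from the same least-squares fit.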
dplyr
summarize() and group_by()
filter(), arrange(), mutate(), etc.
bind_rows(), bind_cols(), left_join(), etc.
Example: What percentage of A students study for more than 10 hours per week?
students %>%
filter(final_grade >= 90) %>%
group_by(study_time) %>%
summarize(n = n()) %>%
mutate(percent = n / sum(n))
Output (about 11.8% of A students study for more than 10 hours per week):
# A tibble: 4 × 3
study_time n percent
<fct> <int> <dbl>
1 less than 2H 10 0.294
2 2 - 5H 13 0.382
3 5 - 10H 7 0.206
4 more than 10H 4 0.118
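The combining verbs listed above, bind_rows() and left_join(), are not covered by the example, so here is a small sketch of how they might be used together with arrange(). All three tables (section_a, section_b, hours) are made up for illustration.

```r
library(dplyr)

# Made-up tables for illustration only
section_a <- tibble(id = c(1, 2), final_grade = c(91, 84))
section_b <- tibble(id = c(3, 4), final_grade = c(77, 95))
hours <- tibble(id = c(1, 2, 3), study_hours = c(12, 6, 3))

# bind_rows() stacks tables that share the same columns
all_students <- bind_rows(section_a, section_b)

# left_join() keeps every row of the left table and adds matching columns;
# id 4 has no match in `hours`, so its study_hours is NA
joined <- left_join(all_students, hours, by = "id")

# arrange() sorts the rows; desc() sorts from highest to lowest
joined %>%
  arrange(desc(final_grade))
```

A left join is used here on the assumption that every student should stay in the result even when the study-hours record is missing.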
ggplot2
ggplot(): aes() mappings and geom layers
Example plots:
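A minimal sketch of how aes() and a geom combine into a plot. It uses the built-in mtcars data as a stand-in, so these are not the plots from the course projects, and the object names p_scatter and p_hist are made up here.

```r
library(ggplot2)

# aes() maps columns to visual properties; each geom_*() adds a layer
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

# A different geom on the same data gives a different kind of plot
p_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

p_scatter
p_hist
```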