class: center, middle, inverse, title-slide

# Meet the toolkit: programming
## Introduction to Data Science with R and Tidyverse
### based on datasciencebox.org

---
layout: true

<div class="my-footer">
<span>
Introduction to Data Science with R and Tidyverse | Lukas Jürgensmeier, Matteo Fina, Jan Bischoff | based on <a href="https://datasciencebox.org" target="_blank">datasciencebox.org</a>
</span>
</div>

---

## Course toolkit

<br>

.pull-left[
### .gray[Course operation]
.gray[
- [Course website](https://github.com/coding-intro/intro-tidyverse-2023-06)
- E-Mail(s)
]
]

.pull-right[
### .pink[Doing data science]
- .pink[Programming:]
  - .pink[R]
  - .pink[RStudio]
  - .pink[tidyverse]
  - .pink[R Markdown]
]

---
class: middle

# R and RStudio

---

## R and RStudio

.pull-left[
<img src="img/r-logo.png" width="25%" style="display: block; margin: auto;" />

- R is an open-source statistical **programming language**
- R is also an environment for statistical computing and graphics
- It's easily extensible with *packages*
]

.pull-right[
<img src="img/rstudio-logo.png" width="50%" style="display: block; margin: auto;" />

- RStudio is a convenient interface for R called an **IDE** (integrated development environment), e.g. *"I write R code in the RStudio IDE"*
- RStudio is not a requirement for programming with R, but it's very commonly used by R programmers and data scientists
]

---

## R packages

- **Packages** are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data<sup>1</sup>
- As of January 2022, there are over 18,000 R packages available on **CRAN** (the Comprehensive R Archive Network)<sup>2</sup>
- We're going to work with a small (but important) subset of these!

.footnote[
<sup>1</sup> Wickham and Bryan, [R Packages](https://r-pkgs.org/).

<sup>2</sup> [CRAN contributed packages](https://cran.r-project.org/web/packages/).
]

---

## Tour: R and RStudio

<img src="img/tour-r-rstudio.png" width="80%" style="display: block; margin: auto;" />

---

## A short list (for now) of R essentials

- Functions are (most often) verbs, followed by what they will be applied to in parentheses:

```r
do_this(to_this)
do_that(to_this, to_that, with_those)
```

--

- Packages are installed with the `install.packages` function (typically only once) and loaded with the `library` function (once per session):

```r
install.packages("package_name")
library(package_name)
```

---

## R essentials (continued)

- Columns (variables) in data frames are accessed with `$`:

```r
dataframe$var_name
```

--

- Object documentation can be accessed with `?`:

```r
?mean
```

---

## tidyverse

.pull-left[
<img src="img/tidyverse.png" width="99%" style="display: block; margin: auto;" />
]

.pull-right[
.center[.large[
[tidyverse.org](https://www.tidyverse.org/)
]]

- The **tidyverse** is an opinionated collection of R packages designed for data science
- All packages share an underlying philosophy and a common grammar
]

---

## rmarkdown

.pull-left[
.center[.large[
[rmarkdown.rstudio.com](https://rmarkdown.rstudio.com/)
]]

- **rmarkdown** and the various packages that support it enable R users to write their code and prose in reproducible computational documents
- We will generally refer to R Markdown documents (with `.Rmd` extension), e.g.
  *"Do this in your R Markdown document"* and rarely discuss loading the rmarkdown package
]

.pull-right[
<img src="img/rmarkdown.png" width="60%" style="display: block; margin: auto;" />
]

---
class: middle

# R Markdown

---

## R Markdown

- Fully reproducible reports: each time you knit, the analysis runs from the beginning
- Simple markdown syntax for text
- Code goes in chunks, defined by three backticks; narrative goes outside of chunks

---

## Tour: R Markdown

<img src="img/tour-rmarkdown.png" width="90%" style="display: block; margin: auto;" />

---

## Environments

.tip[
The environment of your R Markdown document is separate from the Console!
]

Remember this, and expect it to bite you a few times as you're learning to work with R Markdown!

---

## Environments

.pull-left[
First, run the following in the console

.small[
```r
x <- 2
x * 3
```
]

.question[
All looks good, eh?
]
]

--

.pull-right[
Then, add the following in an R chunk in your R Markdown document

.small[
```r
x * 3
```
]

.question[
What happens? Why the error?
]
]

---

## R Markdown help

.pull-left[
.center[
.midi[R Markdown Cheat Sheet `Help -> Cheatsheets`]
]

<img src="img/rmd-cheatsheet.png" width="80%" style="display: block; margin: auto;" />
]

.pull-right[
.center[
.midi[Markdown Quick Reference `Help -> Markdown Quick Reference`]
]

<img src="img/md-cheatsheet.png" width="80%" style="display: block; margin: auto;" />
]

---

## How will we use R Markdown?

- Every application exercise is an R Markdown document
- You'll always have a template R Markdown document to start with
- The amount of scaffolding in the template will decrease over the course

---

.your-turn[
.light-blue[.hand[Your turn:]] `Application Exercise 02 - Bechdel + R Markdown`

- [The Bechdel test](https://en.wikipedia.org/wiki/Bechdel_test) asks whether a work of fiction features at least two women who talk to each other about something other than a man, and the two women must be named characters.
- Go to [Posit Cloud](https://posit.cloud/) and start the assignment `application-exercise-02-bechdel-test-rmarkdown`.
- Open and knit the R Markdown document `bechdel.Rmd`, review the document, and try to fill in the blanks.
]
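
---

## R essentials: a small worked sketch

The R essentials from earlier in the deck (functions as verbs, column access with `$`, documentation with `?`) can be tried together in one short snippet. This is only an illustrative sketch and not part of the application exercise. It assumes the built-in `mtcars` data frame and its `mpg` column, and the object names `avg_mpg` and `avg_mpg_rounded` are made up for this example.

```r
# mtcars is a data frame that ships with base R, so no package needs installing

# A function (the verb) applied to a column accessed with $
avg_mpg <- mean(mtcars$mpg)

# A function with more than one argument: what to round, and how
avg_mpg_rounded <- round(avg_mpg, digits = 1)
print(avg_mpg_rounded)

# Open the documentation for mean()
?mean
```

The same lines should also run inside a chunk of an R Markdown document, because `mtcars` is available in every fresh R session, unlike an object such as `x` that was created only in the Console.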