|
|
|
|
|
|
|
|
--- |
|
|
--- |
|
|
output: github_document |
|
|
output: github_document |
|
|
|
|
|
editor_options: |
|
|
|
|
|
chunk_output_type: console |
|
|
--- |
|
|
--- |
|
|
|
|
|
|
|
|
<!-- README.md is generated from README.Rmd. Please edit that file --> |
|
|
<!-- README.md is generated from README.Rmd. Please edit that file --> |
|
|
|
|
|
|
|
|
echo = TRUE, |
|
|
echo = TRUE, |
|
|
warning = FALSE, |
|
|
warning = FALSE, |
|
|
message = FALSE, |
|
|
message = FALSE, |
|
|
|
|
|
fig.path = "man/figures/tidyexplain-", |
|
|
cache = TRUE |
|
|
cache = TRUE |
|
|
) |
|
|
) |
|
|
library(tidyAnimatedVerbs) |
|
|
|
|
|
|
|
|
library(dplyr) |
|
|
|
|
|
library(tidyexplain) |
|
|
|
|
|
set_font_size(11, 26) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
[gganimate]: https://github.com/thomasp85/gganimate#README |
|
|
[gganimate]: https://github.com/thomasp85/gganimate#README |
|
|
|
|
|
|
|
|
- Tidyr Operations: [`gather()`](#gather), [`spread()`](#spread) |
|
|
- Tidyr Operations: [`gather()`](#gather), [`spread()`](#spread) |
|
|
|
|
|
|
|
|
- Learn more about |
|
|
- Learn more about |
|
|
- [Relational Data](#relational-data) |
|
|
|
|
|
- [gganimate](#gganimate) |
|
|
|
|
|
|
|
|
- [Relational Data](#relational-data) |
|
|
|
|
|
- [gganimate](#gganimate) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Please feel free to use these images for teaching or learning about action verbs from the [tidyverse](https://tidyverse.org). |
|
|
Please feel free to use these images for teaching or learning about action verbs from the [tidyverse](https://tidyverse.org). |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Installing |
|
|
## Installing |
|
|
|
|
|
|
|
|
The library can be installed with |
|
|
|
|
|
```{r, echo=T,eval=F} |
|
|
|
|
|
|
|
|
The in-development version of `tidyexplain` can be installed with `devtools`: |
|
|
|
|
|
|
|
|
|
|
|
```r |
|
|
# install.packages("devtools") |
|
|
# install.packages("devtools") |
|
|
devtools::install_github("gadenbuie/tidy-animated-verbs") |
|
|
devtools::install_github("gadenbuie/tidy-animated-verbs") |
|
|
|
|
|
|
|
|
|
|
|
library(tidyexplain) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
## Mutating Joins |
|
|
## Mutating Joins |
|
|
|
|
|
|
|
|
```{r intial-dfs, echo=T} |
|
|
|
|
|
library(tidyAnimatedVerbs) |
|
|
|
|
|
|
|
|
```{r intial-dfs} |
|
|
x <- data_frame( |
|
|
x <- data_frame( |
|
|
id = 1:3, |
|
|
id = 1:3, |
|
|
x = paste0("x", 1:3) |
|
|
x = paste0("x", 1:3) |
|
|
|
|
|
|
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
x |
|
|
x |
|
|
y |
|
|
y |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `x` where there are matching values in `y`, and all columns from `x` and `y`. |
|
|
> All rows from `x` where there are matching values in `y`, and all columns from `x` and `y`. |
|
|
|
|
|
|
|
|
```{r inner-join, echo=T} |
|
|
|
|
|
|
|
|
```{r inner-join} |
|
|
animate_inner_join(x, y, by = "id") |
|
|
animate_inner_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
inner_join(x, y, by = "id") |
|
|
inner_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
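
As a minimal sketch of what the inner join keeps, the snippet below rebuilds `x` as defined above and pairs it with a hypothetical `y` (its keys 1, 2 and 4 are an assumption, since the definition of `y` is not shown here). It uses `tibble()` as re-exported by `dplyr`.

```r
library(dplyr)

# `x` mirrors the definition above; `y` is an assumed stand-in table
# whose keys only partly overlap with those of `x`.
x <- tibble(id = 1:3, x = paste0("x", 1:3))
y <- tibble(id = c(1L, 2L, 4L), y = paste0("y", c(1, 2, 4)))

joined <- inner_join(x, y, by = "id")

# Only keys present in both tables survive the join,
# and the result carries the columns of both inputs.
joined$id
names(joined)
```

Under these assumed inputs, `id = 3` (only in `x`) and `id = 4` (only in `y`) are both dropped.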
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `x`, and all columns from `x` and `y`. Rows in `x` with no match in `y` will have `NA` values in the new columns. |
|
|
> All rows from `x`, and all columns from `x` and `y`. Rows in `x` with no match in `y` will have `NA` values in the new columns. |
|
|
|
|
|
|
|
|
```{r left-join, echo=T} |
|
|
|
|
|
|
|
|
```{r left-join} |
|
|
animate_left_join(x, y, by = "id") |
|
|
animate_left_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
left_join(x, y, by = "id") |
|
|
left_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> ... If there are multiple matches between `x` and `y`, all combinations of the matches are returned. |
|
|
> ... If there are multiple matches between `x` and `y`, all combinations of the matches are returned. |
|
|
|
|
|
|
|
|
```{r left-join-extra, echo=T} |
|
|
|
|
|
|
|
|
```{r left-join-extra} |
|
|
y_extra <- bind_rows(y, data_frame(id = 2, y = "y5")) |
|
|
y_extra <- bind_rows(y, data_frame(id = 2, y = "y5")) |
|
|
y_extra # has multiple rows with the key from `x` |
|
|
y_extra # has multiple rows with the key from `x` |
|
|
|
|
|
|
|
|
animate_left_join(x, y_extra, by = "id") |
|
|
animate_left_join(x, y_extra, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
left_join(x, y_extra, by = "id") |
|
|
left_join(x, y_extra, by = "id") |
|
|
``` |
|
|
``` |
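
The effect of duplicate keys can be checked by counting rows per key after the join. This is a sketch under the same assumption as before: `y` is a hypothetical table with keys 1, 2 and 4, extended with a second row for `id = 2` as in `y_extra` above.

```r
library(dplyr)

x <- tibble(id = 1:3, x = paste0("x", 1:3))
# Assumed contents for `y`; only the extra row for id = 2 comes from the text above.
y <- tibble(id = c(1L, 2L, 4L), y = paste0("y", c(1, 2, 4)))
y_extra <- bind_rows(y, tibble(id = 2L, y = "y5"))

out <- left_join(x, y_extra, by = "id")

# Each row of `x` appears once per matching row in `y_extra`;
# unmatched rows of `x` appear once, with NA in the `y` column.
count(out, id)
```

Here the row of `x` with `id = 2` is repeated once for each of its two matches, so the result has more rows than `x` itself.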
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `y`, and all columns from `x` and `y`. Rows in `y` with no match in `x` will have `NA` values in the new columns. |
|
|
> All rows from `y`, and all columns from `x` and `y`. Rows in `y` with no match in `x` will have `NA` values in the new columns. |
|
|
|
|
|
|
|
|
```{r right-join, echo = T} |
|
|
|
|
|
|
|
|
```{r right-join} |
|
|
animate_right_join(x, y, by = "id") |
|
|
animate_right_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
right_join(x, y, by = "id") |
|
|
right_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows and all columns from both `x` and `y`. Where there are not matching values, returns `NA` for the one missing. |
|
|
> All rows and all columns from both `x` and `y`. Where there are not matching values, returns `NA` for the one missing. |
|
|
|
|
|
|
|
|
```{r full-join, echo=T} |
|
|
|
|
|
|
|
|
```{r full-join} |
|
|
animate_full_join(x, y, by = "id") |
|
|
animate_full_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
full_join(x, y, by = "id") |
|
|
full_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `x` where there are matching values in `y`, keeping just columns from `x`. |
|
|
> All rows from `x` where there are matching values in `y`, keeping just columns from `x`. |
|
|
|
|
|
|
|
|
```{r semi-join, echo=T} |
|
|
|
|
|
|
|
|
```{r semi-join} |
|
|
animate_semi_join(x, y, by = "id") |
|
|
animate_semi_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
semi_join(x, y, by = "id") |
|
|
semi_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `x` where there are not matching values in `y`, keeping just columns from `x`. |
|
|
> All rows from `x` where there are not matching values in `y`, keeping just columns from `x`. |
|
|
|
|
|
|
|
|
```{r anti-join, echo=T} |
|
|
|
|
|
|
|
|
```{r anti-join} |
|
|
animate_anti_join(x, y, by = "id") |
|
|
animate_anti_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
anti_join(x, y, by = "id") |
|
|
anti_join(x, y, by = "id") |
|
|
``` |
|
|
``` |
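
Semi and anti joins filter `x` rather than add columns to it, so they can be compared against a plain `filter()` on the key. The sketch below assumes the same hypothetical `y` (keys 1, 2 and 4) used for illustration; it is not taken from the package.

```r
library(dplyr)

x <- tibble(id = 1:3, x = paste0("x", 1:3))
y <- tibble(id = c(1L, 2L, 4L), y = paste0("y", c(1, 2, 4)))  # assumed contents

semi <- semi_join(x, y, by = "id")
anti <- anti_join(x, y, by = "id")

# The same rows selected by filtering on key membership.
semi_by_filter <- filter(x, id %in% y$id)
anti_by_filter <- filter(x, !id %in% y$id)

# Semi and anti joins split `x` into two disjoint parts.
nrow(semi) + nrow(anti) == nrow(x)
```

Neither result contains the `y` column, which is the main difference from the mutating joins above.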
|
|
|
|
|
|
|
|
## Set Operations |
|
|
## Set Operations |
|
|
|
|
|
|
|
|
```{r intial-dfs-so, echo=T} |
|
|
|
|
|
|
|
|
```{r intial-dfs-so} |
|
|
x <- data_frame( |
|
|
x <- data_frame( |
|
|
x = c(1, 1, 2), |
|
|
x = c(1, 1, 2), |
|
|
y = c("a", "b", "a") |
|
|
y = c("a", "b", "a") |
|
|
|
|
|
|
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
x |
|
|
x |
|
|
y |
|
|
y |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All unique rows from `x` and `y`. |
|
|
> All unique rows from `x` and `y`. |
|
|
|
|
|
|
|
|
```{r union, echo=T} |
|
|
|
|
|
|
|
|
```{r union} |
|
|
animate_union(x, y) |
|
|
animate_union(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
union(x, y) |
|
|
union(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r union-y-x} |
|
|
animate_union(y, x) |
|
|
animate_union(y, x) |
|
|
|
|
|
|
|
|
union(y, x) |
|
|
union(y, x) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `x` and `y`, keeping duplicates. |
|
|
> All rows from `x` and `y`, keeping duplicates. |
|
|
|
|
|
|
|
|
```{r union-all, echo=T} |
|
|
|
|
|
|
|
|
```{r union-all} |
|
|
animate_union_all(x, y) |
|
|
animate_union_all(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
union_all(x, y) |
|
|
union_all(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> Common rows in both `x` and `y`, keeping just unique rows. |
|
|
> Common rows in both `x` and `y`, keeping just unique rows. |
|
|
|
|
|
|
|
|
```{r intersect, echo=T} |
|
|
|
|
|
|
|
|
```{r intersect} |
|
|
animate_intersect(x, y) |
|
|
animate_intersect(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
intersect(x, y) |
|
|
intersect(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
> All rows from `x` which are not also rows in `y`, keeping just unique rows. |
|
|
> All rows from `x` which are not also rows in `y`, keeping just unique rows. |
|
|
|
|
|
|
|
|
```{r setdiff, echo=T} |
|
|
|
|
|
|
|
|
```{r setdiff} |
|
|
animate_setdiff(x, y) |
|
|
animate_setdiff(x, y) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r} |
|
|
setdiff(x, y) |
|
|
setdiff(x, y) |
|
|
``` |
|
|
``` |
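
To compare the set operations side by side, a small sketch can tabulate how many rows each one returns. `x` follows the definition above; `y` is a hypothetical two-row table chosen for illustration, since its definition is not shown here.

```r
library(dplyr)

x <- tibble(x = c(1, 1, 2), y = c("a", "b", "a"))
y <- tibble(x = c(1, 2), y = c("a", "b"))  # assumed contents

# Row counts for each set operation on these inputs.
counts <- c(
  union      = nrow(union(x, y)),      # unique rows from either table
  union_all  = nrow(union_all(x, y)),  # all rows, duplicates kept
  intersect  = nrow(intersect(x, y)),  # rows found in both tables
  setdiff_xy = nrow(setdiff(x, y)),    # rows of x not in y
  setdiff_yx = nrow(setdiff(y, x))     # rows of y not in x
)
counts
```

Set operations compare whole rows, so both tables need the same columns. `setdiff()` is the only one of these where swapping the arguments can change which rows are returned.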
|
|
|
|
|
|
|
|
|
|
|
|
|
|
```{r echo=TRUE} |
|
|
|
|
|
|
|
|
```{r setdiff-y-x} |
|
|
animate_setdiff(y, x) |
|
|
animate_setdiff(y, x) |
|
|
|
|
|
|
|
|
setdiff(y, x) |
|
|
setdiff(y, x) |
|
|
|
|
|
|
|
|
you organize your data into tidy data. |
|
|
you organize your data into tidy data. |
|
|
|
|
|
|
|
|
```{r} |
|
|
```{r} |
|
|
|
|
|
library(tidyr) |
|
|
|
|
|
|
|
|
long <- data_frame( |
|
|
long <- data_frame( |
|
|
year = c(2010, 2011, 2010, 2011, 2010, 2011), |
|
|
year = c(2010, 2011, 2010, 2011, 2010, 2011), |
|
|
person = c("Alice", "Alice", "Bob", "Bob", "Charlie", "Charlie"), |
|
|
person = c("Alice", "Alice", "Bob", "Bob", "Charlie", "Charlie"), |
|
|
|
|
|
|
|
|
) |
|
|
) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Gather |
|
|
### Gather |
|
|
|
|
|
|
|
|
> Gather takes multiple columns and collapses into key-value pairs, duplicating all other columns as needed. You use gather() when you notice that your column names are not names of variables, but values of a variable. |
|
|
> Gather takes multiple columns and collapses into key-value pairs, duplicating all other columns as needed. You use gather() when you notice that your column names are not names of variables, but values of a variable. |
|
|
|
|
|
|
|
|
```{r} |
|
|
|
|
|
|
|
|
```{r gather} |
|
|
animate_gather(wide, key = "person", value = "sales", -year) |
|
|
animate_gather(wide, key = "person", value = "sales", -year) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
gather(wide, key = "person", value = "sales", -year) |
|
|
gather(wide, key = "person", value = "sales", -year) |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Spread |
|
|
### Spread |
|
|
|
|
|
|
|
|
> Spread a key-value pair across multiple columns. Use it when a column contains observations from multiple variables. |
|
|
> Spread a key-value pair across multiple columns. Use it when a column contains observations from multiple variables. |
|
|
|
|
|
|
|
|
```{r} |
|
|
|
|
|
|
|
|
```{r spread} |
|
|
animate_spread(long, key = "person", value = "sales") |
|
|
animate_spread(long, key = "person", value = "sales") |
|
|
``` |
|
|
``` |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
``` |
|
|
``` |
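
`gather()` and `spread()` are inverses of each other, which a round trip makes visible. In the sketch below the `wide` table and its sales figures are hypothetical values chosen for illustration, not the data used to render the animations.

```r
library(tidyr)

# Hypothetical wide table: one column per person, one row per year.
wide <- tibble(
  year    = c(2010, 2011),
  Alice   = c(105, 110),
  Bob     = c(100, 97),
  Charlie = c(90, 95)
)

# Collapse the person columns into key-value pairs ...
long <- gather(wide, key = "person", value = "sales", -year)

# ... and spread them back out into one column per person.
back <- spread(long, key = "person", value = "sales")

dim(long)
names(back)
```

The long form has one row per year and person, and spreading it recovers the original column layout.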
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
## Learn More |
|
|
## Learn More |
|
|
|
|
|
|
|
|
### Relational Data |
|
|
### Relational Data |