Hands-on Exercise 3a

Lesson 3: Programming Interactive Data Visualisation with R

Author

Victoria Neo

Published

January 24, 2024

Modified

January 25, 2024

(ref:ah-ggplot) Rawpixel.com/Shutterstock.com. Naveen Neelakandan’s Interactive Learning Content In eLearning: How Effective Is It?

Overview Summary

Work done Hands-on Exercise 3a
Hours taken ⏱️⏱️⏱️⏱️ (sick fil)
Questions 0
How do I feel? 🚚😡
What do I think? This week’s Hands-on Exercises were a little long as we still had our Take-home Exercise 2 to think about and work on. That aside, interactivity is very important because enabling the reader to explore helps to democratise data.

Getting Started

1: Data

1.1 Installing and loading the required libraries

The code chunk below uses p_load() of the pacman package to check whether the following R packages are installed on the computer. Any missing package is installed first, and then all of them are loaded into R.

  • ggiraph for making 'ggplot' graphics interactive.

  • plotly, an R library for plotting interactive statistical graphs.

  • DT provides an R interface to the JavaScript library DataTables, which creates interactive tables on HTML pages.

  • tidyverse, a family of modern R packages specially designed to support data science, analysis and communication tasks, including creating static statistical graphs.

  • patchwork for combining multiple ggplot2 graphs into one figure.

code block
pacman::p_load(ggiraph, plotly, 
               patchwork, DT, tidyverse,
               kableExtra) 
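
As a rough illustration of what p_load() does, the sketch below mimics the same check-install-load sequence in base R. The helper name load_or_install() is hypothetical and is not part of pacman, and the sketch assumes a CRAN mirror is configured whenever a package actually has to be installed.

```r
# Hypothetical helper (not part of pacman): install a package only when it is
# missing, then attach it. Assumes a CRAN mirror is configured for installs.
load_or_install <- function(pkgs) {
  for (pkg in pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    library(pkg, character.only = TRUE)
  }
  invisible(pkgs)
}

# "stats" and "utils" ship with R, so nothing is installed here
load_or_install(c("stats", "utils"))
```

In practice p_load() is the more convenient choice; the sketch only makes its behaviour explicit.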

1.2 Data Set

Note

This section is taken from Hands-on_Ex02 as we are using the same data set.

The data set, Exam_data.csv, contains the year-end examination grades of a cohort of Primary 3 students from a local school, and is imported as exam_data.

1.2.1 Importing exam_data

In the code chunk below, read_csv() of the readr package is used to import the Exam_data.csv data file into R and save it as a tibble data frame called exam_data.

code block
exam_data <- read_csv("data/Exam_data.csv")
Rows: 322 Columns: 7
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): ID, CLASS, GENDER, RACE
dbl (3): ENGLISH, MATHS, SCIENCE

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

1.2.2 Summary Statistic of exam_data

  • Data Table
  • Data Structure
  • Data Health

Displaying the first 5 rows of exam_data using head():

code block
head(exam_data,5) %>%
  kbl() %>%
  kable_material()
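| ID | CLASS | GENDER | RACE | ENGLISH | MATHS | SCIENCE |
|---|---|---|---|---|---|---|
| Student321 | 3I | Male | Malay | 21 | 9 | 15 |
| Student305 | 3I | Female | Malay | 24 | 22 | 16 |
| Student289 | 3H | Male | Chinese | 26 | 16 | 16 |
| Student227 | 3F | Male | Chinese | 27 | 77 | 31 |
| Student318 | 3I | Male | Malay | 27 | 11 | 25 |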

Checking the structure of exam_data using str():

code block
str(exam_data)
spc_tbl_ [322 × 7] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ ID     : chr [1:322] "Student321" "Student305" "Student289" "Student227" ...
 $ CLASS  : chr [1:322] "3I" "3I" "3H" "3F" ...
 $ GENDER : chr [1:322] "Male" "Female" "Male" "Male" ...
 $ RACE   : chr [1:322] "Malay" "Malay" "Chinese" "Chinese" ...
 $ ENGLISH: num [1:322] 21 24 26 27 27 31 31 31 33 34 ...
 $ MATHS  : num [1:322] 9 22 16 77 11 16 21 18 19 49 ...
 $ SCIENCE: num [1:322] 15 16 16 31 25 16 25 27 15 37 ...
 - attr(*, "spec")=
  .. cols(
  ..   ID = col_character(),
  ..   CLASS = col_character(),
  ..   GENDER = col_character(),
  ..   RACE = col_character(),
  ..   ENGLISH = col_double(),
  ..   MATHS = col_double(),
  ..   SCIENCE = col_double()
  .. )
 - attr(*, "problems")=<externalptr> 
Note

There are 322 rows and 7 variables. The output reveals that the variables have been assigned their correct data types.

Checking for any symptoms of messy data:

1. Checking for duplicates:

code block
exam_data[duplicated(exam_data),]
# A tibble: 0 × 7
# ℹ 7 variables: ID <chr>, CLASS <chr>, GENDER <chr>, RACE <chr>,
#   ENGLISH <dbl>, MATHS <dbl>, SCIENCE <dbl>
Note

There were no duplicated rows found in exam_data.

2. Checking missing values:

code block
sum(is.na(exam_data))
[1] 0
Note

There were no missing values found in exam_data.

3. Checking for String inconsistencies:

3.1 In CLASS

code block
unique(exam_data$CLASS)
[1] "3I" "3H" "3F" "3G" "3E" "3C" "3D" "3A" "3B"

3.2 In GENDER

code block
unique(exam_data$GENDER)
[1] "Male"   "Female"

3.3 In RACE

code block
unique(exam_data$RACE)
[1] "Malay"   "Chinese" "Indian"  "Others" 
Note

There were no string inconsistencies found in exam_data.

4. Checking for Data Irregularities:

4.1 In ENGLISH

code block
summary(exam_data$ENGLISH)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  21.00   59.00   70.00   67.18   78.00   96.00 

4.2 In MATHS

code block
summary(exam_data$MATHS)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   9.00   58.00   74.00   69.33   85.00   99.00 

4.3 In SCIENCE

code block
summary(exam_data$SCIENCE)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  15.00   49.25   65.00   61.16   74.75   96.00 
Note

There were no data irregularities found in exam_data.
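
The four checks above can also be bundled into a single helper, which is handy when the same data set is reused across exercises. The sketch below is only an illustration on made-up data: the helper name check_health() is hypothetical, and the valid score range of 0 to 100 is an assumption rather than something stated in the data documentation.

```r
# Toy data standing in for exam_data (values are made up for illustration)
toy <- data.frame(
  ID      = c("S1", "S2", "S3", "S3"),
  CLASS   = c("3A", "3A", "3B", "3B"),
  ENGLISH = c(70, NA, 55, 55),
  MATHS   = c(88, 61, 120, 120)
)

# Hypothetical helper: summarise the data-health checks in one list.
# Assumes valid scores lie between 0 and 100.
check_health <- function(df, score_cols, lo = 0, hi = 100) {
  scores <- df[score_cols]
  list(
    n_duplicates = sum(duplicated(df)),                 # fully repeated rows
    n_missing    = colSums(is.na(df)),                  # NA count per column
    out_of_range = colSums(scores < lo | scores > hi,   # implausible scores
                           na.rm = TRUE)
  )
}

health <- check_health(toy, c("ENGLISH", "MATHS"))
health$n_duplicates
health$n_missing
health$out_of_range
```

Counting missing values per column, rather than as one total, shows where a problem sits as well as whether one exists.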

2: Hands-on Exercise

2.1 Interactive Data Visualisation - ggiraph methods

What did Prof Kam say?

ggiraph is an html widget and a ggplot2 extension. It allows ggplot graphics to be interactive.

Interactivity is added through ggplot geometries that understand three arguments:

  • Tooltip: a column of the data set that contains the tooltips to be displayed when the mouse is over elements.

  • Onclick: a column of the data set that contains a JavaScript function to be executed when elements are clicked.

  • Data_id: a column of the data set that contains an id to be associated with elements.

If it is used within a shiny application, elements associated with an id (data_id) can be selected and manipulated on the client and server sides. Refer to this article for a more detailed explanation.

2.1.2 Tooltip

  • Tooltip effect with tooltip aesthetic
  • Displaying multiple information on tooltip
  • Customising Tooltip style
  • Displaying statistics on tooltip

The code chunk below shows a typical way to plot an interactive statistical graph with the ggiraph package. Notice that the code chunk consists of two parts. First, an interactive version of a ggplot2 geom (i.e. geom_dotplot_interactive()) is used to create the basic graph. Then, girafe() is used to generate an SVG object to be displayed on an HTML page.

code block
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE, 
    binwidth = 1, 
    method = "histodot") +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)

The content of the tooltip can be customised by building a character vector, as shown in the code chunk below. The first three lines of code create a new field called tooltip and populate it with text from the ID and CLASS fields. This newly created field is then mapped to the tooltip aesthetic inside geom_dotplot_interactive().

code block
exam_data$tooltip <- paste0(
  "Name = ", exam_data$ID,
  "\n Class = ", exam_data$CLASS)

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = tooltip),  # refer to the column by name inside aes()
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 8,
  height_svg = 8*0.618
)
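
The reason a single assignment yields one tooltip per student is that paste0() is vectorised. A minimal sketch on made-up rows (the toy data frame is an assumption for illustration, not part of exam_data):

```r
# Made-up rows standing in for exam_data
toy <- data.frame(
  ID    = c("Student001", "Student002"),
  CLASS = c("3A", "3B")
)

# paste0() works element-wise, so each row gets its own tooltip text
toy$tooltip <- paste0(
  "Name = ", toy$ID,
  "\n Class = ", toy$CLASS)

# cat() renders the "\n" as a real line break, as the tooltip would
cat(toy$tooltip[1])
```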

The code chunk below uses opts_tooltip() of ggiraph to customise tooltip rendering by adding CSS declarations. Notice that the background colour of the tooltip is white and the font is black and bold.

code block
# CSS declarations applied to the tooltip box (bold text needs font-weight)
tooltip_css <- "background-color:white; font-weight:bold; color:black;"

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = ID),                   
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(
    opts_tooltip(
      css = tooltip_css))
)

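The code chunk below shows an advanced way to customise the tooltip. In this example, a function is used to format the mean and its standard error. Note that mean_se() returns the mean plus and minus one standard error by default, which is not a 90% confidence interval. The derived statistics are then displayed in the tooltip.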

code block
tooltip <- function(y, ymax, accuracy = .01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean maths scores:", mean, "+/-", sem)
}

gg_point <- ggplot(data=exam_data, 
                   aes(x = RACE)) +
  stat_summary(aes(y = MATHS, 
                   tooltip = after_stat(  
                     tooltip(y, ymax))),  
    fun.data = mean_se,          # mean plus/minus one standard error
    geom = GeomInteractiveCol,  
    fill = "light blue"
  ) +
  stat_summary(aes(y = MATHS),
    fun.data = mean_se,
    # linewidth replaces size for lines in ggplot2 >= 3.4.0
    geom = "errorbar", width = 0.2, linewidth = 0.2
  )

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)
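
To make the numbers in the tooltip concrete, the sketch below reproduces the same calculation in base R on made-up scores. The last step, which widens the band to an approximate 90% confidence interval, rests on a normal-theory assumption and is shown only for comparison; it is not what the chart above displays.

```r
# Made-up maths scores for one group
scores <- c(60, 70, 80, 90)

n   <- length(scores)
avg <- mean(scores)
sem <- sd(scores) / sqrt(n)      # standard error of the mean

# What the tooltip shows: mean +/- one standard error
c(y = avg, ymin = avg - sem, ymax = avg + sem)

# An approximate 90% confidence interval widens the band by a
# normal-theory multiplier (about 1.645). This is an assumption that
# suits large samples; a t multiplier is safer for small ones.
mult <- qnorm(0.95)
c(lower = avg - mult * sem, upper = avg + mult * sem)
```

If a wider band is wanted in the chart itself, mean_se() has a mult argument, which could be passed through stat_summary() with fun.args = list(mult = ...).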

2.1.3 Hover

  • Hover effect with data_id aesthetic
  • Styling hover effect

The code chunk below shows the second interactive feature of ggiraph, namely data_id. Elements associated with a data_id (i.e. CLASS) will be highlighted on mouse-over.

code block
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
)        

In the code chunk below, CSS declarations are used to change the highlighting effect. Elements associated with a data_id (i.e. CLASS) will be highlighted on mouse-over.

code block
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)   

Note: Unlike the previous example, the CSS customisation here is passed directly as a string to opts_hover() and opts_hover_inv().

2.1.4 Tooltip + Hover

  • Combining tooltip and hover effect

Elements associated with a data_id (i.e. CLASS) will be highlighted on mouse-over. At the same time, the tooltip will show the CLASS.

code block
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)      

2.1.5 Click effect

  • Click effect with onclick

The onclick argument of ggiraph provides hotlink interactivity on the web. The web document linked to a data object is opened in the web browser when that object is clicked.

code block
exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618) 

Note that click actions must be a string column in the data set containing valid JavaScript instructions.
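
A small sketch of how such a string column can be built. The IDs are made up and base_url is a placeholder on example.com, not a real endpoint; only the sprintf() pattern mirrors the code above.

```r
# Made-up IDs; base_url is a placeholder, not a real endpoint
ids      <- c("Student001", "Student002")
base_url <- "https://example.com/student?id="

# sprintf() is vectorised over ids; %s slots are filled left to right.
# The URL is passed as an argument, so any "%" it contains is left alone.
onclick <- sprintf("window.open(\"%s%s\")", base_url, ids)
onclick
```

Each element of onclick is a complete JavaScript statement, which is the form the onclick aesthetic expects.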

2.1.6 Coordinated Multiple Views

  • Coordinated Multiple Views with ggiraph

To build coordinated multiple views as shown in the example below, the following programming strategy is used:

  • Appropriate interactive functions of ggiraph are used to create the multiple views.

  • The patchwork package (here, p1 + p2) is used inside girafe() to combine them into interactive coordinated multiple views.

Notice that when a data point in one of the dot plots is selected, the data point with the same ID in the second visualisation is highlighted too.

The data_id aesthetic is critical for linking observations between plots; the tooltip aesthetic is optional but nice to have when the mouse is over a point.

code block
p1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p2 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p1 + p2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       )  
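
Which column is used as data_id decides what lights up across the views: a unique key such as ID links exactly one dot per plot, while a shared key such as CLASS links a whole group. The sketch below checks this property on made-up rows (the toy data frame is an assumption for illustration).

```r
# Made-up rows standing in for exam_data
toy <- data.frame(
  ID    = c("S1", "S2", "S3", "S4"),
  CLASS = c("3A", "3A", "3B", "3B")
)

# A unique key links exactly one element per view
anyDuplicated(toy$ID) == 0

# A shared key links a whole group: hovering one "3A" dot would
# highlight every element whose data_id is "3A"
table(toy$CLASS)
```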

2.2 Interactive Data Visualisation - plotly methods!

What did Prof Kam say?

Plotly’s R graphing library creates interactive web graphics from ggplot2 graphs and/or a custom interface to the (MIT-licensed) JavaScript library plotly.js, inspired by the grammar of graphics. Unlike other plotly platforms, plotly for R is free and open source.

There are two ways to create an interactive graph with plotly:

  • by using plot_ly(),
  • and by using ggplotly()

2.2.1 Creating an interactive scatter plot: plot_ly() method

code block
plot_ly(data = exam_data, 
             x = ~MATHS, 
             y = ~ENGLISH)

2.2.2 Working with visual variable: plot_ly() method

In the code chunk below, the color argument is mapped to a qualitative visual variable (i.e. RACE).

code block
plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)

2.2.3 Coordinated Multiple Views with plotly

The creation of a coordinated linked plot by using plotly involves three steps:

  • highlight_key() of the plotly package is used to create the shared data.

  • Two scatterplots are created by using ggplot2 functions.

  • Lastly, subplot() of the plotly package is used to place them side by side.

Things to learn from the code chunk:

  • highlight_key() simply creates an object of class crosstalk::SharedData.

  • Visit this link to learn more about crosstalk.

code block
d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))

2.3 Interactive Data Visualisation - crosstalk methods!

What did Prof Kam say?

Crosstalk is an add-on to the htmlwidgets package. It extends htmlwidgets with a set of classes, functions, and conventions for implementing cross-widget interactions (currently, linked brushing and filtering).

2.3.1 Interactive Data Table: DT package

  • A wrapper of the JavaScript Library DataTables

  • Data objects in R can be rendered as HTML tables using the JavaScript library 'DataTables' (typically via R Markdown or Shiny).

code block
DT::datatable(exam_data, class= "compact")

2.3.2 Linked brushing: crosstalk method

Things to learn from the code chunk:

  • highlight() is a function of plotly package. It sets a variety of options for brushing (i.e., highlighting) multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget package via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.

  • bscols() is a helper function of the crosstalk package. It makes it easy to put HTML elements side by side. It can be called directly from the console but is especially designed to work in an R Markdown document. Warning: this will bring in all of Bootstrap!

code block
d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)        
Source Code
---
title: "Hands-on Exercise 3a"
subtitle: "Lesson 3: [Programming Interactive Data Visualisation with R](https://r4va.netlify.app/chap03)" 
author: "Victoria Neo"
date: 01/24/2024
date-modified: 01/25/2024
format:
  html:
    code-fold: true
    code-summary: "code block"
    code-tools: true
    code-copy: true
---

![](images/clipboard-3313650268.png){width="430"}

(ref:ah-ggplot) *Rawpixel.com/Shutterstock.com.* Naveen Neelakandan's [**Interactive Learning Content In eLearning: How Effective Is It?**](https://elearningindustry.com/interactive-learning-content-elearning-how-effective-is-it)

# Overview Summary

|                  |                                                                                                                                                                                                                                                 |
|--------------------------|----------------------------------------------|
| Work done        | Hands-on Exercise 3a                                                                                                                                                                                                                            |
| Hours taken      | ⏱️⏱️⏱️⏱️ (sick fil)                                                                                                                                                                                                                             |
| Questions        | 0                                                                                                                                                                                                                                               |
| How do I feel?   | 🚚😡                                                                                                                                                                                                                                            |
| What do I think? | This week's Hands-on Exercises were a little long as we still had our Take-home Exercise 2 to think about and work on. That aside, interactivity is very important because enabling the reader to explore helps to democratise data. |

# Getting Started

## 1: Data

### 1.1 Installing and loading the required libraries

::: {.codebox .code data-latex="code"}
The code chunk below uses p_load() of the pacman package to check whether the following R packages are installed on the computer. Any missing package is installed first, and then all of them are loaded into R.

-   [**ggiraph**](https://davidgohel.github.io/ggiraph/) for making 'ggplot' graphics interactive.

-   [**plotly**](https://plotly.com/r/), an R library for plotting interactive statistical graphs.

-   [**DT**](https://rstudio.github.io/DT/) provides an R interface to the JavaScript library [DataTables](https://datatables.net/), which creates interactive tables on HTML pages.

-   [**tidyverse**](https://www.tidyverse.org/), a family of modern R packages specially designed to support data science, analysis and communication tasks, including creating static statistical graphs.

-   [**patchwork**](https://patchwork.data-imaginist.com/) for combining multiple ggplot2 graphs into one figure.
:::

```{r}
pacman::p_load(ggiraph, plotly, 
               patchwork, DT, tidyverse,
               kableExtra) 
```

### 1.2 Data Set

::: callout-note
This section is taken from [Hands-on_Ex02](Hands-on_Ex/Hands-on_Ex01/Hands-on_Ex01.html) as we are using the same data set.
:::

The data set, *Exam_data.csv*, contains the year-end examination grades of a cohort of Primary 3 students from a local school, and is imported as **exam_data**.

#### 1.2.1 Importing exam_data

::: {.codebox .code data-latex="code"}
In the code chunk below, read_csv() of the readr package is used to import the Exam_data.csv data file into R and save it as a tibble data frame called exam_data.
:::

```{r}
exam_data <- read_csv("data/Exam_data.csv")
```

#### 1.2.2 Summary Statistic of exam_data

::: panel-tabset
## Data Table

### Displaying the first 5 rows of exam_data using `head()`:

```{r}
head(exam_data,5) %>%
  kbl() %>%
  kable_material()
```

## Data Structure

### Checking the structure of exam_data using `str()`:

```{r}
str(exam_data)
```

::: callout-note
There are 322 rows and 7 variables. The output reveals that the variables have been assigned their correct data types.
:::

## Data Health

### Checking for any symptoms of messy data:

#### 1. Checking for duplicates:

```{r}
exam_data[duplicated(exam_data),]
```

::: callout-note
There were no duplicated rows found in exam_data.
:::

#### 2. Checking missing values:

```{r}
sum(is.na(exam_data))
```

::: callout-note
There were no missing values found in exam_data.
:::

#### 3. Checking for String inconsistencies:

3.1 In CLASS

```{r}
unique(exam_data$CLASS)
```

3.2 In GENDER

```{r}
unique(exam_data$GENDER)
```

3.3 In RACE

```{r}
unique(exam_data$RACE)
```

::: callout-note
There were no string inconsistencies found in exam_data.
:::

#### 4. Checking for Data Irregularities:

4.1 In ENGLISH

```{r}
summary(exam_data$ENGLISH)
```

4.2 In MATHS

```{r}
summary(exam_data$MATHS)
```

4.3 In SCIENCE

```{r}
summary(exam_data$SCIENCE)
```

::: callout-note
There were no data irregularities found in exam_data.
:::
:::

# 2: Hands-on Exercise

## 2.1 Interactive Data Visualisation - ggiraph methods

::: {.kambox .kam data-latex="kam"}
#### What did Prof Kam say?

[ggiraph](https://davidgohel.github.io/ggiraph/) ![](https://r4va.netlify.app/chap03/img/image1.jpg){width="43"} is an html widget and a ggplot2 extension. It allows ggplot graphics to be interactive.

Interactivity is added through [**ggplot geometries**](https://davidgohel.github.io/ggiraph/reference/#section-interactive-geometries) that understand three arguments:

-   **Tooltip**: a column of the data set that contains the tooltips to be displayed when the mouse is over elements.

-   **Onclick**: a column of the data set that contains a JavaScript function to be executed when elements are clicked.

-   **Data_id**: a column of the data set that contains an id to be associated with elements.

If it is used within a shiny application, elements associated with an id (data_id) can be selected and manipulated on the client and server sides. Refer to this [article](https://davidgohel.github.io/ggiraph/articles/offcran/shiny.html) for a more detailed explanation.
:::

### 2.1.2 Tooltip

::: panel-tabset
## Tooltip effect with tooltip aesthetic

::: {.codebox .code data-latex="code"}
The code chunk below shows a typical way to plot an interactive statistical graph with the ggiraph package. Notice that the code chunk consists of two parts. First, an interactive version of a ggplot2 geom (i.e. geom_dotplot_interactive()) is used to create the basic graph. Then, girafe() is used to generate an SVG object to be displayed on an HTML page.
:::

```{r}
#| warning: false
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE, 
    binwidth = 1, 
    method = "histodot") +
  scale_y_continuous(NULL, 
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)
```

## Displaying multiple information on tooltip

::: {.codebox .code data-latex="code"}
The content of the tooltip can be customised by building a character vector, as shown in the code chunk below. The first three lines of code create a new field called tooltip and populate it with text from the ID and CLASS fields. This newly created field is then mapped to the tooltip aesthetic inside geom_dotplot_interactive().
:::

```{r}
#| warning: false
exam_data$tooltip <- paste0(
  "Name = ", exam_data$ID,
  "\n Class = ", exam_data$CLASS)

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = tooltip),  # refer to the column by name inside aes()
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(
  ggobj = p,
  width_svg = 8,
  height_svg = 8*0.618
)
```

## Customising Tooltip style

::: {.codebox .code data-latex="code"}
The code chunk below uses opts_tooltip() of ggiraph to customise tooltip rendering by adding CSS declarations. Notice that the background colour of the tooltip is white and the font is black and bold.
:::

```{r}
#| warning: false
  
# CSS declarations applied to the tooltip box (bold text needs font-weight)
tooltip_css <- "background-color:white; font-weight:bold; color:black;"

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = ID),                   
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(
    opts_tooltip(
      css = tooltip_css))
)
```

## Displaying statistics on tooltip

::: {.codebox .code data-latex="code"}
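The code chunk below shows an advanced way to customise the tooltip. In this example, a function is used to format the mean and its standard error. Note that mean_se() returns the mean plus and minus one standard error by default, which is not a 90% confidence interval. The derived statistics are then displayed in the tooltip.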
:::

```{r}
#| warning: false
  
tooltip <- function(y, ymax, accuracy = .01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean maths scores:", mean, "+/-", sem)
}

gg_point <- ggplot(data=exam_data, 
                   aes(x = RACE)) +
  stat_summary(aes(y = MATHS, 
                   tooltip = after_stat(  
                     tooltip(y, ymax))),  
    fun.data = mean_se,          # mean plus/minus one standard error
    geom = GeomInteractiveCol,  
    fill = "light blue"
  ) +
  stat_summary(aes(y = MATHS),
    fun.data = mean_se,
    # linewidth replaces size for lines in ggplot2 >= 3.4.0
    geom = "errorbar", width = 0.2, linewidth = 0.2
  )

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8*0.618)
```
:::

### 2.1.3 Hover

::: panel-tabset
## Hover effect with *data_id* aesthetic

::: {.codebox .code data-latex="code"}
The code chunk below shows the second interactive feature of ggiraph, namely data_id. Elements associated with a *data_id* (i.e. CLASS) will be highlighted on mouse-over.
:::

```{r}
#| warning: false
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(           
    aes(data_id = CLASS),             
    stackgroups = TRUE,               
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618                      
)        
```

## Styling hover effect

::: {.codebox .code data-latex="code"}
In the code chunk below, CSS is used to change the highlighting effect. Elements that share a *data_id* value (here, CLASS) are highlighted on mouse-over, while all other elements are dimmed.
:::

```{r}
#| warning: false
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)   
```

::: {.cautionbox .caution data-latex="caution"}
Note: Unlike the previous example, the CSS customisation in this example is encoded directly in the function call.
:::
:::
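
As an alternative to passing `options` inside `girafe()`, the same hover styling can be attached afterwards with `girafe_options()`. The sketch below is a minimal, self-contained illustration that assumes the ggplot2 and ggiraph packages are installed. The `toy` data frame is a hypothetical stand-in for `exam_data`.

```r
library(ggplot2)
library(ggiraph)

# Hypothetical toy data standing in for exam_data
toy <- data.frame(
  ID    = paste0("S", 1:6),
  CLASS = rep(c("3A", "3B"), each = 3),
  MATHS = c(55, 60, 60, 72, 80, 80)
)

p <- ggplot(toy, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(data_id = CLASS),      # dots of the same class highlight together
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL, breaks = NULL)

# Build the widget first, then attach the hover styling
g <- girafe(ggobj = p, width_svg = 6, height_svg = 6 * 0.618)
g <- girafe_options(
  g,
  opts_hover(css = "fill: #202020;"),      # style of the hovered group
  opts_hover_inv(css = "opacity:0.2;")     # style of everything else
)
g
```

Separating the two steps can be convenient when the same styling is reused across several widgets.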

### 2.1.4 Tooltip + Hover

::: panel-tabset
## Combining tooltip and hover effect

::: {.codebox .code data-latex="code"}
Elements that share a *data_id* value (here, CLASS) are highlighted on mouse-over. At the same time, the tooltip shows the CLASS.
:::

```{r}
#| warning: false
  
p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(tooltip = CLASS, 
        data_id = CLASS),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618,
  options = list(                        
    opts_hover(css = "fill: #202020;"),  
    opts_hover_inv(css = "opacity:0.2;") 
  )                                        
)      
```
:::

### 2.1.5 Click effect

::: panel-tabset
## Click effect with onclick

::: {.codebox .code data-latex="code"}
The `onclick` aesthetic of ggiraph provides hotlink interactivity on the web. On mouse click, the web page linked to the clicked data object opens in the browser.
:::

```{r}
#| warning: false
  
exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618) 
```

::: {.cautionbox .caution data-latex="caution"}
Note that click actions must be stored as a string column in the dataset, and each string must contain valid JavaScript instructions.
:::
:::
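
To see what the `onclick` column actually contains, the sketch below builds the JavaScript strings for two made-up IDs using base R only. The `ids` vector is hypothetical. The URL is the one used in the chunk above, and `sprintf()` is vectorised, so one string is produced per ID.

```r
# Hypothetical IDs, for illustration only
ids <- c("Student321", "Student305")

base_url <- "https://www.moe.gov.sg/schoolfinder?journey=Primary%20school"

# The URL is passed as an argument, not as the format string,
# so its "%20" is left untouched by sprintf()
onclick <- sprintf("window.open(\"%s%s\")", base_url, ids)

# cat() shows the raw JavaScript without R's escaping of the quotes
cat(onclick, sep = "\n")
```

Each element is a complete JavaScript call, and the ID is appended directly to the end of the URL.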

### 2.1.6 Coordinated Multiple Views

::: panel-tabset
## Coordinated Multiple Views with ggiraph

::: {.codebox .code data-latex="code"}
To build coordinated multiple views as shown in the example below, the following programming strategy is used:

-   Appropriate interactive functions of ggiraph are used to create the individual views.

-   The `+` operator of the patchwork package is used inside the `girafe()` function to combine them into interactive coordinated multiple views.

Notice that when a data point in one of the dotplots is hovered over, the data point with the same ID in the second visualisation is highlighted too.

The *data_id* aesthetic is critical for linking observations between plots. The tooltip aesthetic is optional but nice to have when hovering over a point.
:::

```{r}
#| warning: false
  
p1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p2 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p1 + p2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       )  
```
:::

## 2.2 **Interactive Data Visualisation - plotly methods!**

::: {.kambox .kam data-latex="kam"}
#### What did Prof Kam say?

Plotly's R graphing library creates interactive web graphics from ggplot2 graphs and/or through a custom interface to the (MIT-licensed) JavaScript library plotly.js, inspired by the grammar of graphics. Unlike other Plotly platforms, plotly.R is free and open source.

There are two ways to create an interactive graph with plotly:

-   by using `plot_ly()`, and
-   by using `ggplotly()`.
:::

### 2.2.1 Creating an interactive scatter plot: plot_ly() method

::: panel-tabset
## The plot

```{r}
#| echo: false 
#| warning: false
plot_ly(data = exam_data, 
             x = ~MATHS, 
             y = ~ENGLISH)
```

## The code

```{r}
#| warning: false 
#| code-fold: show
#| eval: false
plot_ly(data = exam_data, 
             x = ~MATHS, 
             y = ~ENGLISH)
```
:::
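
When `plot_ly()` is called without a trace type, plotly infers one from the data supplied. The sketch below states the choice explicitly with `type = "scatter"` and `mode = "markers"`. It assumes the plotly package is installed, and the `toy` data frame is a hypothetical stand-in for `exam_data`.

```r
library(plotly)

# Hypothetical toy data standing in for exam_data
toy <- data.frame(
  MATHS   = c(55, 60, 72, 80, 91),
  ENGLISH = c(58, 65, 70, 77, 88)
)

fig <- plot_ly(
  data = toy,
  x = ~MATHS,
  y = ~ENGLISH,
  type = "scatter",   # trace type stated explicitly
  mode = "markers"    # draw points only, no connecting lines
)
fig
```

Stating the trace type makes the intent clear to readers of the code and avoids relying on plotly's guess.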

### 2.2.2 Working with visual variable: plot_ly() method

::: {.codebox .code data-latex="code"}
In the code chunk below, the *color* argument is mapped to a qualitative variable (i.e. RACE).
:::

::: panel-tabset
## The plot

```{r}
#| echo: false 
#| warning: false
plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)
```

## The code

```{r}
#| warning: false 
#| code-fold: show
#| eval: false
plot_ly(data = exam_data, 
        x = ~ENGLISH, 
        y = ~MATHS, 
        color = ~RACE)
```
:::

### 2.2.3 Coordinated Multiple Views with plotly

::: {.codebox .code data-latex="code"}
Creating a coordinated linked plot with plotly involves three steps:

-   `highlight_key()` of the plotly package is used to wrap the data as shared data.

-   Two scatterplots are created by using ggplot2 functions.

-   Lastly, `subplot()` of the plotly package is used to place them side by side.

Things to learn from the code chunk:

-   `highlight_key()` simply creates an object of class [crosstalk::SharedData](https://rdrr.io/cran/crosstalk/man/SharedData.html).

-   Visit this [link](https://rstudio.github.io/crosstalk/) to learn more about crosstalk.
:::

::: panel-tabset
## The plot

```{r}
#| echo: false 
#| warning: false
d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))
```

## The code

```{r}
#| warning: false 
#| code-fold: show
#| eval: false
d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))
```
:::
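
To make the role of `highlight_key()` more concrete, the sketch below wraps a small data frame and inspects the resulting object. It assumes the plotly and crosstalk packages are installed. The `toy` data frame and the choice of `~ID` as the key are hypothetical, used for illustration only.

```r
library(plotly)

# Hypothetical toy data standing in for exam_data
toy <- data.frame(
  ID      = c("S1", "S2", "S3"),
  MATHS   = c(55, 72, 91),
  ENGLISH = c(58, 70, 88)
)

# Wrap the data; the key identifies each row across linked views
d <- highlight_key(toy, ~ID)

class(d)        # an R6 object that includes the "SharedData" class
d$key()         # the key values used to match rows between plots
d$origData()    # the original, unwrapped data frame
```

Because every linked plot receives the same `d`, a selection made in one plot can be matched to the same rows in the other through the key.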

## 2.3 **Interactive Data Visualisation - crosstalk methods!**

::: {.kambox .kam data-latex="kam"}
#### What did Prof Kam say?

Crosstalk is an add-on to the htmlwidgets package. It extends htmlwidgets with a set of classes, functions, and conventions for implementing cross-widget interactions (currently, linked brushing and filtering).
:::

### 2.3.1 Interactive Data Table: DT package

-   DT is a wrapper of the JavaScript library DataTables.

-   Data objects in R can be rendered as HTML tables using the JavaScript library 'DataTables' (typically via R Markdown or Shiny).

```{r}
DT::datatable(exam_data, class= "compact")
```
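
As a small variation, the sketch below passes a few commonly used arguments to `datatable()`: hiding row names, adding per-column filters, and limiting the page length. It assumes the DT package is installed, and the `toy` data frame is a hypothetical stand-in for `exam_data`.

```r
library(DT)

# Hypothetical toy data standing in for exam_data
toy <- data.frame(
  ID    = paste0("S", 1:8),
  CLASS = rep(c("3A", "3B"), each = 4),
  MATHS = c(55, 60, 60, 72, 80, 80, 91, 47)
)

tbl <- datatable(
  toy,
  class = "compact",               # tighter row spacing, as in the chunk above
  rownames = FALSE,                # hide the row-number column
  filter = "top",                  # add a filter box above each column
  options = list(pageLength = 5)   # show five rows per page
)
tbl
```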

### 2.3.2 Linked brushing: crosstalk method

::: {.codebox .code data-latex="code"}
Things to learn from the code chunk:

-   *highlight()* is a function of the **plotly** package. It sets a variety of options for brushing (i.e., highlighting) multiple plots. These options are primarily designed for linking multiple plotly graphs, and may not behave as expected when linking plotly to another htmlwidget package via crosstalk. In some cases, other htmlwidgets will respect these options, such as persistent selection in leaflet.

-   *bscols()* is a helper function of the **crosstalk** package. It makes it easy to put HTML elements side by side. It can be called directly from the console but is especially designed to work in an R Markdown document. **Warning:** This will bring in all of Bootstrap!
:::

::: panel-tabset
## The plot

```{r}
#| echo: false 
#| warning: false
d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)   
```

## The code

```{r}
#| warning: false 
#| code-fold: show
#| eval: false
d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)        
```
:::
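
The sketch below is a variation on the linked-brushing chunk that spells out the arguments by name: the `on` and `off` events of `highlight()` and one width per column in `bscols()`. It assumes the plotly, crosstalk and DT packages are installed. The `toy` data frame and the `~ID` key are hypothetical, and `plot_ly()` is used in place of `ggplotly()` only to keep the example short.

```r
library(plotly)
library(crosstalk)
library(DT)

# Hypothetical toy data standing in for exam_data
toy <- data.frame(
  ID      = paste0("S", 1:5),
  MATHS   = c(55, 60, 72, 80, 91),
  ENGLISH = c(58, 65, 70, 77, 88)
)

d <- highlight_key(toy, ~ID)

fig <- plot_ly(d, x = ~ENGLISH, y = ~MATHS,
               type = "scatter", mode = "markers")

# Brush with box/lasso select; clear the selection on deselect
fig <- highlight(fig, on = "plotly_selected", off = "plotly_deselect")

# Two equal-width columns out of the 12 available in the Bootstrap grid
page <- bscols(fig, datatable(d), widths = c(6, 6))
page
```

Selecting points in the scatterplot filters the table to the matching rows, since both widgets share `d`.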