Hands-on Exercise 4a

Lesson 4a: Visualising Distribution

Author

Victoria Neo

Published

February 6, 2024

Modified

February 6, 2024

*Taken from* Visualizing Distributions with Raincloud Plots (and How to Create Them with ggplot2)

Overview Summary

Work done	Hands-on Exercise 4a
Hours taken	⏱️ (hospitalisation leave) \| \|
Questions	0
How do I feel?	🫡
What do I think?	Hands-on Exercise 4a was okay as I had gone through it before for Take-home Exercise 1 and 2. I quite like ridgeline and raincloud plots as they are so pretty. Notes They do not look good with when flipped e.g. the ridgeline/raincloud plots are vertical instead of horizontal. Ridgeline plots do not allow geom_text. You need to annotate e.g. p1 + annotate(‘text’, …)

1 Overview Notes

Visualising distribution is not new in statistical analysis. In chapter 1 (Hands-on Exercise 1) we have shared with you some of the popular statistical graphics methods for visualising distribution are histogram, probability density curve (pdf), boxplot, notch plot and violin plot and how they can be created by using ggplot2. In this chapter, we are going to share with you two relatively new statistical graphic methods for visualising distribution, namely ridgeline plot and raincloud plot by using ggplot2 and its extensions.

2 Getting Started

2: Data

2.1 Installing and loading the required libraries

The code chunk below uses p_load() of pacman package to check if the following R packages are installed in the computer. If they are, then they will be launched into R.

tidyverse, a family of R packages for data science process,
ggridges, a ggplot2 extension specially designed for plotting ridgeline plots, and
ggdist for visualising distribution and uncertainty.

code block

pacman::p_load(ggdist, ggridges, ggthemes,
               colorspace, tidyverse)

2.2 Data Set

Note

This section is taken from Hands-on_Ex02 as we are using the same data set.

The data set, Exam_data.csv, contains the Year-end examination grades of a cohort of primary 3 students from a local school, and is uploaded as exam_data.

2.2.1 Importing exam_data

In the code chunk below, read_csv() of readr package is used to import Exam_data.csv data file into R and save it as an tibble data frame called exam_data.

code block

exam <- read_csv("data/Exam_data.csv")

3: Hands-on Exercise

3.1 Visualising Distribution with Ridgeline Plot

What did Prof Kam say?

Ridgeline plot (sometimes called Joyplot) is a data visualisation technique for revealing the distribution of a numeric value for several groups. Distribution can be represented using histograms or density plots, all aligned to the same horizontal scale and presented with a slight overlap.

Note

Ridgeline plots make sense when the number of group to represent is medium to high, and thus a classic window separation would take to much space. Indeed, the fact that groups overlap each other allows to use space more efficiently. If you have less than 5 groups, dealing with other distribution plots is probably better.
It works well when there is a clear pattern in the result, like if there is an obvious ranking in groups. Otherwise group will tend to overlap each other, leading to a messy plot not providing any insight.

3.1.1 Plotting ridgeline graph: ggridges method

ggridges package provides two main geom to plot gridgeline plots, they are: geom_ridgeline() and geom_density_ridges(). The former takes height values directly to draw the ridgelines, and the latter first estimates data densities and then draws those using ridgelines.

The ridgeline plot below is plotted by using geom_density_ridges().

The plot
The code chunk

code block

ggplot(exam, 
       aes(x = ENGLISH, 
           y = CLASS)) +
  geom_density_ridges(
    scale = 3,
    rel_min_height = 0.01,
    bandwidth = 3.4,
    fill = lighten("#7097BB", .3),
    color = "white"
  ) +
  scale_x_continuous(
    name = "English grades",
    expand = c(0, 0)
    ) +
  scale_y_discrete(name = NULL, expand = expansion(add = c(0.2, 2.6))) +
  theme_ridges()

3.1.2 Varying fill colors along the x axis

Sometimes we would like to have the area under a ridgeline not filled with a single solid color but rather with colors that vary in some form along the x axis. This effect can be achieved by using either geom_ridgeline_gradient() or geom_density_ridges_gradient(). Both geoms work just like geom_ridgeline() and geom_density_ridges(), except that they allow for varying fill colors. However, they do not allow for alpha transparency in the fill. For technical reasons, we can have changing fill colors or transparency but not both.

The plot
The code chunk

code block

ggplot(exam, 
       aes(x = ENGLISH, 
           y = CLASS,
           fill = stat(x))) +
  geom_density_ridges_gradient(
    scale = 3,
    rel_min_height = 0.01) +
  scale_fill_viridis_c(name = "Temp. [F]",
                       option = "C") +
  scale_x_continuous(
    name = "English grades",
    expand = c(0, 0)
  ) +
  scale_y_discrete(name = NULL, expand = expansion(add = c(0.2, 2.6))) +
  theme_ridges()

3.1.3 Mapping the probabilities directly onto colour

Beside providing additional geom objects to support the need to plot ridgeline plot, ggridges package also provides a stat function called stat_density_ridges() that replaces stat_density() of ggplot2.

Figure below is plotted by mapping the probabilities calculated by using stat(ecdf) which represent the empirical cumulative density function for the distribution of English score.

The plot
The code chunk

code block

ggplot(exam,
       aes(x = ENGLISH, 
           y = CLASS, 
           fill = 0.5 - abs(0.5-stat(ecdf)))) +
  stat_density_ridges(geom = "density_ridges_gradient", 
                      calc_ecdf = TRUE) +
  scale_fill_viridis_c(name = "Tail probability",
                       direction = -1) +
  theme_ridges()

What did Prof Kam say?

It is important include the argument calc_ecdf = TRUE in stat_density_ridges().

3.1.4 Ridgeline plots with quantile lines

By using geom_density_ridges_gradient(), we can colour the ridgeline plot by quantile, via the calculated stat(quantile) aesthetic as shown in the figure below.

The plot
The code chunk

code block

ggplot(exam,
       aes(x = ENGLISH, 
           y = CLASS, 
           fill = factor(stat(quantile))
           )) +
  stat_density_ridges(
    geom = "density_ridges_gradient",
    calc_ecdf = TRUE, 
    quantiles = 4,
    quantile_lines = TRUE) +
  scale_fill_viridis_d(name = "Quartiles") +
  theme_ridges()

Instead of using number to define the quantiles, we can also specify quantiles by cut points such as 2.5% and 97.5% tails to colour the ridgeline plot as shown in the figure below.

The plot
The code chunk

code block

ggplot(exam,
       aes(x = ENGLISH, 
           y = CLASS, 
           fill = factor(stat(quantile))
           )) +
  stat_density_ridges(
    geom = "density_ridges_gradient",
    calc_ecdf = TRUE, 
    quantiles = c(0.025, 0.975)
    ) +
  scale_fill_manual(
    name = "Probability",
    values = c("#FF0000A0", "#A0A0A0A0", "#0000FFA0"),
    labels = c("(0, 0.025]", "(0.025, 0.975]", "(0.975, 1]")
  ) +
  theme_ridges()

3.2 Visualising Distribution with Raincloud Plot

What did Prof Kam say?

Raincloud Plot is a data visualisation techniques that produces a half-density to a distribution plot. It gets the name because the density plot is in the shape of a “raincloud”. The raincloud (half-density) plot enhances the traditional box-plot by highlighting multiple modalities (an indicator that groups may exist). The boxplot does not show where densities are clustered, but the raincloud plot does!

In this section, you will learn how to create a raincloud plot to visualise the distribution of English score by race. It will be created by using functions provided by ggdist and ggplot2 packages.

3.2.1 Plotting a Half Eye graph

First, we will plot a Half-Eye graph by using stat_halfeye() of ggdist package.

This produces a Half Eye visualization, which is contains a half-density and a slab-interval.

The plot
The code chunk

code block

ggplot(exam, 
       aes(x = RACE, 
           y = ENGLISH)) +
  stat_halfeye(adjust = 0.5,
               justification = -0.2,
               .width = 0,
               point_colour = NA)

Warning!

Things to learn from the code chunk above

We remove the slab interval by setting .width = 0 and point_colour = NA.

3.2.2 Adding the boxplot with `geom_boxplot()`

Next, we will add the second geometry layer using geom_boxplot() of ggplot2. This produces a narrow boxplot. We reduce the width and adjust the opacity.

The plot
The code chunk

code block

ggplot(exam, 
       aes(x = RACE, 
           y = ENGLISH)) +
  stat_halfeye(adjust = 0.5,
               justification = -0.2,
               .width = 0,
               point_colour = NA) +
  geom_boxplot(width = .20,
               outlier.shape = NA)

3.2.3 Adding the Dot Plots with `stat_dots()`

Next, we will add the third geometry layer using stat_dots() of ggdist package. This produces a half-dotplot, which is similar to a histogram that indicates the number of samples (number of dots) in each bin. We select side = “left” to indicate we want it on the left-hand side.

The plot
The code chunk

code block

ggplot(exam, 
       aes(x = RACE, 
           y = ENGLISH)) +
  stat_halfeye(adjust = 0.5,
               justification = -0.2,
               .width = 0,
               point_colour = NA) +
  geom_boxplot(width = .20,
               outlier.shape = NA) +
  stat_dots(side = "left", 
            justification = 1.2, 
            binwidth = .5,
            dotsize = 2)

3.2.4 Finishing touch

Lastly, coord_flip() of ggplot2 package will be used to flip the raincloud chart horizontally to give it the raincloud appearance. At the same time, theme_economist() of ggthemes package is used to give the raincloud chart a professional publishing standard look.

The plot
The code chunk

code block

ggplot(exam, 
       aes(x = RACE, 
           y = ENGLISH)) +
  stat_halfeye(adjust = 0.5,
               justification = -0.2,
               .width = 0,
               point_colour = NA) +
  geom_boxplot(width = .20,
               outlier.shape = NA) +
  stat_dots(side = "left", 
            justification = 1.2, 
            binwidth = .5,
            dotsize = 1.5) +
  coord_flip() +
  theme_economist()

--- title: "Hands-on Exercise 4a" subtitle: "Lesson 4a: [Visualising Distribution](https://r4va.netlify.app/chap09)" author: "Victoria Neo" date: 02/6/2024 date-modified: last-modified format: html: code-fold: true code-summary: "code block" code-tools: true code-copy: true execute: warning: false --- ![*Taken from* [Visualizing Distributions with Raincloud Plots (and How to Create Them with ggplot2)](https://www.cedricscherer.com/2021/06/06/visualizing-distributions-with-raincloud-plots-and-how-to-create-them-with-ggplot2/)](images/clipboard-1510941292.png){fig-align="left" width="354" height="222"} # Overview Summary +------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Work done | Hands-on Exercise 4a | +------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Hours taken | ⏱️ (hospitalisation leave) \| | +------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Questions | 0 | +------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | How do I feel? | 🫡 | +------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ | What do I think? | Hands-on Exercise 4a was okay as I had gone through it before for Take-home Exercise 1 and 2. I quite like ridgeline and raincloud plots as they are so pretty. | | | | | | **Notes** | | | | | | - They do not look good with when flipped e.g. the ridgeline/raincloud plots are vertical instead of horizontal. | | | | | | - Ridgeline plots do not allow geom_text. You need to annotate e.g. p1 + annotate('text', ...) | +------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------+ # 1 Overview Notes Visualising distribution is not new in statistical analysis. In chapter 1 ([Hands-on Exercise 1](https://iss608-vaa-victorianeo.netlify.app/hands-on_ex/hands-on_ex01/hands-on_ex01)) we have shared with you some of the popular statistical graphics methods for visualising distribution are histogram, probability density curve (pdf), boxplot, notch plot and violin plot and how they can be created by using ggplot2. In this chapter, we are going to share with you two relatively new statistical graphic methods for visualising distribution, namely ridgeline plot and raincloud plot by using ggplot2 and its extensions. # 2 Getting Started ## 2: Data ### 2.1 Installing and loading the required libraries The code chunk below uses p_load() of pacman package to check if the following R packages are installed in the computer. If they are, then they will be launched into R. - tidyverse, a family of R packages for data science process, - ggridges, a ggplot2 extension specially designed for plotting ridgeline plots, and - ggdist for visualising distribution and uncertainty. ```{r} pacman::p_load(ggdist, ggridges, ggthemes, colorspace, tidyverse) ``` ### 2.2 Data Set ::: callout-note This section is taken from [Hands-on_Ex02](Hands-on_Ex/Hands-on_Ex01/Hands-on_Ex01.html) as we are using the same data set. ::: The data set, *Exam_data.csv,* contains the Year-end examination grades of a cohort of primary 3 students from a local school, and is uploaded as **exam_data**. #### 2.2.1 Importing exam_data In the code chunk below, read_csv() of readr package is used to import Exam_data.csv data file into R and save it as an tibble data frame called exam_data. ```{r} exam <- read_csv("data/Exam_data.csv") ``` # 3: Hands-on Exercise ## 3.1 Visualising Distribution with Ridgeline Plot ::: {.kambox .kam data-latex="kam"} #### What did Prof Kam say? *Ridgeline plot* (sometimes called *Joyplot*) is a data visualisation technique for revealing the distribution of a numeric value for several groups. Distribution can be represented using histograms or density plots, all aligned to the same horizontal scale and presented with a slight overlap. **Note** - Ridgeline plots make sense when the number of group to represent is medium to high, and thus a classic window separation would take to much space. Indeed, the fact that groups overlap each other allows to use space more efficiently. If you have less than 5 groups, dealing with other distribution plots is probably better. - It works well when there is a clear pattern in the result, like if there is an obvious ranking in groups. Otherwise group will tend to overlap each other, leading to a messy plot not providing any insight. ::: ### 3.1.1 **Plotting ridgeline graph: ggridges method** ggridges package provides two main geom to plot gridgeline plots, they are: [`geom_ridgeline()`](https://wilkelab.org/ggridges/reference/geom_ridgeline.html) and [`geom_density_ridges()`](https://wilkelab.org/ggridges/reference/geom_density_ridges.html). The former takes height values directly to draw the ridgelines, and the latter first estimates data densities and then draws those using ridgelines. The ridgeline plot below is plotted by using `geom_density_ridges()`. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = ENGLISH, y = CLASS)) + geom_density_ridges( scale = 3, rel_min_height = 0.01, bandwidth = 3.4, fill = lighten("#7097BB", .3), color = "white" ) + scale_x_continuous( name = "English grades", expand = c(0, 0) ) + scale_y_discrete(name = NULL, expand = expansion(add = c(0.2, 2.6))) + theme_ridges() ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = ENGLISH, y = CLASS)) + geom_density_ridges( scale = 3, rel_min_height = 0.01, bandwidth = 3.4, fill = lighten("#7097BB", .3), color = "white" ) + scale_x_continuous( name = "English grades", expand = c(0, 0) ) + scale_y_discrete(name = NULL, expand = expansion(add = c(0.2, 2.6))) + theme_ridges() ``` ::: ### 3.1.2 Varying fill colors along the x axis Sometimes we would like to have the area under a ridgeline not filled with a single solid color but rather with colors that vary in some form along the x axis. This effect can be achieved by using either `geom_ridgeline_gradient()` or `geom_density_ridges_gradient()`. Both geoms work just like `geom_ridgeline()` and `geom_density_ridges()`, except that they allow for varying fill colors. However, they do not allow for alpha transparency in the fill. For technical reasons, we can have changing fill colors or transparency but not both. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = stat(x))) + geom_density_ridges_gradient( scale = 3, rel_min_height = 0.01) + scale_fill_viridis_c(name = "Temp. [F]", option = "C") + scale_x_continuous( name = "English grades", expand = c(0, 0) ) + scale_y_discrete(name = NULL, expand = expansion(add = c(0.2, 2.6))) + theme_ridges() ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = stat(x))) + geom_density_ridges_gradient( scale = 3, rel_min_height = 0.01) + scale_fill_viridis_c(name = "Temp. [F]", option = "C") + scale_x_continuous( name = "English grades", expand = c(0, 0) ) + scale_y_discrete(name = NULL, expand = expansion(add = c(0.2, 2.6))) + theme_ridges() ``` ::: ### 3.1.3 Mapping the probabilities directly onto colour Beside providing additional geom objects to support the need to plot ridgeline plot, ggridges package also provides a stat function called `stat_density_ridges()` that `replaces stat_density()` of ggplot2. Figure below is plotted by mapping the probabilities calculated by using `stat(ecdf)` which represent the empirical cumulative density function for the distribution of English score. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = 0.5 - abs(0.5-stat(ecdf)))) + stat_density_ridges(geom = "density_ridges_gradient", calc_ecdf = TRUE) + scale_fill_viridis_c(name = "Tail probability", direction = -1) + theme_ridges() ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = 0.5 - abs(0.5-stat(ecdf)))) + stat_density_ridges(geom = "density_ridges_gradient", calc_ecdf = TRUE) + scale_fill_viridis_c(name = "Tail probability", direction = -1) + theme_ridges() ``` ::: ::: {.kambox .kam data-latex="kam"} #### What did Prof Kam say? It is important include the argument `calc_ecdf = TRUE` in `stat_density_ridges()`. ::: ### 3.1.4 Ridgeline plots with quantile lines By using `geom_density_ridges_gradient()`, we can colour the ridgeline plot by quantile, via the calculated `stat(quantile)` aesthetic as shown in the figure below. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = factor(stat(quantile)) )) + stat_density_ridges( geom = "density_ridges_gradient", calc_ecdf = TRUE, quantiles = 4, quantile_lines = TRUE) + scale_fill_viridis_d(name = "Quartiles") + theme_ridges() ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = factor(stat(quantile)) )) + stat_density_ridges( geom = "density_ridges_gradient", calc_ecdf = TRUE, quantiles = 4, quantile_lines = TRUE) + scale_fill_viridis_d(name = "Quartiles") + theme_ridges() ``` ::: Instead of using number to define the quantiles, we can also specify quantiles by cut points such as 2.5% and 97.5% tails to colour the ridgeline plot as shown in the figure below. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = factor(stat(quantile)) )) + stat_density_ridges( geom = "density_ridges_gradient", calc_ecdf = TRUE, quantiles = c(0.025, 0.975) ) + scale_fill_manual( name = "Probability", values = c("#FF0000A0", "#A0A0A0A0", "#0000FFA0"), labels = c("(0, 0.025]", "(0.025, 0.975]", "(0.975, 1]") ) + theme_ridges() ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = ENGLISH, y = CLASS, fill = factor(stat(quantile)) )) + stat_density_ridges( geom = "density_ridges_gradient", calc_ecdf = TRUE, quantiles = c(0.025, 0.975) ) + scale_fill_manual( name = "Probability", values = c("#FF0000A0", "#A0A0A0A0", "#0000FFA0"), labels = c("(0, 0.025]", "(0.025, 0.975]", "(0.975, 1]") ) + theme_ridges() ``` ::: ## 3.2 Visualising Distribution with Raincloud Plot ::: {.kambox .kam data-latex="kam"} #### What did Prof Kam say? Raincloud Plot is a data visualisation techniques that produces a half-density to a distribution plot. It gets the name because the density plot is in the shape of a “raincloud”. The raincloud (half-density) plot enhances the traditional box-plot by highlighting multiple modalities (an indicator that groups may exist). The boxplot does not show where densities are clustered, but the raincloud plot does! ::: In this section, you will learn how to create a raincloud plot to visualise the distribution of English score by race. It will be created by using functions provided by ggdist and ggplot2 packages. ### 3.2.1 **Plotting a Half Eye graph** First, we will plot a Half-Eye graph by using `stat_halfeye()` of ggdist package. This produces a Half Eye visualization, which is contains a half-density and a slab-interval. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) ``` ::: {.cautionbox .caution data-latex="caution"} #### Warning! Things to learn from the code chunk above We remove the slab interval by setting .width = 0 and point_colour = NA. ::: ::: ### 3.2.2 Adding the boxplot with `geom_boxplot()` Next, we will add the second geometry layer using `geom_boxplot()` of ggplot2. This produces a narrow boxplot. We reduce the width and adjust the opacity. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) + geom_boxplot(width = .20, outlier.shape = NA) ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) + geom_boxplot(width = .20, outlier.shape = NA) ``` ::: ### 3.2.3 Adding the Dot Plots with `stat_dots()` Next, we will add the third geometry layer using `stat_dots()` of ggdist package. This produces a half-dotplot, which is similar to a histogram that indicates the number of samples (number of dots) in each bin. We select side = “left” to indicate we want it on the left-hand side. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) + geom_boxplot(width = .20, outlier.shape = NA) + stat_dots(side = "left", justification = 1.2, binwidth = .5, dotsize = 2) ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) + geom_boxplot(width = .20, outlier.shape = NA) + stat_dots(side = "left", justification = 1.2, binwidth = .5, dotsize = 2) ``` ::: ### 3.2.4 Finishing touch Lastly, `coord_flip()` of ggplot2 package will be used to flip the raincloud chart horizontally to give it the raincloud appearance. At the same time, theme_economist() of ggthemes package is used to give the raincloud chart a professional publishing standard look. ::: panel-tabset ## The plot ```{r} #| echo: false #| warning: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) + geom_boxplot(width = .20, outlier.shape = NA) + stat_dots(side = "left", justification = 1.2, binwidth = .5, dotsize = 1.5) + coord_flip() + theme_economist() ``` ## The code chunk ```{r} #| warning: false #| code-fold: show #| eval: false ggplot(exam, aes(x = RACE, y = ENGLISH)) + stat_halfeye(adjust = 0.5, justification = -0.2, .width = 0, point_colour = NA) + geom_boxplot(width = .20, outlier.shape = NA) + stat_dots(side = "left", justification = 1.2, binwidth = .5, dotsize = 1.5) + coord_flip() + theme_economist() ``` :::

Overview Summary

1 Overview Notes

2 Getting Started

2: Data

2.1 Installing and loading the required libraries

2.2 Data Set

2.2.1 Importing exam_data

3: Hands-on Exercise

3.1 Visualising Distribution with Ridgeline Plot

What did Prof Kam say?

3.1.1 Plotting ridgeline graph: ggridges method

3.1.2 Varying fill colors along the x axis

3.1.3 Mapping the probabilities directly onto colour

What did Prof Kam say?

3.1.4 Ridgeline plots with quantile lines

3.2 Visualising Distribution with Raincloud Plot

What did Prof Kam say?

3.2.1 Plotting a Half Eye graph

Warning!

3.2.2 Adding the boxplot with geom_boxplot()

3.2.3 Adding the Dot Plots with stat_dots()

3.2.4 Finishing touch

3.2.2 Adding the boxplot with `geom_boxplot()`

3.2.3 Adding the Dot Plots with `stat_dots()`