Week 7 - Data visualisation
Data visualization with ggplot2
We have almost reached the end of this IOC, and you've laid the groundwork with the fundamental concepts. Let's spice up your skill set with ggplot2!
ggplot2 is the magic wand for creating cool charts and graphs; it introduces you to the artistry of crafting meaningful and expressive visualizations.
It's not just about stats; it's about making your data speak visually. So, gear up to add that extra flair to your reports and impress your audience with data storytelling!
Unlock the power of data visualization by using the ggplot2 page in the reference manual.
Let's Practice
Each week, you'll have a set of exercises to solve in an R script.
After that, you need to complete the following Google Form, which contains some MCQ (Multiple Choice Questions); the final question asks you to upload your R script.
Please note that an R script has the extension .R, which Google Forms does not support.
To work around this, append the .txt extension so that your file is named NAME_week7_script.R.txt.
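If you prefer to do the renaming from R, the following is a minimal sketch. It assumes your script sits in the current working directory; NAME_week7_script.R is the placeholder from the instructions above, to be replaced with your own name.

```r
# Placeholder file name: replace NAME with your own name
script_path <- "NAME_week7_script.R"

# Append .txt so that the form accepts the upload
upload_path <- paste0(script_path, ".txt")

# Copy rather than rename, so the original .R file stays usable in your editor
if (file.exists(script_path)) {
  file.copy(script_path, upload_path, overwrite = TRUE)
}
```

Renaming the file by hand in your file explorer works just as well.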
Do it yourself!
Basic exercises with diamonds
Let's play with the diamonds dataset provided in the ggplot2 package; it contains the prices of more than 50,000 round-cut diamonds, with 10 variables.
Use ?diamonds to get the full description and str(diamonds) to get a glimpse of the data structure.
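As a quick check before starting, the inspection step can be run in a fresh session. This sketch assumes ggplot2 is installed; the dim() and levels() calls are extra illustrations that the exercises do not require.

```r
library(ggplot2)  # the diamonds dataset ships with this package

# In an interactive session, ?diamonds opens the help page for the dataset

# Compact overview of the columns and their types
str(diamonds)

# Number of rows and columns (extra check)
diamonds_dim <- dim(diamonds)
print(diamonds_dim)

# cut, color and clarity are ordered factors; the levels show the quality scale
cut_levels <- levels(diamonds$cut)
print(cut_levels)
```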
- 1. Create a plot to visualize the price and the carat, colored by the quality of the cut.
- 2. Change the shape and the size of the points.
- 3. Create a histogram of price by the diamonds' color.
- 4. Make the bars in the histogram side by side.
- 5. Do the same figure but only for diamonds with prices higher than $10,000.
- 6. Draw a density plot of prices grouped by clarity.
- 7. Visualize the diamonds' carat and width (y), colored by clarity, and use color as the facet.
- 8. Add a second facet for the cut, and make the scales vary across both columns and rows.
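If you need a reminder of the building blocks, here is a short sketch on a different dataset (mpg, also shipped with ggplot2), so the exercises above remain yours to solve. The variable choices and the object names (p_scatter, p_hist, p_density, p_facets) are only illustrative, not a model answer.

```r
library(ggplot2)

# Scatter plot with a colour mapping, plus a custom point shape and size
p_scatter <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point(shape = 17, size = 2)

# Histogram split by a grouping variable, with the bars placed side by side
p_hist <- ggplot(mpg, aes(x = hwy, fill = drv)) +
  geom_histogram(bins = 20, position = "dodge")

# Density plot on a filtered subset of the rows
p_density <- ggplot(subset(mpg, hwy > 20), aes(x = hwy, colour = drv)) +
  geom_density()

# Two facetting variables; scales = "free" lets the axes vary across rows and columns
p_facets <- p_scatter +
  facet_grid(rows = vars(cyl), cols = vars(year), scales = "free")

print(p_facets)
```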
Bonus for heatmap
- 9. Use the previously built p_heatmap from the ggplot2 reference and try to add a clustering tree (dendrogram) to the figure.
Hints
- We first need data for the dendrogram: think about what you will use to build it.
- Then plot the dendrogram: the R package {ggdendro} can help you draw the dendrogram data as a ggplot with geom_segment().
- Third, how do you add the dendrogram? The R package {patchwork} is a simple and useful way to combine multiple ggplots (imagine we cut the plane into 4 parts: top-left for the sample-level dendrogram, top-right remains empty, bottom-left for the heatmap, bottom-right for the gene-level dendrogram).
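The three hints can be sketched on a toy example. This is only an illustration under stated assumptions: a small random matrix (toy_mat, hypothetical) stands in for the data behind p_heatmap, Euclidean distance with the default hclust() linkage is an arbitrary choice, and the {ggdendro} and {patchwork} packages are assumed to be installed.

```r
library(ggplot2)
library(ggdendro)
library(patchwork)

# Toy data (assumption): 8 genes x 5 samples of random values
set.seed(1)
toy_mat <- matrix(rnorm(40), nrow = 8,
                  dimnames = list(paste0("gene", 1:8), paste0("sample", 1:5)))

# Hint 1: the dendrograms come from hierarchical clustering of rows and columns
hc_genes   <- hclust(dist(toy_mat))
hc_samples <- hclust(dist(t(toy_mat)))

# Hint 2: convert each tree to segment coordinates and draw them with geom_segment()
seg_samples <- segment(dendro_data(hc_samples, type = "rectangle"))
seg_genes   <- segment(dendro_data(hc_genes, type = "rectangle"))

p_dendro_samples <- ggplot(seg_samples) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  theme_void()

# Swap x and y so the gene tree grows sideways, with its leaves next to the heatmap
p_dendro_genes <- ggplot(seg_genes) +
  geom_segment(aes(x = y, y = x, xend = yend, yend = xend)) +
  theme_void()

# Long-format data for the heatmap, with both axes ordered as in the trees
toy_long <- as.data.frame(as.table(toy_mat))
names(toy_long) <- c("gene", "sample", "value")
toy_long$gene   <- factor(toy_long$gene, levels = hc_genes$labels[hc_genes$order])
toy_long$sample <- factor(toy_long$sample, levels = hc_samples$labels[hc_samples$order])

p_toy_heatmap <- ggplot(toy_long, aes(x = sample, y = gene, fill = value)) +
  geom_tile()

# Hint 3: 2 x 2 layout, filled row by row (top-left, top-right, bottom-left, bottom-right)
p_combined <- p_dendro_samples + plot_spacer() + p_toy_heatmap + p_dendro_genes +
  plot_layout(ncol = 2, widths = c(4, 1), heights = c(1, 4))

print(p_combined)
```

The leaves of the trees and the tiles of the heatmap are only approximately aligned here; tuning the axis expansion of each panel is left as part of the exercise.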
Please follow the best practices for your R script; we will pay close attention to them!
Now you can fill the following quiz: Quiz of week 7.
Thank you for your attention, and see you next week!