When people ask me “How do I learn R?”, I always point them towards the excellent R for Data Science book. It’s freely available and I love the order of the chapters, starting with visualisation and tidy data before delving into the details of programming with R.
I had a go at the week 4 submission on Global Mortality data.
I decided to attempt to make some sparkline graphics. I used code from Dr Lukasz Piwek and his Tufte in R project. It’s a great project and I hope I get the chance to play with some of the other plots in the future.
I started off by looking at deaths from cardiovascular diseases.
I wrote some code which identified the 5 countries with the largest increase in the share of deaths from cardiovascular diseases, and also the 5 countries with the largest decrease. Each of these 10 countries was then plotted as a sparkline, with the minimum and maximum values highlighted.
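The selection step can be sketched roughly as follows. This is a minimal base R sketch rather than the code from the repository: the toy data, the years, the column names (`country`, `year`, `share`) and the helper `change_in_share()` are all assumptions made for illustration, and "change" is taken here to mean the share in the last year minus the share in the first year.

```r
# Toy stand-in for the mortality data; countries, years and column names
# are made up for illustration.
countries <- paste("Country", LETTERS[1:12])
mortality <- expand.grid(country = countries, year = 1990:2016,
                         stringsAsFactors = FALSE)

# Each country gets its own linear trend plus a shared wiggle (share in %).
slopes <- seq(-0.5, 0.6, length.out = length(countries))
mortality$share <- 30 +
  slopes[match(mortality$country, countries)] * (mortality$year - 1990) +
  0.3 * sin(mortality$year)

# Hypothetical helper: change in share between the first and last year,
# one row per country, sorted from largest increase to largest decrease.
change_in_share <- function(df) {
  first <- df[df$year == min(df$year), c("country", "share")]
  last  <- df[df$year == max(df$year), c("country", "share")]
  out <- merge(first, last, by = "country", suffixes = c("_first", "_last"))
  out$change <- out$share_last - out$share_first
  out[order(out$change, decreasing = TRUE), ]
}

changes  <- change_in_share(mortality)
top_up   <- head(changes$country, 5)       # 5 largest increases
top_down <- rev(tail(changes$country, 5))  # 5 largest decreases, biggest drop first
selected <- c(top_up, top_down)

# Rows holding the minimum and maximum share for each selected country,
# i.e. the points that would be highlighted on each sparkline.
sel <- mortality[mortality$country %in% selected, ]
extremes <- do.call(rbind, lapply(split(sel, sel$country), function(d) {
  d[c(which.min(d$share), which.max(d$share)), ]
}))
```

Comparing only the first and last year is one of several reasonable definitions of "largest increase"; a fitted slope or the max-minus-min range would pick out different countries on noisy series.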
I then wrapped this code as a function, and generated plots for all possible causes of death to look for interesting findings. Full code is available on GitHub.
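As an illustration of the wrap-and-loop idea, here is a small base graphics sketch. It is not the function from the repository: `plot_sparklines()`, the toy data and the column names (`country`, `year`, `cause`, `share`) are hypothetical, and for brevity it draws every country in the toy data instead of a top and bottom 5.

```r
# Toy long-format data; all names and values are made up for illustration.
toy <- expand.grid(
  country = c("A", "B", "C"),
  year = 1990:2016,
  cause = c("Cardiovascular diseases", "Cancers"),
  stringsAsFactors = FALSE
)
toy$share <- 20 + 5 * sin(toy$year / 4 + nchar(toy$cause) +
                            match(toy$country, c("A", "B", "C")))

# Hypothetical function: one sparkline per country for a single cause,
# with the minimum (blue) and maximum (red) values highlighted.
plot_sparklines <- function(df, cause_name) {
  d <- df[df$cause == cause_name, ]
  countries <- unique(d$country)
  op <- par(mfrow = c(length(countries), 1),
            mar = c(0.5, 6, 0.5, 2), oma = c(1, 0, 2, 0))
  on.exit(par(op))  # restore the graphics settings afterwards
  for (ctry in countries) {
    s <- d[d$country == ctry, ]
    s <- s[order(s$year), ]
    plot(s$year, s$share, type = "l", axes = FALSE,
         xlab = "", ylab = "", col = "grey40")
    lo <- which.min(s$share)
    hi <- which.max(s$share)
    points(s$year[c(lo, hi)], s$share[c(lo, hi)],
           pch = 19, col = c("blue", "red"))
    mtext(ctry, side = 2, las = 1, line = 1, cex = 0.8)
  }
  mtext(cause_name, side = 3, outer = TRUE, cex = 0.9)
  invisible(countries)
}

# Loop over every cause, writing one page per cause to a PDF.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 4)
for (cause_name in unique(toy$cause)) plot_sparklines(toy, cause_name)
dev.off()
```

Writing all causes to a single multi-page PDF makes it easy to flick through the plots when hunting for interesting patterns.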
As a quick plot, I’m fairly happy with this. However, as always, there are many ways to improve it.
- Looping over all possible causes of death generates some weird plots, especially when the percentages are small.
- The sparkline plots should probably be more information-rich and have less white space to make Tufte happy.
- My function to create the sparkline plots should probably be chunked into smaller functions.
Let me know what you think on Twitter. Suggestions/pull requests welcome!