TOP R LANGUAGE RESOURCES TO IMPROVE YOUR DATA SKILLS
Learn R language basics
To build on those beginner skills, R for Data Science gives readers a firm grounding in basic aspects of data analysis, from import and cleaning to visualizing and modeling. Authors Hadley Wickham and Garrett Grolemund both work at RStudio, Wickham as chief scientist and Grolemund as master instructor. Wickham is well known for his suite of R packages dubbed the “tidyverse,” and this book is designed for those who want to use tidyverse packages such as dplyr and purrr.
I can recommend at least two other general books for expanding a beginner’s knowledge: R for Everyone by Jared P. Lander and Sams Teach Yourself R in 24 Hours by three Mango Solutions consultants. R for Everyone is a smaller volume that’s focused a bit more on statistics, with sections on topics like t-tests, ANOVA, Poisson regression and survival analysis. Teach Yourself R is the broadest of the three, ranging from discussions of R class systems to the Shiny Web framework. (Disclaimer: I’m writing an R book for publisher Taylor & Francis due out late this year or early 2019.)
If you are interested in using R to manipulate character strings, check out Gaston Sanchez’s free online book Handling Strings With R. An earlier PDF version has been updated to include both base R and the stringr package, and also has sections on regular expressions.
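As a small taste of what string handling in R looks like, here is a minimal sketch using only base R functions and a regular expression. The sample email addresses and variable names are made up for illustration and do not come from the book. The stringr package offers similar functions with more consistent names, such as str_detect() and str_replace().

```r
# Hypothetical sample data for illustration only
emails <- c("ana@example.com", "not-an-email", "bob@example.org")

# grepl() returns TRUE for each element that matches the pattern
is_email <- grepl("^[^@ ]+@[^@ ]+\\.[a-z]+$", emails)

# sub() replaces the first match; here it strips everything up to the @
domains <- sub("^.*@", "", emails[is_email])

# nchar() counts the characters in each string
n_chars <- nchar(domains)
```

The pattern here is deliberately simple and is not a complete email validator.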
Interactive learning company Datacamp offers a few free classes, although most require a monthly or yearly paid subscription. The platform features an R cloud implementation, so students can do exercises and get immediate feedback to see if their code is correct. The Introduction to R course, estimated to take four hours, is free.
I’ve heard some good things about the R package swirl. This is another interactive option, but on your own system, with several courses to choose from that were designed for the platform.
Stack Overflow has long been a programmers’ go-to source for asking questions; it has an active R community. To search for answers before posting your own query, make sure to use the [r] tag. There are other, more specific R-related tags there, too, such as
RStudio launched its own community, which is geared toward issues surrounding RStudio-created packages and other RStudio software. There’s also a category for general questions. Responses tend to be a bit less harsh than at Stack Overflow for newbies making rookie errors.
The R for Data Science Slack community mentioned above is also a good place to ask questions. There are a lot of channels in that Slack, so it helps to read up on what each is for so you know where best to post your query.
While it’s tough to use Twitter to get coding help, it can be a good place to ask questions such as, “Does anyone know of a package that will…”. Make sure to use the #rstats hashtag. LinkedIn and Google+ also have fairly active R groups where questions are regularly asked and answered.
Visualize your data
My ggplot2 cheat sheet is a sortable table, searchable by tasks like coloring by category or rotating x-axis labels. The cheat sheet article includes downloadable ggplot2 RStudio code snippets, offering ready-to-use, fill-in-the-placeholder code for a variety of ggplot2 tasks.
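For a sense of the two tasks mentioned above, coloring by category and rotating x-axis labels, here is a minimal sketch. It is not taken from the cheat sheet; it assumes the ggplot2 package is installed and uses mpg, a sample data set that ships with ggplot2.

```r
library(ggplot2)

# mpg is a sample data set included with ggplot2
p <- ggplot(mpg, aes(x = manufacturer, y = hwy, color = class)) +
  geom_point() +  # color = class assigns one color per vehicle class
  # Rotate the x-axis labels 45 degrees and right-align them
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

# Typing p at the console (or calling print(p)) draws the plot
```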
The R Graph Catalog features lots of graph and other plot examples, easily searchable and each with downloadable code. All are made with ggplot2 based on visualization ideas in Creating More Effective Graphs. The catalog is maintained by Joanna Zhao and Jennifer Bryan.
Beautiful Plotting in R: A ggplot2 Cheatsheet by Zev Ross is easy to read with a lot of useful information, from starting with default plots to customizing titles, axes and legends, creating multi-panel plots and more. Although a couple of years old now, it still has a lot of useful code.
Top 50 ggplot2 visualizations – Master list (with full R code) by Selva Prabhakaran breaks down plots by data analysis types, such as correlation, deviation, ranking or distribution. This is a good page to bookmark if you’d like to check samples for everything from scatter plots and lollipop charts to waffle charts, time series and maps.
ggplot2 has become an extensible platform, not just a package. There’s a gallery of registered extensions if you’d like to see what additional capabilities are available.
For those who’d like to use base R graphics instead of ggplot2, Nathan Yau of the Flowing Data blog has an excellent tutorial including downloadable code: Getting Started with Charts in R.
If you are interested in mapping with R, I posted a tutorial, Create maps in R in 10 (fairly) easy steps, that covers both static and interactive maps.
And, to search for “html widget” packages that generate interactive graphics, check out the html widget gallery.
Advance your skills
RStudio has hosted dozens of webinars on a wide variety of topics for varied skill levels. On-demand replays are available at the RStudio website’s resource area, including some recordings from the annual rstudio::conf event.
RStudio also has posted a number of PDF cheat sheets for various packages and tasks. All are available for free download.
If you’re at all interested in learning the relatively new tidyverse package purrr, I highly recommend Charlotte Wickham’s purrr tutorial from the 2017 useR! international R user conference or Happy R Users purrr from the 2017 RStudio Conference.
For techniques on how to analyze text, check out the free online book Text Mining With R by Julia Silge and David Robinson, authors of the tidytext R package. There’s also a longer, more in-depth version available on Amazon.
As mentioned above, Datacamp is a source for learning R, and not just for beginners. It has a range of course offerings on subjects spanning general R to specifics such as machine learning and time series forecasting. For most classes, you’ll need a paid subscription.
R Markdown makes it easy to combine text and R code as well as output to multiple formats such as HTML, PDF and Word. The RStudio R Markdown website features tutorials and a gallery of outputs and formats. In addition, R for Data Science has a fairly extensive chapter on R Markdown formats.
And, if you’re interested in using TensorFlow with R, it would be well worth your time to watch J.J. Allaire’s keynote about TensorFlow at the 2018 RStudio conference. In that talk, he recommended his book Deep Learning with R, co-authored with Francois Chollet, for those interested in diving in with R (and not interested in reading about TensorFlow’s high-level mathematical concepts). RStudio also has a section of its site devoted to TensorFlow for R.
Keep up with new developments
I often tweet important and interesting news about R. If you’re on Twitter, you can follow me at @sharon000. RStudio’s Mara Averick @dataandme and Microsoft’s David Smith @revodavid are two other accounts worth following for R news.
Want to find incredibly useful functions and packages? I periodically update these two lists: Great R packages for data import, wrangling & visualization and Useful R functions you might not know.
For short video tips and tricks on things like RStudio code snippets and dplyr’s case_when() function, check out my Do More With R screencast series.
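If you haven’t seen case_when() before, here is a minimal sketch of what it does. This example is not from the screencast; it assumes the dplyr package is installed, and the scores vector is made up for illustration. Conditions are checked in order, and the first one that is TRUE decides the result for each element.

```r
library(dplyr)

# Hypothetical sample data for illustration only
scores <- c(95, 72, 58, NA)

grade <- case_when(
  is.na(scores) ~ "missing",  # handle missing values first
  scores >= 90  ~ "high",
  scores >= 60  ~ "medium",
  TRUE          ~ "low"       # fallback when nothing above matched
)
```

This can be easier to read than a chain of nested ifelse() calls.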
The Revolutions blog, now part of Microsoft, keeps tabs on a wide variety of R technical updates.
R Weekly is a community effort to round up interesting uses of R as well as new packages and compelling blog posts.
Package and repo info
CRAN is the official repository for R packages. However, MetaCRAN is a more visually appealing version if you’re trying to search or browse. It includes CRAN “task views,” which compile useful packages for specific fields such as machine learning. MetaCRAN also lets you see CRAN packages with the most stars on GitHub.
With more than 12,500 CRAN R packages, it can be hard to know what packages out there might solve a problem you have, or even remember which packages have what functions. The RDocumentation website, run by DataCamp, lets you search for packages or functions.
For information about packages with color palettes you can use in R, check out Emil Hvitfeldt’s Comprehensive list of color palettes in R.
Shiny Web framework
If you’d like to learn how to make full-fledged Web apps with R, RStudio’s Shiny framework is one option. The RStudio Shiny site has a number of articles and tutorials, as well as a gallery of examples.
Datacamp offers a free online interactive course, Building Web Applications in R with Shiny, by Mine Cetinkaya-Rundel, associate professor at Duke University & data scientist and educator at RStudio.
To create an interactive dashboard – as opposed to an entire application – there’s RStudio’s shinydashboard project. See the get started article for how to begin. For dashboards with even less code (but also less interactivity), there’s flexdashboard.