Ultimate R Resources: From Beginner to Advanced

May 16, 2020 By Pascal Schmidt

Learning how to program in R can be quite daunting. There are so many resources out there that you might not know where to start. Once you have started your data science journey with R, however, it can be quite fun. Well… exciting and fun up to the point where it becomes harder and harder to accumulate more knowledge. Getting diminishing returns on your programming investment can be a bit frustrating. To help with that problem, I am going to share some resources that I found helpful when I wanted to learn more about the R language.

A lot of online resources are geared towards beginners. Resources for becoming an advanced programmer, however, are harder to find. In this blog post, I will share what are, in my opinion, the best R resources for beginners and for people who want to transition from intermediate to advanced.

Best R Resources from Beginner to Intermediate

DataCamp:

  • For the first steps with R, I recommend doing DataCamp’s free R course. You will get introduced to many concepts such as vectors, matrices, data frames, and lists.
  • After that, I recommend the “Introduction to the Tidyverse” course by David Robinson. You will get introduced to the most basic dplyr verbs such as filter, arrange, mutate, group_by, and summarize. More courses to consider for deepening your data skills would be “Data Manipulation with dplyr”, “Data Visualization with ggplot2” parts 1 and 2, and “Intermediate R”.
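To give a feel for the dplyr verbs mentioned above, here is a minimal sketch that chains them together. It assumes the dplyr package is installed and uses the built-in mtcars data set; the horsepower threshold and the kpl and mean_kpl column names are my own choices for illustration and are not taken from the course.

```r
library(dplyr)

cars_summary <- mtcars %>%
  filter(hp > 100) %>%                  # keep only cars with more than 100 horsepower
  mutate(kpl = mpg * 0.425144) %>%      # add a column: miles per US gallon -> km per litre
  group_by(cyl) %>%                     # one group per number of cylinders
  summarize(
    n = n(),                            # number of cars in each group
    mean_kpl = mean(kpl)                # average fuel economy per group
  ) %>%
  arrange(desc(mean_kpl))               # most economical group first

print(cars_summary)
```

Each verb takes a data frame and returns a data frame, which is why they chain so naturally with the pipe.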

DataCamp is a really good resource when starting out on your R programming journey. However, the courses are very basic and become boring once you have the fundamentals down.

Tutorials:

  • A little bit into your R programming journey, I would recommend doing some tutorials. Personally, I bought Jose Portilla’s course on Udemy and followed along. He walks through some mini-projects that you can reproduce yourself. In particular, I would recommend doing the Titanic competition on Kaggle. There is a ton of material on the internet from people who have done this competition. Study their Kernels (code, reports) and, if you get stuck with your own implementation, borrow their code.

Books:

  • The only book worth considering for starting out with R is R for Data Science (available for free). I like that the R community has decided to start out by teaching specific packages first, such as ggplot2, dplyr etc., instead of diving into programming concepts. So, what you’ll learn are basic R workflows and how to use certain packages. I really like this concept because no programming knowledge is needed, and you just learn how to code.

Self-Studying:

  • I recommend picking data sets you are interested in and doing some exploratory data analysis and some basic model fitting. For the exploratory data analysis, the most important packages are all included in the tidyverse package, and you should be able to get everything done by knowing the basics of these packages. For modeling purposes, I would recommend becoming good at using the tidymodels packages. By knowing the tidyverse + tidymodels packages well, you’ll become a data ninja very soon.
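As a rough sketch of what such a self-study session could look like, the example below takes a quick exploratory look at a data set and then fits a basic linear model with tidymodels. It assumes the tidymodels package is installed; the mtcars data set, the 80/20 split, and the choice of predictors are placeholders for whatever data and question you pick yourself.

```r
library(tidymodels)

set.seed(123)  # make the random train/test split reproducible

# Quick exploratory look at the data
glimpse(mtcars)
mtcars %>% count(cyl)

# Hold out 20% of the rows for testing
car_split  <- initial_split(mtcars, prop = 0.8)
train_data <- training(car_split)
test_data  <- testing(car_split)

# Specify and fit an ordinary linear regression
lm_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp, data = train_data)

tidy(lm_fit)  # coefficient table as a tibble

# Predict on the held-out rows and keep the true values alongside
preds <- predict(lm_fit, new_data = test_data) %>%
  bind_cols(test_data %>% select(mpg))

print(preds)
```

The same specify, fit, predict pattern carries over to other model types, so swapping linear_reg() for a different model specification changes very little of the surrounding code.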

Websites where you can get some free data sets:

Also, search for some data sets provided by the government or your local municipality and explore data for your local area. That is always a bit more interesting in my opinion.

Other Resources:

  • Other R resources that might help you are Jenny Bryan’s tutorials and teachings. There is a purrr tutorial and a great data wrangling and data analysis tutorial/book that she designed for the Master of Data Science students at the University of British Columbia.
  • Lastly, it might also be helpful to create a Stack Overflow account and ask questions when you get stuck. Chances are that by now you have a solid knowledge of R and might not find solutions to your R data questions on the web.

Best R Resources from Intermediate to Advanced

I think the biggest challenge in transitioning from an intermediate R user/programmer to an advanced one is that it takes a lot of time to deepen your knowledge of the concepts introduced above. The tidyverse and tidymodels are opinionated collections of packages and, therefore, designed to help you retain concepts well. However, it still takes tons of practice to build an intuition for these packages.

Another big challenge is that you don’t know what you don’t know, and the resources on the internet that are new to you become fewer and fewer. It therefore gets harder to learn new concepts and build up your R knowledge. Here are some resources that might help you become a better R programmer.

Documentation:

Very boring, I know, but crucial for deepening your knowledge of certain functions and packages.

  • Become good at reading the documentation on CRAN and make use of vignettes if the authors published them.
  • Make use of ?function_name to open a function’s help page (or ??keyword to search the help of all installed packages) and read the documentation in the RStudio IDE.
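A short sketch of the help tools mentioned above. The functions are all part of base R; median and dplyr are just example targets, and the vignette listing assumes dplyr is installed.

```r
# Open the help page of a function you know by name
?median

# Search the help of installed packages by keyword.
# help.search() is what ??median calls; restricting it to one package keeps it fast.
hits <- help.search("median", package = "stats")
print(hits$matches[, c("Topic", "Package")])

# List the vignettes a package ships with
vigs <- vignette(package = "dplyr")
print(vigs$results[, c("Item", "Title")])

# Open one of them by name (opens a viewer, so it is commented out here)
# vignette("dplyr")
```

Vignettes are usually the better starting point for a new package, while the help pages are the place to check arguments and return values of a single function.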

Books:

  • Read Advanced R (for free). Advanced R goes through the basics again but also has a ton of advanced topics that you might not have heard of.
  • Another good book is Efficient R Programming (for free). It gives you ideas to make your code more efficient.

#TidyTuesday:

  • Participate in the weekly #TidyTuesday challenges. Every week, a data set is uploaded, and participants post their visualizations on Twitter with the hashtag TidyTuesday. Learn from other people’s code and visualizations.

YouTube:

  • Dave Robinson’s channel is a great channel where he goes over the #TidyTuesday challenge every Tuesday at 5pm PST. Learn from him and how he approaches analyzing a data set. He is a very good R programmer and has written multiple packages and books about R.
  • Julia Silge has another great YouTube channel where she also goes over #TidyTuesday challenges. However, David Robinson uses an exploratory data analysis approach with the tidyverse, whereas Julia Silge does modeling with the tidymodels packages.
  • There is also a TidyX channel that does code reviews for #TidyTuesday challenges.
  • Another YouTuber who goes through #TidyTuesday data sets and does live coding is Andrew Couch.
  • Lastly, if you are interested in a Python approach that leverages functions similar to R, check out this channel.

Twitter:

  • Twitter is a really good place to follow people who are heavily involved in the R community. This ensures you will always be up to date on newly developed features and packages. Often, people in the R community link their own blogs in their bios, which you can visit and learn from.

Other Resources and Ideas:

  • There is an R for Data Science Slack channel that you can join and post questions or answers to topics.
  • Do your own projects with data that you have collected yourself. Become good at working with APIs that provide you with interesting data that you can analyze. Alternatively, get good at rvest and web scrape your own data. If you want to go the extra step, put your data analysis into production with R Shiny, and let other users use your model or data visualizations to make informed decisions. Here is a link to a great YouTube video about R Shiny in production and here is a great resource.
  • Finally, as Rebecca Barter and Dave Robinson recommend: Start your own blog about R and data science and learn along the way.
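To illustrate the rvest suggestion from the list above, here is a minimal scraping sketch. It assumes rvest (version 1.0 or later) is installed. To keep it runnable offline, it parses a small made-up HTML snippet instead of a live website; for a real page you would pass a URL to read_html() and adapt the selectors to that page.

```r
library(rvest)

# A made-up page, purely for illustration
page <- minimal_html('
  <h1>Hypothetical package list</h1>
  <table>
    <tr><th>package</th><th>stars</th></tr>
    <tr><td>dplyr</td><td>10</td></tr>
    <tr><td>ggplot2</td><td>20</td></tr>
  </table>
')

# Pull out the text of the first <h1> element
heading <- page %>%
  html_element("h1") %>%
  html_text2()

# Turn the first <table> element into a tibble
pkg_table <- page %>%
  html_element("table") %>%
  html_table()

print(heading)
print(pkg_table)
```

Once the data is in a tibble, the usual tidyverse tools apply. Before scraping a real site, check its terms of use and whether it offers an API instead.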
