Welcome To My Data Blog

Hi, I am Pascal

Thanks for checking out my blog. You can find all kinds of posts about R, Python, statistics, and R Shiny here. Enjoy exploring, and feel free to leave comments or message me directly at pascal.sfu.ca.

 

I also built a website from scratch with Shiny at https://pascal-schmidt-ds.com, where you can find my interactive resume along with some posts and personal projects. It is still under construction but will be finalized soon.

Blog Posts

Rowwise Operations on Data Frames With purrr’s pmap() Function in R

Usually, we work with columns in R. Finding the mean, max, or median of certain variables is very straightforward. However, when we want to work with rows, it is not immediately obvious what to do. In this blog post, I will explain how to do row-wise operations on data frames in R. When a colleague asked… Read More

My Most Favourite ggplot Plot – Powerful Bar Plot for Presentations

Today I will be talking about my favorite ggplot bar plot. I use this one a lot for my work and in presentations. It gives a great overview of proportions and counts, all in one plot, without being overwhelming. As always, we will be using the Pokemon data set, which can be found here. Loading Data and Forming… Read More

More dplyr – How to Get Your Data Into the Right Shape

Last time we looked at the basic verbs of the tidyverse, and this time we will look at some more verbs that make data munging/shaping a lot easier. We will be covering: select_if()/all(), filter_if()/at(), mutate_if()/at(), between(), na_if(), case_when(), and pull(). Again, we will be looking at the Pokemon data set.

library(tidyverse)
poke <- read.csv("Pokemon.csv") %>% dplyr::select(-X.)

dplyr’s select_if()/all()… Read More

Learning the Tidyverse: Basic dplyr Verbs for Data Manipulation

Data comes in various shapes and forms. Hence, basic data manipulation is a must-have skill for a data scientist. The tidyverse packages are the ideal solution for data shaping. These packages are developed by Hadley Wickham and the team at RStudio (now Posit), and I would even consider them part of base R. The dplyr package is the most useful… Read More

A Short Tutorial about Magrittr’s Pipe Operator and Placeholders

magrittr’s pipe operator, %>%, is one of the most powerful operators in data wrangling and helps you keep your code clean, readable, and maintainable. The magrittr pipe works essentially the same way as the + sign in ggplot2. If you need a quick reminder on how the plus sign is used in ggplot2, then here are two tutorials you… Read More

The Grammar of Graphics – Pokemon ggplot Tutorial Part2

Last tutorial, we learned the essential elements of a ggplot2 graphic: the data element, the aesthetics element, and the geometries element. These three elements are the essential building blocks of a successful ggplot2 graphic. In this tutorial, we will explain the four remaining elements: facets, statistics, coordinates, and themes. These four elements are not essential to creating a ggplot2 graphic but… Read More

The Grammar Of Graphics – All You Need to Know About ggplot2 and Pokemons

ggplot2 is an R package for producing data visualizations. It is based on the Grammar of Graphics by Leland Wilkinson and is the most used package for producing graphics in R. This tells you that ggplot2 is worth the effort of learning. So let’s get you started with it! ggplot2 consists of the following elements: Essential Elements Data The data… Read More

Doing Data Science Without Programming Knowledge? My Data Science Journey

When you type “How to become a Data Scientist” into any search engine, the first thing that jumps out at you is the list of requirements in bullet-point format. At the top of that list, you will always find programming or coding. Almost everyone says that it is absolutely vital to know how to program to do data science. I… Read More

Data Science Jobs