Technology:
For this project, I used R, mainly tidyverse packages such as dplyr, purrr, and ggplot2. I also used the Rcpp library to integrate my C++ function into R.
I also implemented this project in Python, using libraries such as numpy, scikit-learn, and pandas.
This was an in-class Kaggle competition, in which I finished in first place. The goal was to predict missing values in five columns of a time-series data set. My solution identifies which missing values can be filled by linear interpolation and which require a machine-learning algorithm.
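The split between interpolated and model-predicted values can be illustrated with a minimal sketch. This is not the competition code: the rule used here (interpolate only interior gaps of at most `max_gap` consecutive missing values) is an assumed criterion chosen for illustration, and the names `split_missing` and `max_gap` are hypothetical.

```python
import math


def split_missing(values, max_gap=2):
    """Fill short interior gaps linearly; report the rest for a model.

    ASSUMPTION: gap length is the deciding criterion. The real rule
    used in the project is not described in the text.
    Returns (filled, needs_model), where needs_model lists the indices
    that are still NaN and would be passed to a learned model.
    """
    filled = list(values)
    needs_model = []
    n = len(filled)
    i = 0
    while i < n:
        if not math.isnan(filled[i]):
            i += 1
            continue
        # Find the end of this run of missing values.
        j = i
        while j < n and math.isnan(filled[j]):
            j += 1
        gap = j - i
        has_neighbours = i > 0 and j < n  # observed value on both sides
        if has_neighbours and gap <= max_gap:
            left, right = filled[i - 1], filled[j]
            step = (right - left) / (gap + 1)
            for k in range(i, j):
                filled[k] = left + step * (k - i + 1)
        else:
            needs_model.extend(range(i, j))
        i = j
    return filled, needs_model


nan = float("nan")
series = [1.0, nan, 3.0, nan, nan, nan, 7.0, nan]
filled, needs_model = split_missing(series, max_gap=2)
```

With this input, the single missing value between 1.0 and 3.0 is interpolated, while the run of three missing values and the trailing one are left for the model. In practice, the indices in `needs_model` would be predicted by a regressor (for example one from scikit-learn) trained on the rows where the column is observed.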
- Link to GitHub.