machine learning

The Split-Apply-Combine Technique for Machine Learning with R

Introduction: Much discussion in the R community has revolved around the proper way to implement the “split-apply-combine” technique. In particular, I love the exploration of this topic in this blog post. It seems that the “preferred” approach is dplyr::group_by() + tidyr::nest() for splitting, dplyr::mutate() + purrr::map() for applying, and tidyr::unnest() for combining. Additionally, many in the community have shown implementations of the “many models” approach in {tidyverse}-style pipelines, often also using the {broom} package.
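
To make the three steps concrete, here is a minimal sketch of that pipeline. The built-in mtcars data set, the grouping column (cyl), and the model formula (mpg ~ wt) are my own assumptions chosen only for illustration; they do not come from the post itself.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Split: one row per cylinder group, with that group's rows stored
# in a list-column named `data`.
nested <- mtcars %>%
  group_by(cyl) %>%
  nest()

# Apply: fit one linear model per group, then tidy each fit with {broom}.
# The formula mpg ~ wt is an illustrative assumption.
fitted <- nested %>%
  mutate(
    fit = map(data, ~ lm(mpg ~ wt, data = .x)),
    coefs = map(fit, tidy)
  )

# Combine: unnest the tidied coefficients into one flat data frame,
# with one row per group and model term.
results <- fitted %>%
  ungroup() %>%
  select(cyl, coefs) %>%
  unnest(coefs)

print(results)
```

Keeping the fitted models in a list-column next to the data they came from is what makes the “many models” style convenient: the same nested frame can later be mapped over with broom::glance() or broom::augment() without re-splitting the data.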

The 5 MOOCs That Helped Me Improve my Data Skills

Massive Open Online Courses (MOOCs) are wonderful resources for people like me who have a passion for learning. In a nutshell, they are college-level courses published by reputable universities and/or instructors for free. Some of the most well-known platforms include Udacity, Coursera, and edX, but you can also find many courses and related materials directly on university websites or on other sites such as GitHub. I took advantage of several freely available resources, including MOOCs, while developing my skills as a wannabe data analyst.