Fantasy Football and the Classical Scheduling Problem

Introduction Every year I play in several fantasy football (American) leagues. For those who are unaware, it’s a game that occurs every year in sync with the National Football League (NFL) where participants play in weekly head-to-head games as general managers of virtual football teams. (Yes, it’s very silly.) The winner at the end of the season is often not the player with the team that scores the most points; often a fortunate sequence of matchups dictates who comes out on top.

Decomposition and Smoothing with data.table, reticulate, and spatstat

While reading up on modern soccer analytics (I’ve had an itch for soccer and tracking data recently), I stumbled upon an excellent set of tutorials written by Devin Pleuler. In particular, his notebook on non-negative matrix factorization (NNMF) caught my eye. I hadn’t really heard of the concept before, but it turned out to be much less daunting once I realized that it is just another type of matrix decomposition.

S3 Classes and {vctrs} to Create a Soccer Pitch Control Model

Note: This post was updated on 2020-09-24 to correct field dimension translations that were previously distorting the pitch control contours. The R analogues now match up much more closely with the Python versions after the updates. Intro There’s never been a better time to be involved in sports analytics. There is a wealth of open-sourced data and code (not to mention well-researched and public analysis) to digest and use. Both people working for teams and people just doing it as a hobby are publishing new and interesting analyses every day.

Comparing Variable Importance Functions (For Modeling)

I’ve been doing some machine learning recently, and one thing that keeps popping up is the need to explain the models and their components. There are a variety of ways to go about explaining model features, but probably the most common approach is to use variable (or feature) importance scores. Unfortunately, computing variable importance scores isn’t as straightforward as one might hope—there are a variety of methodologies! Upon implementation, I came to the question “How similar are the variable importance scores calculated using different methodologies?

Generating a Gallery of Visualizations for a Static Website (using R)

While I was browsing the website of fellow R blogger Ryo Nakagawara, I was intrigued by his “Visualizations” page. The concept of creating an online “portfolio” is not novel, but I hadn’t thought to make one as a compilation of my own work (from blog posts)… until now 😄. The code that follows shows how I generated the body of my visualization portfolio page. The task is achieved in a couple of steps.

Making a Cheat Sheet with Rmarkdown

Unfortunately, I haven’t had as much time to make blog posts in the past year or so. I started taking classes as part of Georgia Tech’s Online Master of Science in Analytics (OMSA) program last summer (2018) while continuing to work full-time, so extra time to code and write hasn’t been abundant for me. Anyways, I figured I would share one neat thing I learned as a consequence of taking classes—writing compact “cheat sheets” with {rmarkdown}.

Text Parsing and Text Analysis of a Periodic Report (with R)

Some Context Those of you non-academia folk who work in industry (like me) are probably conscious of any/all periodic reports that an independent entity publishes for your company’s industry. For example, in the insurance industry in the United States, the Federal Insurance Office of the U.S. Department of the Treasury publishes several reports on an annual basis discussing the industry at large, like this past year’s Annual Report on the Insurance Industry.

Summarizing rstudio::conf 2019 Summaries with Tidy Text Techniques

UPDATE (2019-07-07): Check out this {usethis} article for a more automated way of doing a pull request. To be honest, I planned on writing a review of this past weekend’s rstudio::conf 2019, but several other people have already done a great job of doing that—just check out Karl Broman’s aggregation of reviews at the bottom of the page here! (More on this in a second.) In short, my thoughts on the whole experience are captured perfectly by Nick Strayer’s tweet the day after the conference ended.

A Newbie's Guide to Making A Pull Request (for an R package)

I had the wonderful opportunity to participate in the {tidyverse} Developer Day the day after rstudio::conf 2019 officially wrapped up. One of the objectives of the event was to encourage open-source contributor newbies (like me 😄) to gain some experience, namely through submitting pull requests to address issues with {tidyverse} packages. Having only ever worked with my own packages/repos before, I found this to be the perfect opportunity to “get my feet wet”!

Re-creating a Voronoi-Style Map with R

Introduction I’ve written some “tutorial”-like content recently—see here, here, and here—but I’ve been lacking ideas for “original” content since then. With that said, I thought it would be fun to try to re-create something with R. (Not too long ago I saw that Andrew Heiss did something akin to this with Charles Minard’s well-known visualization of Napoleon’s 1812 Russian campaign.) The focus of my re-creation here is the price contour map shown on the front page of the website for the Electric Reliability Council of Texas, the independent system operator of electric power flow for about 90 percent of Texas residents (as well as the employer of yours truly).