Analyzing Professional Sports Team Colors with R

When working with the ggplot2 package, I often find myself playing around with colors for longer than I probably should. I think this is because I know that the right color scheme can greatly enhance the information that a plot conveys, while a poorly matched palette can suppress the message of an otherwise good visualization. With that said, I wanted to take a look at the presence of colors in the sports realm.

NBA Team Twitter Analysis Flexdashboard

I just wrapped up a mini-project that allowed me to do a handful of things I’ve been meaning to do. First, try out the {flexdashboard} package, which is supposed to be good for prototyping larger dashboards (perhaps created with {shinydashboard}). Second, test out my (mostly completed) personal {tetext} package for quick and tidy text analysis. (It implements a handful of the techniques shown by David Robinson and Julia Silge in their blogs and in their Tidy Text Mining with R book.)

A Tidy Text Analysis of R Weekly Posts

I’m always intrigued by data science “meta” analyses of programming/data science. One example is Matt Dancho’s analysis of renowned data scientist David Robinson. David Robinson himself has done some good ones, such as his blog posts for Stack Overflow highlighting the “incredible” growth of Python and the “impressive” growth of R in modern times. With that in mind, I thought I would try to identify whether any interesting trends have risen or fallen within the R community in recent years.

Dealing with Interval Data and the nycflights13 package using R, Part 2

In this post, I’ll continue my discussion of working with regularly sampled interval data using R. (See my previous post for some insight regarding minute data.) The discussion here is focused more on function design. Daily data: When I’ve worked with daily data, I’ve found that the .csv files tend to be much larger than those for data sampled on a minute basis (as a consequence of each file holding data for sub-daily intervals).

Dealing with Interval Data and the nycflights13 package using R

In my job, I often work with data sampled at regular intervals. Samples may range from 5-minute intervals to daily intervals, depending on the specific task. While working with this kind of data is straightforward when it’s in a database (and I can use SQL), I have been in a couple of situations where the data is spread across .csv files. In these cases, I lean on R to scrape and compile the data.

A Tidy Text Analysis of My Google Search History

While brainstorming about cool ways to practice text mining with R, I came up with the idea of exploring my own Google search history. Then, after googling (ironically) whether anyone had done something like this, I stumbled upon Lisa Charlotte’s blog post. Lisa’s post (actually, a series of posts) is from a while back, so her instructions for how to download your personal Google history and the format of the downloads (nowadays, it’s in a .

Personal Coding Conventions

As a person who’s worked with various programming languages over time, I have become interested in the nuances and overlaps among languages. In particular, topics related to code syntax and organization–everything from technical ideas such as lexical scoping to broader ones such as importing and naming data–really fascinate me. Organization “enthusiasts” like me truly appreciate software/applications that follow consistent norms. In the R community, the tidyverse ecosystem has become extremely popular because it implements a consistent, intuitive framework that is, consequently, easy to learn.

Visualizing an NBA Team's Schedule Using R

If you’re not completely new to the data science community (specifically, the #rstats community), then you’ve probably seen a version of the “famous” data science workflow diagram. If one is fairly familiar with a certain topic, then one might not spend much time with the initial “visualize” step of the workflow. Such is the case with me and NBA data–as a relatively knowledgeable NBA follower, I don’t necessarily need to spend much of my time exploring raw NBA data prior to modeling.

(Yet Another) Migration to Blogdown Post

As of today, I’ve officially made the jump to using the R package blogdown (which uses the Hugo static-site generator under the hood) for my personal website. Previously, I had been using WordPress for my blogging purposes. In sync with the change in platform, I’m changing the name of this site from “Number Sense” (www.numbersense.org) to something involving my name (Tony ElHabr). Nonetheless, my original intent to write about math, sports, and data-related things has not changed.