tidytext

Text Parsing and Text Analysis of a Periodic Report (with R)

Some Context: Those of you outside academia who work in industry (like me) are probably aware of the periodic reports that independent entities publish about your company’s industry. For example, in the insurance industry in the United States, the Federal Insurance Office of the U.S. Department of the Treasury publishes several annual reports discussing the industry at large, like this past year’s Annual Report on the Insurance Industry.

Summarizing rstudio::conf 2019 Summaries with Tidy Text Techniques

UPDATE (2019-07-07): Check out this {usethis} article for a more automated way of doing a pull request. To be honest, I planned on writing a review of this past weekend’s rstudio::conf 2019, but several other people have already done a great job of that; just check out Karl Broman’s aggregation of reviews at the bottom of the page here! (More on this in a second.) In short, my thoughts on the whole experience are captured perfectly by Nick Strayer’s tweet the day after the conference ended.

Text Parsing and Analysis of a Periodic Report

Exploration of text similarity across similarly structured documents (annual reports).

NBA Team Twitter Analysis Flexdashboard

I just wrapped up a mini-project that allowed me to do a handful of things I’ve been meaning to do: Try out the {flexdashboard} package, which is supposed to be good for prototyping larger dashboards (perhaps created with {shinydashboard}). Test out my (mostly completed) personal {tetext} package for quick and tidy text analysis. (It implements a handful of the techniques shown by David Robinson and Julia Silge in their blogs and in their Tidy Text Mining with R book.)

A Tidy Text Analysis of R Weekly Posts

I’m always intrigued by “meta” analyses of programming and data science, such as Matt Dancho’s analysis of renowned data scientist David Robinson. David Robinson himself has done some good ones, such as his blog posts for Stack Overflow highlighting the “incredible” growth of Python and the “impressive” growth of R in modern times. With that in mind, I thought I would try to identify whether any interesting trends have risen or fallen within the R community in recent years.

A Tidy Text Analysis of My Google Search History

While brainstorming cool ways to practice text mining with R, I came up with the idea of exploring my own Google search history. Then, after googling (ironically) whether anyone had done something like this, I stumbled upon Lisa Charlotte’s blog post. Lisa’s post (actually, a series of posts) is from a while back, so her instructions for how to download your personal Google history and the format of the downloads (nowadays, it’s in a .