How You Should Keep Score in Pickup Basketball 2024-08-27 TLDR: For full court pickup basketball you should play by 2s and 3s. Make-it-take-it by 1s and 2s is also fine. Adapt your game according to the scoring and possession system in place. Steph Curry is … gt dplyr tidyr ggplot2 purrr forcats scales DT stringr
Untangling Nassim Taleb's Criticism of that Headline Grabbing Intermittent Fasting Study 2024-05-08 TLDR: Parts of Taleb’s critique of an (albeit overhyped) study on Intermittent Fasting seem specious. Veritasium recently put out a video on “The Problem With Science Communication” that details how … ggplot2
Aggregating Measures of Uncertainty 2024-02-11 There are many situations where you want to aggregate values. However, if those values are on different scales or are related to measures of uncertainty, it’s typically more complicated than simply …
Odds Are You're Using Probabilities to Describe Event Outcomes 2023-11-03 We grow up learning proportions, percentages, risks, probabilities. You encounter them when a teacher gives a grade on a test or a doctor describes the risk of an illness. On the other hand, we rarely … dplyr ggplot2 DiagrammeR
Converting Between Currencies Using priceR 2022-06-16 In this post I’ll walk through an example of how to convert between currencies. A challenge is that the conversion rate is constantly changing. If you have historical data you’ll want the conversion … priceR dplyr tidyr purrr lubridate devtools
Pulling Twitter Engagements Using the v2 API as Well as rtweet 2022-04-11 This is a follow-up to a short post I wrote on R Access to Twitter’s v2 API. In this post I’ll walk through a few more examples of pulling data from twitter using a mix of Twitter’s v2 API as well as … rjson httr jsonlite dplyr purrr lubridate rtweet tidyr glue rstudioapi fs readr tidyverse
R Access to Twitter's V2 API 2022-04-04 The rtweet package is still the easiest way to GET and POST Twitter data from R. However, its developers are currently working on adapting it to the new API. V2 comes with a variety of new features. … rjson httr jsonlite dplyr purrr glue
Network Visualizations of Code Collections (funspotr part 3) 2022-03-17 In previous posts and threads I’ve alluded to the potential utility of visualizing the relationships between parsed functions/packages and files as a network plot. […] It can be helpful to … dplyr funspotr readr
Identifying R Functions & Packages in Github Gists (funspotr part 2) 2022-02-07 This post is part two in a series of posts introducing funspotr. See also: […] This post shows how funspotr can also be applied to parse gists: […] By functions or packages … dplyr purrr stringr funspotr readr DT fs rstudioapi
Identifying R Functions & Packages Used in GitHub Repos (funspotr part 1) 2022-01-18 TLDR: funspotr provides helpers for spotting the functions and packages in R and Rmarkdown files and associated github repositories. See Examples for catalogues of the functions/packages used in posts … dplyr funspotr yaml purrr fs readr here
Predicting NBA Playoff Berths: FiveThirtyEight vs Betting Markets 2021-12-17 TLDR: FiveThirtyEight’s forecasts of NBA playoff berths seem to hold up OK against betting markets. If you trust them, you should consider betting against the Lakers right now. In The Virtues and … stringr dplyr glue gt rvest janitor lubridate purrr tidyr readr fs broom
Macros in the Shell: Integrating That Spreadsheet From Finance Into a Data Pipeline 2021-05-10 There is many a data science meme degrading Excel: (Google Sheets seems to have escaped most of the memes here.) While I no longer use it regularly for the purposes of analysis, I will always have a … dplyr digest mvtnorm purrr readr glue readxl here
Quantile Regression Forests for Prediction Intervals 2021-04-21 In this post I will build prediction intervals using quantile regression, more specifically, quantile regression forests. This is my third post on prediction intervals. Prior posts: […] This … workflows ggplot yardstick gt forcats scales pander
Simulating Prediction Intervals 2021-04-05 Part 1 of my series of posts on building prediction intervals used data held-out from model training to evaluate the characteristics of prediction intervals. In this post I will use hold-out data to … workflows devtools gt ggplot forcats scales pander
Understanding Prediction Intervals 2021-03-18 Prediction intervals provide a measure of uncertainty for predictions on individual observations. This post… […] This is the first of three posts on prediction intervals (Part 2 employs … AmesHousing dplyr rsample recipes gt parsnip workflows ggplot yardstick stringr tidyr forcats scales pander
Basics of Data on People Experiencing Homelessness 2021-01-11 This write-up provides a broad overview of data sources and reports relevant for an independent researcher or analyst new to exploring data on people experiencing homelessness. The section on HMIS …
Weighting Confusion Matrices by Outcomes and Observations 2020-12-08 Weighting in predictive modeling may take multiple forms and occur at different steps in the model building process. […] The focus of this post is on the last stage. I will describe two types … ggplot dplyr rsample parsnip probably yardstick devtools purrr knitr tidyr
Undersampling Will Change the Base Rates of Your Model's Predictions 2020-11-23 TLDR: In classification problems, under- and over-sampling techniques shift the distribution of predicted probabilities towards the minority class. If your problem requires accurate probabilities you … ggplot dplyr purrr tidyr knitr modelr yardstick
Influencing Distributions with Tiered Incentives 2020-11-02 In this post I will use incentives for sales representatives in pricing to provide examples of factors to consider when attempting to influence an existing distribution. For instance, if you have a … ggplot dplyr purrr forcats
Gambling Where the House Almost Always Loses... but Still Wins 2020-10-28 In this post, I will describe an example of a game that produces many small wins for the player and occasional large wins for the house. Such a game could take advantage of psychological biases of … devtools econocharts
Should You Use an Assignment as Part of Your Hiring Process for a Data Scientist? 2020-10-27 A version of this question was asked on my alumni Slack channel. There were some excellent points brought up by those answering the question in the negative, including that… […] I think each of …
Feature Engineering with Sliding Windows and Lagged Inputs 2020-10-12 The new rsample::sliding_*() functions bring the windowing approaches used in slider to the sampling procedures used in the tidymodels framework. These functions make evaluation of models with … httr jsonlite dplyr lubridate rsample slider devtools recipes parsnip workflows tune purrr tidyr forcats ggplot broom
A National Popular Vote Weighted by the Electoral College 2020-09-11 TLDR: In this post I discuss using a national popular vote weighted by the electoral college to elect the president. This approach would empower voters by expanding political influence outside of … pins readr dplyr janitor tidyr forcats ggplot
Linear Regression in Pricing Analysis, Essential Things to Know 2020-08-17 Pricing is hard. […] Price is Right Contestant… struggling […] This is particularly true with large complicated products, common in Business to Business sales (B2B). B2B sellers may lack … AmesHousing dplyr equatiomatic broom
Animate interactive objects with Face Detection, JavaScript and Chrome Browser 2020-07-20 We spend the majority of our time in front of screens. It’s mostly one of computer/tablet/phone/TV. These are largely platforms the user owns or controls. I’m surprised we don’t yet have more … chrome-api javascript
Short Examples of Best Practices When Writing Functions That Call dplyr Verbs 2020-06-25 dplyr, the foundational tidyverse package, makes a trade-off: it is easy to code in interactively, at the expense of being more difficult to create functions with. The source of the trade-off is … dplyr
Use Flipbooks to Explain Your Code and Thought Process 2020-06-24 Using the pipe operator (%>%) is one of my favorite things about coding in R and the tidyverse. However, when it was first shown to me, I couldn’t understand what the #rstats nut describing it was … dplyr tidyr purrr ggplot ggbeeswarm animatrixr emo rlang fs pagedown magick here pdftools officer flair flipbookr
Tidy Pairwise Operations 2020-06-03 In May of 2021 I co-wrote pwiser, a package for doing pairwise operations in {dplyr} that provides a much smoother approach than the one I build up to in this post. […] Say you want to map an … AmesHousing dplyr corrr tidyr stringr purrr forcats ggplot devtools weights
Riddler Solutions: Pedestrian Puzzles 2020-03-04 This post contains solutions to FiveThirtyEight’s two riddles released 2020-02-14, Riddler Express and Riddler Classic. I created a toy package animatrixr to help with some of the visualizations and … tidyr dplyr animatrixr knitr ggplot ggforce purrr forcats
animatrixr & Visualizing Matrix Transformations pt. 2 2020-02-24 This post is a continuation on my post from last week on Visualizing Matrix Transformations with gganimate. Both posts are largely inspired by Grant Sanderson’s beautiful video series The Essence of … devtools dplyr animatrixr
Visualizing Matrix Transformations 2020-02-20 I highly recommend the fantastic video series Essence of Linear Algebra by Grant Sanderson. In this post I’ll walk through how you can use gganimate and the tidyverse to (very loosely) recreate some … dplyr tidyr ggplot ggforce purrr knitr gganimate
Riddler Solutions: Palindrome Dates & Ambiguous Absolute Value Bars 2020-02-13 This post contains solutions to FiveThirtyEight’s two riddles released 2020-02-07, Riddler Express and Riddler Classic. Code for figures and solutions can be found on my github page. […] The … dplyr lubridate stringi knitr purrr stringr tidyr ggplot
Riddler Solutions: Perfect Bowl & Magnetic Volume 2020-02-06 This post contains solutions to FiveThirtyEight’s two riddles released 2020-01-31, Riddler Express and Riddler Classic. Code for figures and solutions can be found on my github page. […] The … dplyr ggplot ggforce
Solar in Seattle 2020-02-01 TLDR: Residential solar installations have gained popularity in the Seattle area over the last few years. Prima facie, these seem to represent a suboptimal use of panels which could be more …
Iceland Day 6: Perlan & Departure 2020-01-01 I did my best to convince Britney and my parents that we should start the morning with a ‘polar bear plunge’ in the ocean but was unsuccessful in convincing anyone (including myself) to participate. … travel iceland
Iceland Day 5: Blue Lagoon & New Year’s Eve 2019-12-31 We were out the door by 7:45AM and headed for the Blue Lagoon (where my parents had offered to treat us for the day). The regular Blue Lagoon pool had sold out of tickets. Instead, we were ‘forced’ to … travel iceland
Iceland Day 4: Southeast Coast & Diamond Beach 2019-12-30 Britney and I awoke several times in the night to the frigid cold. We would run across the street to a patch of trees where we could relieve ourselves and get away from the streetlights. I looked up … iceland travel
Iceland Day 3: Thingvellier & Disaster 2019-12-29 We slept in a little later this morning (9:30AM), had Skyr parfaits with mom and dad for breakfast and more croissants from Brauð & Co. We drove back to the UNESCO World Heritage Site, … iceland travel
Iceland Day 2: Golden Circle & Snowmobiling 2019-12-28 I grabbed an assortment of croissants from Brauð & Co, half a block from our apartment. We were on the road headed East by 7:05AM. The morning was strikingly dark. Clouds obscured any starlight. A … iceland travel
Iceland Day 1: Landing & City Tour 2019-12-27 My parents, Britney and I landed in Reykjavik at 6:30AM. We’d taken an eight-and-a-half-hour overnight flight from Seattle. It was dark when we stepped off the plane and would remain dark until 11AM … travel iceland