Roll for Sandwich Data Analysis

Roll For Sandwich is a series on TikTok where the host, Jacob Pauwels, rolls dice to select the ingredients for his lunch. Originally it was a daily series, then moved to a Monday/Wednesday/Friday format and has 171 episodes as of July 11, 2023. RFS has inspired dozens of spin-off shows on TikTok and is an all-around delightful pop culture phenomenon.

As a D&D nerd and a sandwich aficionado, I was instantly hooked when I came across the series during season 2 and have been an avid watcher ever since.

As a data crunching dork, I realized this series was an incredible, unique dataset and began to wonder if anyone had bothered to make a spreadsheet of the data. Much to my delight, I found a link to the Google Spreadsheet in the RFS subreddit and made a copy for myself to start playing around with.

Cleaning the Data

The original spreadsheet was optimized for being read by a person, which doesn’t always allow for the easiest time doing data analysis — especially when there are instances where certain cells contained multiple pieces of data [multiple kinds of roughage, cheese, etc].

Initially, I hoped to keep my version of the data as close to the source formatting as possible to allow for future automation of the data import process, but it quickly became evident it would be very difficult to analyze the data in its human-friendly format.

To start, I split the data so each cell contained only one data point. I then began to standardize the spelling and capitalization of each piece of data, as there were several instances of the same ingredient being counted as up to four separate items due to capitalization and pluralization [specifically, onions, which appeared as Onion, onion, Onions, and onions].

I then split the rating column to remove the /10 and converted the data from a text string into a number. Inversely, roll and dice data needed to be converted into a text string in order to properly analyze the data.

The OG Spreadsheet
Initial Data Analysis and Visualization

Analysis in Sheets

My first task was to get a list of all the unique ingredients and count the frequency of their appearances. I wanted the analytics I set up to be able to automatically update as I updated the data, so I chose to use formulas rather than pivot tables.

Because the sets of data were different sizes, across each season, I became very adept at using the FLATTEN function to merge the tabs into computer-friendly chunks of data:

Find each unique type of bread across all 4 tabs of data:
=SORT(UNIQUE(filter(FLATTEN(‘Season 1’!B2:B,‘Season 2’!B2:B,‘Season 3’!B2:B,‘Season 4’!B2:D),FLATTEN(‘Season 1’!B2:B,‘Season 2’!B2:B,‘Season 3’!B2:B,‘Season 4’!B2:D)<>“”)))

Count how many times each type of bread appears in an episode:
=COUNTIF(FLATTEN(‘Season 1’!$B$2:$B,‘Season 2’!$B$2:$B,‘Season 3’!$B$2:$B,‘Season 4’!$B$2:$D),A3)

Once I had created this list, I was able to quickly drill down into the data and churn out several pieces of analysis. I started to notice that additional formatting to the data would be needed to fully analyze all the information I had in mind. For example, episode sponsorships are noted on the spreadsheet, but not in a way that can be read and interpreted easily by a machine. My to-do list started to grow as more ideas for data crunching began to sprout in my mind…

As I began to add more tables and graphs to my spreadsheet, the siren’s call of Tableau grew louder and louder. I didn’t want to move to a new tool because I had already invested so much time and effort making everything work for analysis in a spreadsheet format, but knew if I really wanted to take things to the next level I was holding myself back by arbitrarily restricting myself to Google Sheets.

Pivoting to Tableau

Eventually the call of Tableau was too strong and I had to again reorganize my data to optimize for the intended tool.

I created a new tab in the spreadsheet and merged all four seasons onto a single sheet. I was then able to connect Tableau directly to my Google Sheet to allow it to pull the data directly from the sheet like a database so the dashboards could automatically update as I added data from new episodes without having to remap tables when new seasons are added.

Annoyingly, the FLATTEN function doesn’t work in Tableau, so I had to pivot most of the columns into a pivot data field. I was concerned this field would break when I updated the source spreadsheet then tried to refresh the data source, but was pleasantly surprised that it worked just fine! All the dashboards automatically updated with no issues when I began to add data from new episodes.

Once the data had been properly pivoted and was machine-crunchable again, I started building out visualization dashboards and styled the pages based on the existing RFS branding and colors.

My Tableau-friendly Spreadsheet

Data Visualization Dashboards

Tableau Public embed settings are truly awful and interactive dashboards are sadly not responsive. Please use the link below to navigate to Tableau for the full experience!

Related Projects
Clear Filters
Flight Delay Predictor
2022 World Cup Spreadsheet
2022 World Cup Spreadsheet
Thanksgiving Production Schedule
Thanksgiving Production Schedule
Data Visualization: Unification of Great Britain
Data Visualization: Unification of Great Britain