I T S M E J E R E M Y

Stacking Data in Google Sheets

Have you ever been working with a spreadsheet that had a bunch of values separated by commas, semicolons, pipes, or new lines in a single column? Something like this?

Recipient Goods
Michaelangelo Pizza, nunchucks, skateboard
Donatello DisplayPort cables, Redbull
Leonardo Kale, Seitan, courage
Raphael knives, punching bag, beef, roller skates, Pizza

It makes it pretty hard to do any kind of analysis on the data, but it can be hard to undo. A better way to represent the same data is like this:

Recipient Good
Michaelangelo Pizza
Michaelangelo nunchucks
Michaelangelo skateboard
Donatello DisplayPort cables
Donatello Redbull
Leonardo Kale
Leonardo Seitan
Leonardo courage
Raphael knives
Raphael punching bag
Raphael beef
Raphael roller skates
Raphael Pizza

That can get fed more easily to visualization and analysis software, but it can be difficult to get from the first dataset to the second without using code, and when I tell some people to use Python scripts for things, they run away from me. (This isn't true, they just get "Zoom connectivity problems" these days).

Today I found a really useful post on Stack Exchange and generalized it a bit to make it way easier to get from data that's all smushed together to data that's nicely stacked for analysis. I'm very excited about it so I'm writing up this how-to.

The first step is to get your data into Google sheets, whether through uploading, importing, copying and pasting, however you like. Just get it there.

Step two is to open up the script editor. You can find this under the "Tools" menu in Google Sheets. We're going to be making a custom formula!

Now, copy this code block and paste it in to the script editor:

function stackrows(range, index, on) {
  var output2 = [];

  // Iterating through spreadsheet rows with iterator i
  for(var i=0, iLen=range.length; i<iLen; i++) {
    var split_cell = range[i][index].split(on);

    // Iterating through each value that's been split with j
    for(var j=0, jLen=split_cell.length; j<jLen; j++) {
      var output1 = [];
      var split_value_j = split_cell[j].trim()

      // Iterating through values in spreadsheet rows with k
      for(var k=0, kLen=range[0].length; k<kLen; k++) {
        if(k == index) {
          output1.push(split_value_j);
        } else {
          output1.push(range[i][k]);
        }
      }
      output2.push(output1);
    }
  }
  return output2;
}

This is a modified version of the excellent script from that Stack Exchange post. Mine is different because it's a bit more flexible, I've added some comments to help myself remember what's happening, and I've renamed a couple of variables. Once you've pasted in the text, save this function in your script editor.

Now there's a new function available in Google sheets. We defined "stackrows", so now we can use it in this spreadsheet. You'll have to copy this formula into the script editor of every spreadsheet you do this to, unless there's a better way I don't know about (very possible).

I prefer to make a new sheet and reference the old sheet with my new formula, but you can opt not to do that if you're okay with messier data that you'll have to selectively copy and paste into something else eventually. Pick out the top left corner of where you want your new data to appear, and start typing the formula =stackrows(. It isn't a built-in function, so it won't auto-complete, but it will work anyways.

I didn't rename any sheets in my sample spreadsheet, so my formula looks like this: =stackrows(Sheet1!A1:B5,1,","). What that's doing is running our new "stackrows" function on the range Sheet1!A1:B5. If you aren't familiar with Google Sheets range syntax, after you type the opening parenthesis in the formula, you can use your mouse to navigate to and select the range you want. Then it's saying to split up data in the second column (column indices start from zero, so a=0, b=1, c=2, etc.). The last thing specified is the separator, in this case a comma. Don't forget to put quotes around it!

That's it! You should see a bit of loading and then get your data in the new sheet. You can even use the new sheet as the input for an even newer sheet if you have a couple of columns with smushed together data.

Rather than fill this with screenshots, I'll just link to a sample spreadsheet that I made that demonstrates how this works. You can make your own copy to poke at it, figure out how it works, and hopefully make use of this process yourself.