My New Favorite Excel Function: INDIRECT
I will confess that somehow I never really learned pivot table mechanics in Excel, so I struggle with three dimensional data. One example might be a spreadsheet with individual tabs where each tab is a different corporate division, and on each tab is a P&L by month (so three dimensions: Month, P&L category, Division). Let's assume the P&L is arranged the same on every page, with, for example, one divisions June's total revenue number in the same cell number as the June total revenue number on every other divisions' tab, ie this field is in cell c8 on every worksheet tab.
Long ago I created a simple way to get a total of a particular cell across all spreadsheets. I would add a spreadsheet tab in the workbook before all the division tabs and another after all the division tabs. Let's say I just name these tabs "Posta" and "Postb". Then the sum of all cells C8 that are located on a spreadsheet tab between tabs "Posta" and "Postb" would be
=SUM(Posta:Postb!C8).
The problem comes when one wants to create a summary worksheet tab that doesn't sum all the values for C8 but summarizes them in a table. Imagine a table where column A is the division name (that matches the name of the tab for that division) and column B is that division's June revenue, ie the value of cell C8 for that division from its individual spreadsheet. The only way I knew how to do this before was manually and tediously.
But laziness is the mother of invention, and I finally encountered a workbook that was so tedious to summarize manually that I had to find another way. I had a spreadsheet of 150 tabs, each worksheet tab being one of our locations containing online customer review scores formatted the same way into the same cells. That is when I found the INDIRECT function. Basically it allows one to craft a custom cell reference in a text string, feed that to the indirect function, which will output the contents of that cell. So if our location names in the first column exactly match the worksheet tab names, then we can write
=INDIRECT("'"&$A3&"'"&"!C$8")
The ampersand symbols are basically text string concatenation operators, and are there to create the text string of a cell reference in the format excel expects. The funny triple quotes is just to add a single quote mark before and after the tab name. This particular string will give us the value of cell C8 that is in the worksheet tab with the name that is in cell A3.
You can also use this cell value from the INDIRECT function in more complicated formulas. For example
=COUNTIF( INDIRECT("'"&$A3&"'"&"!$B$3:$B$500"),D$4)
would look in the spreadsheet tab whose name is in A3 and on that tab count all the values in the range B3 to B500 on that tab that have the value given in cell D4. For example, if D4 is equal to "5" we could be counting all the reviews that had a score of "5".
Postscript: This may be my record for the blog post with the niche-iest audience. Mainly it is aimed at my son, who has the enviable job of being an analyst for a craft beer company in La Jolla. He has learned not to complain much to me about his job, as my first job was in an oil refinery in Baytown, Texas, so my sympathy level is maybe lower than it should be. Anyway, as part of a geeky family, he and I compete on Excel knowledge so this post is mainly my way of counting coup on him.
Update: I totally agree with the comments that a relational database is needed. Unfortunately, at the time I did this work, we had only within weeks been given access to the review data from the government recreation reservations database, and it was all in excel. Faster to do the analysis in excel than to figure out how to read the sheets into something like mysql (of which I am positive there are a million simple tools for doing so). But I will accept it as a challenge for this year.