Yet Again, Forgetting the Mix
I like reading Zero Hedge, though their laudable cynicism about government and financial markets sometimes edges into conspiracy theory.
Anyway, I wanted to highlight something in a post there today about BLS data. Various writers at the site have claimed for years that government economic data is being manipulated. I am not sure I buy it -- I distrust government a lot but am not sure their employees could sustain such a fraud over months and years. And besides, once you manipulate data one time to juice some metric, you have to keep doing it or the metric just reverses the next month. Corporations that play special quarter-end inventory games to increase reported sales learn this very quickly. Where there are apparent errors, I am much more willing to assume incompetence than conspiracy.
The example this week is from the BLS payrolls data, and I will quote from the article and show their chart:
Another way of showing the July to August data:
- Goods-Producing Weekly Earnings declined -0.8% from $1,118.68 to $1,109.92
- Private Service-Providing Weekly Earnings declined -0.1% from $868.80 to $868.18
- And yet, Total Private Hourly Earnings rose 0.2% from $907.82 to $909.19
What the above shows is, in a word, impossible: one can not have the two subcomponents of a sum-total decline, while the total increases. The math does not work.
Certainly this is an interesting catch, and if I were producing the data I would take these observations as a reason to check my work. But the author is wrong to say that this is "impossible". The reason is that these are not, as he says, two sub-components of a sum. They are two sub-components of a weighted average. Total private average weekly earnings is the goods-producing weekly average times the number of goods-producing hours, plus the service-providing weekly average times the number of service-providing hours, all divided by the total combined hours.
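Written out, the weighted average described in words above looks roughly like this. The symbols are my own shorthand, not BLS notation: E is an average weekly earnings figure, H is the weight the text describes as hours, and the subscripts g and s stand for goods-producing and service-providing.

```latex
E_{\text{total}} = \frac{E_g H_g + E_s H_s}{H_g + H_s}
                 = w\,E_g + (1 - w)\,E_s,
\qquad w = \frac{H_g}{H_g + H_s}
```

Here w is the goods-producing share of the mix. Because the total depends on w as well as on the two averages, a change in w can move the total even when both averages fall.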
From this I hope you can see that even if both sub-averages go down, the total average can go up if the weights change. Specifically, the total average can still go up if there is a mix shift from service-providing to goods-producing hours, since the average weekly wages of the latter are much higher than those of the former. I will confess it would have to be a pretty big jump in mix. The percentage of goods-producing hours would have to rise from 15.6% to almost 17%, which strikes me as a very large jump for one month. So I am not claiming this is what happened, but people miss mix changes all the time. I had to explain it constantly back in my corporate days. Another example here.
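For anyone who wants to check the mix arithmetic, here is a minimal sketch in Python. It assumes the total is a simple two-component weighted average and backs out the goods-producing share implied by the quoted earnings figures. The function names (`implied_goods_share`, `blended_average`) are made up for this illustration, and the shares are inferred from the published averages, not read from any BLS table.

```python
def implied_goods_share(total, goods, services):
    """Solve total = w * goods + (1 - w) * services for the goods share w."""
    return (total - services) / (goods - services)


def blended_average(goods, services, share):
    """Weighted average of the two components for a given goods share."""
    return share * goods + (1 - share) * services


# Average weekly earnings quoted in the Zero Hedge excerpt (dollars).
july = {"total": 907.82, "goods": 1118.68, "services": 868.80}
august = {"total": 909.19, "goods": 1109.92, "services": 868.18}

w_july = implied_goods_share(**july)
w_august = implied_goods_share(**august)

# What the August total would have been if the mix had stayed at July's share.
august_at_july_mix = blended_average(august["goods"], august["services"], w_july)

print(f"Implied goods share, July:   {w_july:.1%}")
print(f"Implied goods share, August: {w_august:.1%}")
print(f"August total at July's mix:  {august_at_july_mix:.2f}")
```

With the mix held at July's share, the August total comes out below July's $907.82, which is the result the Zero Hedge author expected. The reported increase only works if the implied goods share rises by a bit more than a percentage point.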
mharris717:
Could also involve a decline in hours worked, independent of any hourly salary change, right?
October 6, 2017, 9:06 am
mharris717:
Putting aside the specifics here, looking for a change in the mix is something I remember to do sometimes now, after reading you talk about it several times. Thanks!
October 6, 2017, 9:07 am
cc:
Some further examples of the mix:
Stocks: The DOW for example is NOT the same set of stocks over long periods. Companies that go out of business or shrink are dropped and new companies are added in. This is NEVER mentioned when people show long term histories. Most of the companies in the DOW in 1950 are not in it now.
Average income: retired people generally have a lower income, and part of their income is social security, which is not wages, so their wage income goes to zero. If they are included in national income statistics and there are more old people, it will look like wages/income are going down. But old people are not trying to send kids to college or save for retirement and may have paid off their house, so this is doubly misleading.
October 6, 2017, 9:20 am
brandonberg:
Worth pointing out that this phenomenon (change in a weighted average having a different sign than the change of all its components due to a change in the relative size of the components) is called Simpson's Paradox.
October 6, 2017, 10:05 am
Mercury:
It's more significant that the labor participation rate has plunged to around 1970s levels, most new jobs are crappy, service sector jobs like waiters/fast food but official unemployment rate is at record lows.
October 6, 2017, 10:27 am
davidcobb:
Heard on the news this morning that Ind./Mnfg. job growth has avg. around 7000 per mon. for the past 10 yrs. The last three months have avg. 48000.
October 6, 2017, 1:05 pm
mlhouse:
The BLS reported a loss of 33,000 jobs in September 2017 based on their survey of jobs.
But if you look at their own data https://www.bls.gov/web/empsit/cpseea04.htm
and total up the subtotals, the number of people employed increased by 1.283 million. People unemployed declined by 577. The difference being new people in the labor force that were added to the number of employed.
Again, looking at the BLS's own stats, there are 3.155 million more people employed this September vs. last September, and 1.368 million fewer unemployed. Almost all of these increases happened after January, 2017.
Ok, I understand that the job report is based on survey data but how does an increase in employment of 1.3 million people correspond to a -33 job creation? With more than 3 million more people employed since Jan 1, 2017 why is the BLS reporting much less? That is an average of 333,000 jobs "created" a month, yet there hasn't been a single report of these numbers. Here is how the BLS reports the month to month job changes https://www.bls.gov/web/empsit/ceseesummary.htm: 216 232 50 207 145 210 138 169 -33.
So, the average number of jobs created based on their own data is 333,000. The total "job report" shows 1.134 million "jobs created", or an average of 126,000 per month.
If you look at the last four months of 2016 from the first table, the difference in employment from Sept to December 2016 is 180. Yet, the BLS reports job gains of 249, 124, 164, 155. That is almost 4 times the increase in employment that their own data suggests.
It would be different if the discrepancies were not coming from their own data. Maybe there is something magical in their survey compilation. But then why the huge discrepancies with their own data?
October 6, 2017, 6:35 pm
Q46:
The vast quantity of data and number of sources mean it can never be accurate or complete. Such reports are by their nature fiction, manipulating them could not improve on such fiction.
October 7, 2017, 4:34 am
DaveK:
...sometimes edges into conspiracy theory.
Sometimes? Well, perhaps the articles posted there "sometimes" tilt that way, but the comments section of virtually any ZH article can be counted on to promote that vibe. Sometimes it's entertaining to read, but at least for me the entertainment value wears thin very quickly.
October 7, 2017, 8:32 am
MW:
Hi Warren,
Reading the same article I had the same thought, went through the same calculations and came to the same conclusion - both on the possibility of this being an example of Simpson's paradox and as a reminder to resist ZH's tendency to spin conspiracies out of incompetence.
But only now do I realise that the reason it struck me so quickly was because you've written on this before.
It's a small thing, but as a reader for the last few years I want to take the opportunity to thank you for writing here - it's helped me be both a little smarter and a little wiser, and for that you have my gratitude.
Cheers!
October 8, 2017, 12:27 pm
marque2:
It could be the seasonal adjustments. They seasonally adjust the numbers, because there are certain employment peaks and troughs in farm and nonfarm labor. This time of year employment picks up for harvest activities; in November and December, there is a massive increase in employment for holiday sales. In January there is a natural drop because the holidays are over. BLS tries to zero out these "seasonal effects", so you could see raw data showing a massive employment bump in December, and still have a job loss shown in the official employment stats.
In recent years, these adjustments have been out of whack with reality.
Also note, you are confusing some things. The September jobs report is for net changes in September over August, not since the beginning of the year.
October 9, 2017, 8:35 am
mlhouse:
The data I am looking at is seasonally adjusted.
October 9, 2017, 9:32 am
marque2:
Good to know.
October 9, 2017, 6:27 pm
Just Thinking:
I have not studied the current reports, but I think the answer could be in your observation that the survey data is different than the payroll data. About 55000 households are surveyed, and the unemployment rate is determined from them. In comparison, the payroll data comes from Form 790 which employers are supposed to fill out for unemployment insurance reasons. Many companies do not file form 790 on a timely basis in the best of circumstances; some do not know about it; and some do not get around to it. With the hurricanes, it is possible that some companies had other things on their minds besides filing the form. Meanwhile, the household survey may provide a better picture of the employment / unemployment situation.
October 9, 2017, 7:34 pm
mlhouse:
But yet they publish the data anyways. As an employer I can tell you there is limited labor available.
October 9, 2017, 7:43 pm
Just Thinking:
I am not sure what your point is when you say that "they publish the data anyways." Yes, they did, and they will publish it every month. And they will revise it the following month. It would not surprise me if the 33000 decrease turns into an increase when the revision comes out next month.
Although I do not employ people, all the anecdotal and survey information I have agrees with your statement that labor availability is limited. Unskilled labor is going for $5 to $8 above minimum wage; signs are all over the place asking for help; and the want ads on the internet are ubiquitous.
October 10, 2017, 8:31 pm
mlhouse:
Well, if it isn't legitimate data, they publish it anyways. I personally think there is a lot of political manipulation in the data done by these agencies. How can one measure go up by more than a MILLION and the other measure go down by 33,000?
October 11, 2017, 6:38 am
Robert Rounthwaite:
"Simpson's paradox, ... is a phenomenon in probability and statistics, in which a trend appears in different groups of data but disappears or reverses when these groups are combined."
https://en.wikipedia.org/wiki/Simpson%27s_paradox
October 11, 2017, 5:49 pm
mharris717:
This was another interesting example recently: https://twitter.com/lukebornn/status/917835676095148032
October 13, 2017, 1:53 pm