Monday, 19 March 2012

Income distributions and Additional Rate taxpayers

Richard Murphy (whom I really ought to leave alone after this post) notes that the HMRC tables of income tax projections imply a 9.1% increase in taxpayers with incomes over £150k in 2011-12 compared with 2010-11, whereas the numbers had been fairly flat over the previous four years.  On this evidence he accuses the Treasury of "fraudulent manipulation of data".  (He seems to use the terms 'Treasury' and 'HMRC' interchangeably.)

Fraud, whether intended in a legal sense or not, is a serious charge.  Murphy ought to be more considerate of HMRC's employees: one reason is that many of them are members of trade unions affiliated to the TUC, for which he wrote his recent report.

It's also an implausible charge.  What are the incentives for such a fraudulent manipulation of data?  It's hard to see how they could outweigh the disincentive of potentially being found out and sacked.  I very much doubt that there is the sort of culture of dishonesty at HMRC that would make a fraudster feel safe.

Murphy says that the projected increase "looks wrong" and (in the comments) "is utterly implausible".  I don't share his faith in his gut instincts: I'd rather take a look at the data.  I suppose that HMRC has got a model which starts with an income distribution and applies some income inflation (which need not be the same at all income levels) and some increase or decrease in numbers (which again need not be the same at all income levels).  So I've had a try at reproducing the distribution for 2010-11.

As Murphy reports, "It is assumed that 9.1% more people come into this bracket that year, and earn 9.6% more as a result".  That is, there is almost no change in the average income of people in the bracket.  That's characteristic of the Pareto (power-law) distribution often used to model the upper tail of incomes.  If one assumes that all incomes over £150k follow such a distribution, then to fit the average of those incomes, which is £344k, one needs a power of 1.773.  This gives the following results:
Range (£)   Actual number   Modelled number
150k-             146,000           131,000
200k-             143,000           158,000
500k-              26,000            27,000
1mn+               13,000            11,000
It's not a perfect fit, but it's not terrible. Extending the distribution backwards by another 30,000 taxpayers (i.e. the 9.1% growth) I get to an income of £142.8k. So an income growth of 5.04% would be required to give the projected growth in the number of taxpayers with incomes over £150k.
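
For anyone who wants to check the arithmetic, the Pareto calculation above can be reproduced in a few lines of Python. This is only a sketch: it assumes a pure Pareto tail starting at £150k, takes the 9.1% growth figure as quoted, and the function names are illustrative rather than anything HMRC uses.

```python
# Bracket data quoted in the post (2010-11, incomes over £150k).
LOWER_BOUNDS = [150_000, 200_000, 500_000, 1_000_000]
ACTUAL = [146_000, 143_000, 26_000, 13_000]
MEAN_INCOME = 344_000   # average income above the threshold, as quoted
THRESHOLD = 150_000


def pareto_alpha(mean, x_min):
    """Pareto index whose mean above x_min equals `mean` (needs mean > x_min)."""
    # mean = alpha * x_min / (alpha - 1)  =>  alpha = mean / (mean - x_min)
    return mean / (mean - x_min)


def survival(x, x_min, alpha):
    """Fraction of the population above x_min whose income also exceeds x."""
    return (x_min / x) ** alpha


def modelled_counts(total, bounds, alpha):
    """Expected taxpayers in each bracket; the last bracket is open-ended."""
    s = [survival(b, bounds[0], alpha) for b in bounds] + [0.0]
    return [total * (s[i] - s[i + 1]) for i in range(len(bounds))]


def threshold_for_growth(growth, x_min, alpha):
    """Income level above which the population is (1 + growth) times larger."""
    return x_min / (1 + growth) ** (1 / alpha)


alpha = pareto_alpha(MEAN_INCOME, THRESHOLD)
counts = modelled_counts(sum(ACTUAL), LOWER_BOUNDS, alpha)
cutoff = threshold_for_growth(0.091, THRESHOLD, alpha)
implied_growth = THRESHOLD / cutoff - 1

print(f"alpha = {alpha:.3f}")
for lo, actual, model in zip(LOWER_BOUNDS, ACTUAL, counts):
    print(f"{lo:>9,}+  actual {actual:>7,}  modelled {model:>9,.0f}")
print(f"cutoff = £{cutoff:,.0f}, implied growth = {implied_growth:.2%}")
```

The index comes straight from the Pareto mean formula, so the only inputs are the £150k threshold, the £344k average and the four bracket counts.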

The literature on income distributions suggests that they are well fitted by a power-law distribution in the upper tail and by a lognormal distribution elsewhere.  The power-law distribution does seem to be favouring the £200k+ range over the £150-200k range, so I tried fitting a lognormal distribution to the data, and got this:
Range (£)   Actual number   Modelled number
6475-             933,000           951,000
7500-           2,610,000         2,808,000
10k-            6,380,000         6,122,000
15k-            5,160,000         5,267,000
20k-            6,910,000         6,923,000
30k-            5,690,000         5,484,000
50k-            2,120,000         2,233,000
100k-             344,000           198,000
150k-             146,000            30,000
200k-             143,000             9,000
500k-              26,000                27
1mn+               13,000                 0
It's about the right shape except for the high-end tail, where it falls off too quickly.
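
A lognormal fit of this kind can be sketched with nothing but the standard library. The version below is one possible approach, not necessarily the procedure behind the table: it assumes an ordinary least-squares fit to the band counts, a coarse grid search over the two lognormal parameters, and a free overall scale (so the model may place some people below £6,475). The fitted numbers will therefore not match the table exactly.

```python
import math

# Lower bounds of the income bands (£) and the actual counts from the table.
BOUNDS = [6_475, 7_500, 10_000, 15_000, 20_000, 30_000, 50_000,
          100_000, 150_000, 200_000, 500_000, 1_000_000]
ACTUAL = [933_000, 2_610_000, 6_380_000, 5_160_000, 6_910_000, 5_690_000,
          2_120_000, 344_000, 146_000, 143_000, 26_000, 13_000]


def lognormal_cdf(x, mu, sigma):
    """P(income <= x) when log(income) is normal with mean mu and s.d. sigma."""
    return 0.5 * (1 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2))))


def band_probabilities(mu, sigma):
    """Probability mass in each band; the top band is open-ended."""
    cdf = [lognormal_cdf(b, mu, sigma) for b in BOUNDS] + [1.0]
    return [cdf[i + 1] - cdf[i] for i in range(len(BOUNDS))]


def fit_lognormal():
    """Grid search over (mu, sigma); the scale is solved in closed form."""
    best = None
    for i in range(900, 1101):        # mu from 9.00 to 11.00 in steps of 0.01
        for j in range(30, 151):      # sigma from 0.30 to 1.50 in steps of 0.01
            mu, sigma = i / 100, j / 100
            p = band_probabilities(mu, sigma)
            # Least-squares scale for fixed (mu, sigma).
            scale = sum(a * q for a, q in zip(ACTUAL, p)) / sum(q * q for q in p)
            sse = sum((a - scale * q) ** 2 for a, q in zip(ACTUAL, p))
            if best is None or sse < best[0]:
                best = (sse, mu, sigma, scale)
    return best[1:]


mu, sigma, scale = fit_lognormal()
modelled = [scale * q for q in band_probabilities(mu, sigma)]

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}, scale = {scale:,.0f}")
for lo, actual, model in zip(BOUNDS, ACTUAL, modelled):
    print(f"{lo:>9,}+  actual {actual:>9,}  modelled {model:>11,.0f}")
```

Because the squared error is dominated by the large middle bands, a fit like this matches the bulk of the distribution and more or less ignores the top of it, which is why the modelled high-end counts collapse.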

Finally, I tried fitting the sum of a lognormal distribution and a Pareto tail applying from 1k up:
Range (£)   Actual number   Modelled number
6475-             933,000           923,000
7500-           2,610,000         2,763,000
10k-            6,380,000         6,116,000
15k-            5,160,000         5,308,000
20k-            6,910,000         6,992,000
30k-            5,690,000         5,505,000
50k-            2,120,000         2,195,000
100k-             344,000           362,000
150k-             146,000           109,000
200k-             143,000           138,000
500k-              26,000            36,000
1mn+               13,000            27,000
It's not unexpected that simply adding distributions doesn't work perfectly, and one wouldn't in any case expect a perfect fit to real-world data (I suppose that HMRC data for 2010-11 are at least influenced by reality).  Anyway, for this fit I need to go down to an income level of £142k to get an extra 30,000 taxpayers, implying an income growth rate of 5.7%.
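
The step from a cutoff income to an implied growth rate is just a ratio, assuming incomes near the threshold all grow at the same rate. A short sketch follows, with illustrative function names. Note that the cutoffs quoted in the text are rounded: £142k gives about 5.6%, and 5.7% corresponds to a cutoff nearer £141.9k, which rounds to the same figure.

```python
THRESHOLD = 150_000  # additional-rate threshold in pounds


def implied_growth(cutoff, threshold=THRESHOLD):
    """Uniform income growth that lifts `cutoff` exactly to `threshold`."""
    return threshold / cutoff - 1


def cutoff_for_growth(growth, threshold=THRESHOLD):
    """Inverse: the income that a given growth rate lifts to `threshold`."""
    return threshold / (1 + growth)


print(f"{implied_growth(142_800):.2%}")      # Pareto-only fit
print(f"{implied_growth(142_000):.2%}")      # lognormal-plus-Pareto fit, rounded cutoff
print(f"£{cutoff_for_growth(0.057):,.0f}")   # cutoff consistent with 5.7% growth
```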

So having fitted the data two ways, we find an implied income growth rate for high earners of between 5.0% and 5.7%.  How implausible is that?  CPI annual growth peaked at 5.2% in September 2011 and RPI annual growth at 5.6%.   Private sector wage inflation peaked at 3.4% in June 2011.  However, income inflation is not uniform, and the CEBR reports that bankers' salaries have risen faster.  Overall, I would be somewhat surprised if HMRC were proved right about the numbers with incomes over £150k in 2011-12, because it's not obvious why this year should be different from previous years.  But I think it would be easy to create a plausible model of income distributions and growth that reproduces HMRC's prediction.  HMRC may be wrong, but there is no reason to think it dishonest.

Oh, and Murphy should stop telling people "you really do have to improve your maths".
