[This is the third part of the second and final installment of the statistical analysis of the Rasmussen Factor. The remaining parts will appear in the coming days.]

Prices, Statistical Discussion

The 54,244 foals in this group sold for a gross of $2,620,702,526, an average of $48,313, and a median of $13,000. The 3,886 RF qualifiers sold for a gross of $212,132,912, an average of $54,589, and a median of $13,513. The RF average was 12.99 percent above the overall average, and the RF median was 3.95 percent above the overall median. So whichever way you slice it, the RF group should have had better-than-average results based upon the prices for which they sold. It actually had results that were slightly below average.

Most readers can stop right here. The following discussion is not for people who are statistically challenged. It is for readers who enjoy a statistical challenge.

Since the RF average is 12.99 percent higher than the overall average and the RF median is 3.95 percent higher than the overall median, how much better should the RF results have been? The most practical answer is somewhere between 3.95 percent and 12.99 percent. But where exactly? Which figure do you trust more, the average or the median, to correlate with results?

The relationship between prices and results is not linear in the first place. Yes, a group of 1,000 foals that sold for an average of $20,000 will almost invariably have better results than a group of 1,000 foals that sold for an average of $10,000, but they will not be TWICE as good. They might be 10 percent or 20 percent or 30 percent better, but not 100 percent better. This is what I mean when I say that the relationship between price and results is not linear. Higher prices are almost invariably accompanied by better results, especially over a large number of foals sold, but quantifying how much better is difficult, to say the least.

Getting back to averages versus medians, sometimes the two can give conflicting signals. Say a group of five foals sold for $40,000 each. That group will have an average of $40,000 and a median of $40,000. In another group of five foals two sold for $100,000 each and three sold for $20,000 each. That group has an average of $52,000 and a median of $20,000. The second group has a higher average but a lower median than the first group. In that case which group would you expect to produce better results?
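For readers with a computer handy, the two hypothetical groups above can be checked in a few lines of Python. This is only an illustrative sketch; the names group_one and group_two are made up for the example.

```python
from statistics import mean, median

# The two hypothetical five-foal groups described above.
group_one = [40_000] * 5
group_two = [100_000, 100_000, 20_000, 20_000, 20_000]

for label, prices in (("first group", group_one), ("second group", group_two)):
    # mean() is the ordinary average; median() is the middle price.
    print(label, "average:", mean(prices), "median:", median(prices))
```

The second group comes out ahead on average and behind on median, which is the conflict in question.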

I do not know how often such conflicting signals occur in the Thoroughbred marketplace, but I do know that they are perfectly possible, although the larger the number of foals involved, the less likely they are to occur.

The other problem with averages is that above a certain point, say $1,000,000, prices lose all rationality. Was The Green Monkey really worth $16,000,000? I rest my case. At $1,000,000 and above, prices become more a function of human egos than of a rational appraisal of the probable worth of the foal.

For all of these reasons I wanted to develop a single statistic to take the place of averages and medians, or at least to supplement them. “Necessity is the mother of invention.” I decided to invent a statistical device to reduce prices to ONE number that is not skewed by the vagaries of the marketplace (as averages are) and not difficult to determine without a computer for large groups (as medians are). Here is what I did.

For prices up to $10,000, simply divide by 100. Hence $1,000 = 10, $2,000 = 20, etc. For prices between $10,000 and $1,000,000, take the square root. Hence $10,000 = 100, $40,000 = 200, $90,000 = 300, . . . and $1,000,000 = 1,000.

For prices above $1,000,000 I was originally inclined to keep them all at 1,000. Eventually I decided to continue with prices above $1,000,000 but at a slower rate of increase. For example, if I had just continued with square roots, $4,000,000 would equal 2,000 and $16,000,000 would equal 4,000. I decided to give an extra 100 points for each million above $1,000,000. Hence $4,000,000 equals 1,300 (not 2,000) and $16,000,000 equals 2,500 (not 4,000).

So I invented a statistical tool that is sort of a compromise between averages and medians and has the advantage of being calculated like averages (just sum up and divide) and not like medians (which are really a pain if you are not using a computer). I call this statistic the maverage.
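For anyone who does have a computer, the rules above might be sketched in Python as follows. The function names points and maverage are invented for this illustration. One assumption is mine and not stated above: prices that fall between whole millions above $1,000,000 are prorated (so $1,500,000 would score 1,050). The text only gives whole-million examples.

```python
import math

def points(price):
    """Convert one sale price (in dollars) to points under the rules above."""
    if price <= 10_000:
        # Up to $10,000: simply divide by 100.
        return price / 100
    if price <= 1_000_000:
        # $10,000 to $1,000,000: take the square root.
        return math.sqrt(price)
    # Above $1,000,000: 1,000 points plus 100 per additional million.
    # ASSUMPTION: fractions of a million are prorated.
    return 1_000 + 100 * (price - 1_000_000) / 1_000_000

def maverage(prices):
    """Sum the points and divide by the count, just like an average."""
    return sum(points(p) for p in prices) / len(prices)

# Spot checks against the figures quoted above.
for price in (1_000, 10_000, 40_000, 1_000_000, 4_000_000, 16_000_000):
    print(price, points(price))
```

Note that the three pieces meet at the boundaries: $10,000 scores 100 by either of the first two rules, and $1,000,000 scores 1,000 by either of the last two.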

The maverage for all 54,244 sale foals was 149.82. Square that figure (that is, undo the square root) and it corresponds to $22,446. I thought that was interesting because it is closer to the median than to the average. It also suggests that the true midpoint of this marketplace was $20,000-$25,000.

The 3,886 RF qualifiers had a maverage of 156.39, which is 4.385 percent higher than the overall maverage of 149.82. Therefore, I say that the RF qualifiers had a Price Index (PI) of 1.04385. I would expect them to be about 4 percent better than average. They were actually about 2.5 percent worse than average.
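The arithmetic behind these figures can be reproduced in a short Python sketch. The variable names are hypothetical, and the inputs are just the two maverages quoted in the text, not the underlying sale data.

```python
# Maverages quoted in the text.
overall_maverage = 149.82
rf_maverage = 156.39

# Price Index: RF maverage relative to the overall maverage.
price_index = rf_maverage / overall_maverage

# Dollar figure corresponding to the overall maverage. Squaring is only
# the right inverse in the square-root range ($10,000 to $1,000,000),
# which is where 149.82 falls.
implied_price = overall_maverage ** 2

print(round(price_index, 5))   # Price Index, to five decimal places
print(round(implied_price))    # implied price, to the nearest dollar
```

Rounded, these come out to the Price Index of 1.04385 and the $22,446 mentioned earlier.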

I say about 4 percent because the relationship between price and results is not linear. Maverages make the relationship somewhat more linear than averages do. 4.385 percent is closer to 3.95 percent (median) than to 12.99 percent (average). Maverages behave more like medians but can be calculated like averages.

Of course I could have used correlation and regression to determine the exact relationship between prices and expected results for this group of 54,244 sale foals. That is indeed higher math and requires considerably more computing power than I have at my disposal (paper, pen, and calculators).

Roger Lyons said: Yes, you could have used statistical tools that assumed the relationship to be linear, but that would have collided with your insistence that the relationship is not linear. Actually, the relationship between commercial value and performance probably would submit satisfactorily to a linear model. What you mean, I think, is that differences in earnings are not proportional to differences in commercial value. That doesn’t preclude a linear relationship.