"If I asked you to divide 500 by 21 and I asked you whether the answer is greater than one, you would say yes right away," Raghunathan said. "You are doing division but not to the full accuracy. If I asked you whether it is greater than 30, you would probably take a little longer, but if I ask you if it's greater than 23, you might have to think even harder. The application context dictates different levels of effort, and humans are capable of this scalable approach, but computer software and hardware are not like that. They often compute to the same level of accuracy all the time."I got a chuckle as a read this, as I've longed wanted something like this for spreadsheets – not to save power, but rather to save users from their mistaken assumption about the precision of their results. Spreadsheets always make their calculations using high-precision arithmetic, and many, many naive users get fooled into thinking that the resulting numbers are very precise.
I once had a hardware engineer working for me deliver a budget estimate that exhibited this point very nicely. This was someone whose education should have made her less vulnerable to this mistaken notion, but it didn't. She was estimating the cost of a piece of gear we were proposing to build a few thousand of. She created a spreadsheet to help with this estimate. Each row of the spreadsheet had a column for quantity needed, another for the estimated unit cost, and then an extended cost (quantity times unit cost). Then she added up all the extended costs to get a total cost for the whole project. This total cost was a figure like $293,593.92 – accurate to the nearest penny. When I asked her what the accuracy of that estimate was, she answered that it was spot on. Well, of course it wasn't – those unit costs were estimates, and the accuracy of each estimate was different. When I dug into it with her, I discovered that in some cases the estimate was an actual cost, because she had gotten a quotation from the vendor. In other cases, at the opposite extreme, she'd been unable to locate a vendor quickly – and just made up the cost, a pure guess! She also had made no records of the sources for any of these estimates.
I remember all this very well, because I (stupidly!) took her original estimate to my boss and told him that's what it was going to cost him to build this thing. In the end, it cost nearly twice that – entirely because of the inaccuracy of her estimates. But that 8-digit accuracy in the spreadsheet's math really fooled the engineer – she had a lot of trouble wrapping her brain around the difference between the accuracy of the math and the accuracy of her estimates...
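Here is a rough sketch of what that spreadsheet could have carried along with each unit cost. Everything in it is made up for illustration: the line items, the relative uncertainties, and the assumption that the errors are independent (so they combine as a root sum of squares). Guesses that are all too low in the same direction, which is likely what happened to us, would make the real error larger than this suggests.

```python
import math

# Hypothetical rows: (description, quantity, unit cost estimate, relative uncertainty)
ITEMS = [
    ("enclosure (vendor quotation)", 3000, 12.50, 0.02),
    ("circuit board (catalog price)", 3000, 41.00, 0.10),
    ("custom connector (pure guess)", 3000, 8.00, 1.00),
]


def budget_with_uncertainty(items):
    """Return (total, sigma), assuming independent errors on each row."""
    total = 0.0
    variance = 0.0
    for _description, quantity, unit_cost, rel_uncertainty in items:
        extended = quantity * unit_cost
        total += extended
        variance += (extended * rel_uncertainty) ** 2
    return total, math.sqrt(variance)


def round_to_uncertainty(value, uncertainty):
    """Round both numbers to the decimal place of the uncertainty's leading digit."""
    place = math.floor(math.log10(uncertainty))
    return round(value, -place), round(uncertainty, -place)


total, sigma = budget_with_uncertainty(ITEMS)
shown_total, shown_sigma = round_to_uncertainty(total, sigma)
print(f"spreadsheet says: ${total:,.2f}")
print(f"honest version:   about ${shown_total:,.0f} +/- ${shown_sigma:,.0f}")
```

With these invented numbers the one pure guess dominates the uncertainty, and the total is only good to the nearest ten thousand dollars or so, pennies notwithstanding.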
This whole idea also reminds me of one of the things I like about using slide rules: their limited precision. An ordinary handheld slide rule will give you just 3 or 4 digits of precision at best (some of this depends on the user's skill, too). This lack of precision is in your face the whole time you're using slide rules – everything you compute on a slide rule is an estimate with very limited precision. Nonetheless, that level of precision turns out to be plenty good enough for the vast majority of engineering work – something that I've found surprises a great number of younger engineers, who have never in their lives used computing machinery with fewer than 10 or 11 digits of precision. The whole idea of low precision computation is novel to them :)
Here is a related essay that I wrote for our professional journal some years ago and that has largely been incorporated into its instructions to authors:
ESSAY ON NUMERIC PRECISION.
Numeric values for results are commonly reported with too many digits. For example, a study that found that 33 of 100 patients had esotropia might report this as 33.00%. This kind of excess is wrong on two counts. First, it falsely implies that the number (in this case, a percentage) is known with greater precision and accuracy than is true in fact. Second, it makes the study more difficult to read and confuses the reader. It also unnecessarily takes up a small amount of valuable journal space.
The precision of a reported value should not exceed what can be justified by the data. In general, the number of digits in the reported value should not exceed the number of digits in the count of measurements (e.g., for 8 subjects, report one digit of precision; for 80 subjects, two digits; for 800 subjects, three digits).
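As a minimal sketch of this rule of thumb, the number of significant digits to report can be taken from the digit count of the sample size. The function names and the example value 23.5917 are hypothetical, chosen only for illustration, and the sketch covers the general rule alone.

```python
import math


def digits_for_sample_size(n):
    """Significant digits justified by n measurements: the digit count of n."""
    if n < 1:
        raise ValueError("need at least one measurement")
    return len(str(n))


def round_sig(value, sig):
    """Round value to `sig` significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, sig - 1 - exponent)


def report_value(value, n):
    """Round a computed result to the precision the sample size can justify."""
    return round_sig(value, digits_for_sample_size(n))


# The same computed mean, reported for three different sample sizes.
for n in (8, 80, 800):
    print(n, report_value(23.5917, n))
```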
There are a few nuances in numeric precision.
1. By convention for percentages, an exception can be made so that at least two digits are reported for all cases (e.g., two of six subjects would be reported as 33%).
2. A more subtle nuance is the logarithmic nature of the first digit of numeric precision (also see Benford F. The law of anomalous numbers. Proceedings of the American Philosophical Society 1938;78:551–572). When the first digit is close to the number 1 (e.g., 1 or 2), it may well be worth reporting a value with an extra digit of precision (e.g., when reporting the mean refraction of 80 patients, 1.23 D would be preferable to 1.2 D).
3. If the true precision of a reported value exceeds the precision of its clinical significance or its measurability, it is not worth reporting (e.g., reporting the mean alignment deviation of 4300 patients as 23.59 PD: while correct and justifiable in the analysis, this precision is neither measurable nor significant and should not be reported in the conclusion or abstract sections of the paper).
4. I do not consider it worth reporting precision beyond what can be justified by the standard error of the mean (SEM) for a given value. Thus, if the weight of a group of 600 children has a mean of 31.3 kg with a SEM of 2.6 kg, I would report the mean weight as 31 kg.
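The SEM rule in the last item can be sketched as follows. This is one possible reading of it, under my own assumption that the mean is rounded to the decimal place of the SEM's leading digit; the function names are hypothetical.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def round_to_sem(mean, sem):
    """Round the mean to the decimal place of the SEM's leading digit."""
    if sem <= 0:
        raise ValueError("SEM must be positive")
    place = math.floor(math.log10(sem))  # 0 for an SEM of 2.6, -1 for 0.26
    return round(mean, -place)


# The essay's example: mean 31.3 kg with an SEM of 2.6 kg.
print(round_to_sem(31.3, 2.6))   # 31.0
# A ten times smaller SEM would justify keeping the first decimal.
print(round_to_sem(31.3, 0.26))  # 31.3
```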