2010-04-28

Why Floating-Point Arithmetic is Just Wrong?!

...or why 0.1 + 0.1 + 0.1 - 0.3 is NOT equal to zero!

I know that floating-point operations in computer languages are inherently wrong! In fact, every seasoned developer knows this, often with some painful memories of bug-hunting sessions in the past. But newcomers might be surprised by this behavior.

So, let me first explain what is wrong with floating-point numbers. Let's use Python as our guinea pig:


>>> f1 = 0.1
>>> f2 = f1+f1+f1
>>> f3 = 0.3
>>> f4 = f3 - f2
>>> f4
-5.5511151231257827e-017

Oops! Not what one expects, I guess. The reason for the above result can be traced back to this:

>>> f = 0.1
>>> f
0.10000000000000001

So, the 'float' type cannot hold the value 0.1 with perfect precision. What it actually stores is 0.1000000000000000055511151231257827021181583404541015625, but Python's string representation shows only the first 17 significant digits.
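
If you want to see what is actually stored, you can ask for more digits than repr() is willing to show (the exact output of this formatting can vary between Python versions and platforms, but a recent CPython prints the stored value in full):

>>> print '%.55f' % 0.1
0.1000000000000000055511151231257827021181583404541015625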
 
This is not a bug, but a 'feature' of how floats are represented. To store real numbers in a limited amount of memory, the float type uses a special bit layout. The details are not that important here, but if you want to know more you can check out this excellent explanation and this Wikipedia article.

IEEE 754 double-precision floating-point format
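
If you are curious, you can even peek at those 64 bits yourself; the hexadecimal value below packs the sign bit, the 11-bit exponent and the 52-bit fraction shown above (just an illustrative snippet using the standard struct module):

>>> import struct
>>> bits = struct.unpack('<Q', struct.pack('<d', 0.1))[0]   # reinterpret the 8 bytes of a double as a 64-bit integer
>>> '%016x' % bits
'3fb999999999999a'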

What counts is that floating-point precision is not perfect, and tiny inaccuracies can build up over a long sequence of operations. A related consequence is that one cannot and should not compare two float numbers for exact equality; compare within a small tolerance, or use '>' and '<' comparisons, instead of a plain '==' to avoid ugly bugs.
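
For example, the usual trick is to test whether two values are 'close enough' rather than identical (the tolerance of 1e-9 below is an arbitrary choice for illustration; pick one that suits your data):

>>> a = 0.1 + 0.1 + 0.1
>>> b = 0.3
>>> a == b
False
>>> abs(a - b) < 1e-9   # compare within a small tolerance instead of '=='
True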

Another important point is that as the numbers get bigger in value, less precision is left for the fractional part, because the mantissa has only a finite number of bits (53 for a double) to share between the integer and fractional parts.
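
You can see this once the values grow beyond what 53 bits can represent exactly (roughly 9*10**15); at that point adding 1 simply vanishes:

>>> 1e16 + 1 == 1e16   # the nearest representable doubles are 2 apart at this magnitude
True
>>> 1e16 + 2 == 1e16
False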

So, that's a quick background on the subject. The reason I am thinking about this now is that my viewshed analysis code is fully working, and I am trying to assess whether I should have used alternative, more precise methods instead of floating-point arithmetic. This led to some research on the subject in the Python world, as a newbie. I'm not surprised that Python has a 'decimal' module for arbitrary-precision arithmetic.

Python's Decimal class provides a solution for precision needs where 'float' is not acceptable. So, what are the reasons to use or not to use 'float'? We can quickly work out a list for that:

When to use 'FLOAT'
  • When very high precision levels are not needed
  • When exact equality between calculated values is not checked for
  • When the numbers are not huge in value
  • When speed is more important than accuracy
When NOT to use 'FLOAT' (and use 'DECIMAL')
  • When a literal value must be represented by exactly the same number
  • When very high precision levels are needed
  • When very big numbers must be represented
  • When the precision level should be adjustable or fixed to a certain number of digits
  • When rounding must follow exact rules based on significant digits
  • When doing financial calculations involving money values
  • When accuracy is more important than speed
I will not go into the details of how to use Python's decimal module in this post; its documentation in the standard library is a good starting point about multi-precision arithmetic for the curious.
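
Still, just to give a quick taste, here is a minimal sketch using the standard decimal module (the precision of 30 digits below is an arbitrary choice for illustration):

>>> from decimal import Decimal, getcontext
>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
Decimal('0.0')
>>> getcontext().prec = 30          # the working precision is adjustable
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857')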

So, should I use Decimal in Python? Based on the above criteria, the answer is 'no, not really', because the algorithm is already running very slowly. Still, I plan to implement the Decimal calculations as an alternative option, to see whether the differences are big enough to matter. I'll let you know about the results, in terms of both speed and accuracy.

MadChuckle

summary: 'perfection is a b*tch!'
mood: let down
