Monday, January 3, 2011

Be Very Afraid Of Floating Point Numbers

What does this statement in the Help mean, exactly?

Values held as DOUBLE are accurate to 15 significant digits, but may be subject to rounding error beyond the fifteenth digit.
In the classroom or laboratory, it probably means this:
Values held as DOUBLE are accurate to 15 significant digits, but may be subject to rounding error beyond the fifteenth digit.
In the real world, where people write programs to do real things with real data, it means this:
Values held as floating point cause all sorts of problems and should be avoided at all costs. When you can't avoid them, convert them. When you can't convert them, don't trust them: Test them, check them, and fix them when you find errors in the data. Do it in your code, don't wait for your users to notice drifting data six months down the road.
Here's an example that doesn't involve any arithmetic, so we can't blame the algorithms:
  • An application written using Microsoft Access has a large number of columns with the "Data Type" defined as "Number" and the "Field Size" set to "Double" instead of "Long Integer" or "Decimal" or something safe and manageable.

  • One of these columns is the primary key for a table with over 3 million rows.

  • The SQL Anywhere 12 migration wizard is used to convert the Access database into a SQL Anywhere database, and it dutifully defines the primary key as DOUBLE... after all, what else can it do? That's what Access says it is.

  • After migrating, most of the primary key values look like integers; they have values like 11, 21 and 3208547315.

  • However, over a million rows have values that look like this: 1020661200.9999999, 1020661224.0000001, 3108547316.0000005 and 3108547313.9999995.
Yes, those are all way out past the fifteenth digit.
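To see just how far past, here's a quick check (Python here, purely as an illustration; a Python float is the same IEEE 754 double, and the value is one of the actual keys above):

```python
# 1020661200.9999999 has 17 significant digits -- two past the
# 15-digit guarantee quoted from the Help.
x = 1020661200.9999999

print(x == 1020661201)  # False: not the integer it was presumably meant to be
print(round(x))         # 1020661201 -- the nearest integer
```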

I don't care where the problem comes from: Access, SQL Anywhere, or the migration process. This kind of crap always happens when you mess with floating point, and in this case I just want it expunged, excised, gone, done with.

I'm not going to take a chance with a situation like this...

SELECT ( SELECT COUNT(*) FROM t WHERE pkey - TRUNCNUM ( pkey, 0 ) = 0 ) AS "Integers",
       ( SELECT COUNT(*) FROM t WHERE pkey - TRUNCNUM ( pkey, 0 ) <> 0 ) AS "Not Integers";

 Integers  Not Integers
1,917,541     1,252,134
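The same integer-valued test is easy to mimic outside the database; a rough Python equivalent of the TRUNCNUM predicate above (an illustration only, not part of the original application):

```python
import math

def is_integer_valued(x):
    # Mirrors the SQL predicate: pkey - TRUNCNUM ( pkey, 0 ) = 0
    return x - math.trunc(x) == 0

print(is_integer_valued(3208547315.0))        # True
print(is_integer_valued(1020661200.9999999))  # False
```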

I'm going to bite the bullet, do this...

UPDATE t
   SET pkey = ROUND ( pkey, 0 )
 WHERE pkey - TRUNCNUM ( pkey, 0 ) <> 0;

1252134 row(s) updated
Execution time: 272.574 seconds

Then I'm going to UNLOAD that table, change the data type from DOUBLE to BIGINT, and reload.

Or do an ALTER TABLE; that might be fun.

Either way, no more DOUBLE.
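As a sanity check on the whole repair: the four ugly values from earlier round cleanly to integers, and (being far below 2**53) those integers round-trip exactly through a double, so nothing is lost by the ROUND-then-BIGINT conversion. A quick Python sketch:

```python
bad = [1020661200.9999999, 1020661224.0000001,
       3108547316.0000005, 3108547313.9999995]

fixed = [int(round(v)) for v in bad]
print(fixed)  # [1020661201, 1020661224, 3108547316, 3108547314]

# Every repaired key survives a trip back through double unchanged.
print(all(float(k) == k for k in fixed))  # True
```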

No more fear.


Rick Regan said...

What are your examples (1020661200.9999999, 1020661224.0000001, 3108547316.0000005 and 3108547313.9999995) supposed to be -- 1020661201, 1020661224, 3108547316 and 3108547314? Those are integers that are exactly representable in a double.

Breck Carter said...

I'm not sure I get your point. Yes, 1020661201 is exactly representable as a double. But it's not equal to 1020661200.9999999, and that's MY point. I believe the creator of the original data intended for it to be 1020661201, and if an integer or decimal data type had been used, that would have been the case. But with floating point stuff like this often happens. FWIW this is data I am DEALING with, not data I designed or entered.

Rick Regan said...

My point is that, if the user entered only integers, like 1020661201, and they were stored as doubles, you wouldn't see the problem you're seeing. Maybe the user really entered some floating-point values?

Breck Carter said...

I seriously doubt that anyone typed .9999999 or .0000001 millions of times. This is real, production data. I don't know how the doubles got messed up (my guess is Microsoft Access as the culprit). But, in my opinion, that's not important. The real mistake was using floating point in the first place, because it allows this kind of imprecision.

Breck Carter said...

Two more points: All the values DISPLAY as integers in Microsoft Access, and the stored values have more than the 15 digits of precision that double supports (at least, on the SQL Anywhere side, according to the docs). That makes it highly unlikely that any human entered this kind of garbage.

Rick Regan said...

OK, so the data are the results of calculations, but you expect them to be integers. Why can't you round them to 15 decimal digits and re-store them as doubles? You'd have to round them anyway to convert to BIGINT, wouldn't you?

Breck Carter said...

I doubt very much that any application program logic calculated these numbers... they are primary key "identifier" values... but I have no idea, and I am not making any assumptions. Yes, of course they have to be rounded to be stored as bigints. Once stored as bigints, however, this symptom (however caused) can never happen again. And that is my point... floating point data types allow this kind of behavior; integers and decimals do not. Floating point data types have their uses, but those uses are rare in business... and it is very dangerous to use floating point when absolute precision is required (counters, money, etc.).
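The money example is the classic demonstration; in any IEEE-double language (Python here, for illustration) the failure shows up immediately, and a decimal type makes it go away:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly...
print(0.1 + 0.2 == 0.3)  # False

# ...but a decimal type stores and adds them exactly.
print(Decimal("0.10") + Decimal("0.20") == Decimal("0.30"))  # True
```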

Breck Carter said...

Clarification: When I say "calculated these numbers" I mean "performed some operation involving operators like * and / and cos and tanh". It is possible that an application program used something like i = i + 1 to come up with a new value.
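(That distinction is easy to demonstrate in Python, whose float is an IEEE double: pure integer increments on integer-valued doubles stay exact below 2**53, while fractional arithmetic drifts:)

```python
# Incrementing an integer-valued double by 1 stays exact (below 2**53)...
i = 1020661200.0
i = i + 1
print(i == 1020661201)  # True

# ...but ten additions of 0.1 do not land on 1.0.
x = 0.0
for _ in range(10):
    x += 0.1
print(x == 1.0)  # False
```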

Anonymous said...

I fully agree that double (or any other floating point type) should never be used with key columns. You don't talk about any existing FKs to these double values - so be glad if there aren't any...

That being said, double does have its uses IMHO. And for several APIs, it's far easier to map double to a corresponding programming-language datatype than to map a decimal type - think of ODBC and C/C++, where the double representation is identical in both worlds whereas there's no standard decimal/numeric type in C/C++. But of course, that's not the best choice for values that have to be exact. No doubt.


Anonymous said...

By the way, what's the link between double and the Foxhound display image?

Ah, I see: Foxhound is gonna deliver *exact* values, not estimates:)


Breck Carter said...

@Volker: The Foxhound screenshot is showing how the UPDATE...ROUND statement took a long time to run and grabbed 1.2 million row locks... as in "bite the bullet" :) As for foreign keys, there are no explicit foreign keys, but there ARE logical relationships. I have not investigated how to deal with them yet... I fully expect more problems, but at least I know what to look for.

Rick Regan said...

I'm not arguing that doubles are a good choice for a business app or for primary keys. I was just arguing that doubles might not be causing this particular problem. I don't see how these variables became "corrupted" by floating-point without calculations being done on them (if stored in doubles as integers, < 2^53, then left alone, they would still be integers!)
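(The 2^53 boundary mentioned here is easy to verify in Python, whose float is an IEEE double; this is an illustration, not part of the application in question:)

```python
# Every integer with magnitude up to 2**53 is exactly representable...
print(float(2**53) == 2**53)      # True
# ...but 2**53 + 1 is the first integer that is not: it rounds to 2**53.
print(float(2**53 + 1) == 2**53)  # True
# The 10-digit keys in question are far below that limit.
print(3108547316 < 2**53)         # True
```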

Breck Carter said...

In spite of my earlier guesswork, I too have no idea how these values found their way into the Access database. However, I have confirmed that yes, in fact, those exact values are stored. Microsoft Access can only display them as rounded integers, but when you do an Access filter "Equals..." you don't get any hits on the rounded integers, only on the exact numbers with the .0000001 etcetera. This confirms the error was introduced earlier, in the Access database. My point is that such an error would not have occurred, COULD not have occurred, if an integer or decimal data type had been used.