How does one calculate the average of two integers, say i and j? Trivial, you would say: it is (i + j) / 2. Mathematically, that's correct, but with fixed-width integers in C-based languages (like Java), the sum can overflow when i and j are both very large or both very small (large negative values). Many other languages, like Lisp and Python, do not have this problem because their integers grow to arbitrary precision. Avoiding overflow when using fixed-width integers is important, and many subtle bugs occur because of this problem.
In his popular blog post [1], Joshua Bloch (Java expert and author of books on Java intricacies) writes about how a bug [2] in the binarySearch and mergeSort algorithms was found in his code in the java.util.Arrays class in the JDK. It read as follows:
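For reference, a binary search of the kind discussed here looks roughly like the following sketch. It is a reconstruction for illustration, not a verbatim copy of the JDK source; the class name BinarySearchSketch is made up, and the overflow-prone statement is flagged with a comment.

```java
// Illustrative reconstruction; not the verbatim JDK source.
final class BinarySearchSketch {
    // Returns the index of key in the sorted array a,
    // or -(insertionPoint + 1) if key is absent.
    static int binarySearch(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;

        while (low <= high) {
            int mid = (low + high) / 2;   // overflow-prone midpoint discussed in the text
            int midVal = a[mid];

            if (midVal < key) {
                low = mid + 1;
            } else if (midVal > key) {
                high = mid - 1;
            } else {
                return mid;               // key found
            }
        }
        return -(low + 1);                // key not found
    }

    public static void main(String[] args) {
        int[] sorted = {2, 3, 5, 7, 11, 13};
        System.out.println(binarySearch(sorted, 7));   // index of 7
        System.out.println(binarySearch(sorted, 4));   // negative: 4 is absent
    }
}
```

On small arrays like the one in main, the code behaves correctly, which is part of why the flaw can go unnoticed for so long.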
The bug is in line 6: "int mid = (low + high) / 2;". When the sum of 'low' and 'high' exceeds the largest int value (2^31 - 1), the expression overflows and becomes a negative number, which can never be a valid midpoint because 'low' and 'high' represent array indexes and cannot be negative.
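To see the wraparound in isolation, here is a minimal sketch that applies the same arithmetic to two large index values. The class name and the specific values of low and high are hypothetical, chosen only so that their sum exceeds Integer.MAX_VALUE.

```java
final class MidpointOverflowDemo {
    // Same arithmetic as the flawed statement: int addition silently wraps on overflow.
    static int naiveMid(int low, int high) {
        return (low + high) / 2;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;    // hypothetical index values
        int high = 2_000_000_000;   // low + high does not fit in an int

        System.out.println(naiveMid(low, high));        // a negative number
        System.out.println(((long) low + high) / 2);    // 64-bit reference: 1750000000
    }
}
```

Used as an array index, the negative result would make Java throw an ArrayIndexOutOfBoundsException.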
However, this bug is not really new; rather, it usually goes unnoticed. For example, the classic K & R book [3] on C has the same code (pg 52). For pointers, the expression (low + high) / 2 is wrong and will result in a compiler error, since it is not possible to add two pointers. So, the book's solution is to use subtraction, mid = low + (high - low) / 2 (the operands are pointers, so the difference high - low is a plain element count that can never be negative here). This is also a solution for the overflow problem we discussed in Java. Is there any other way to fix the problem? If 'low' and 'high' are converted to unsigned values and then divided by 2, it will not overflow, as in ((unsigned int) low + (unsigned int) high) / 2.
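The subtraction form carries over to Java unchanged. The following is a minimal sketch, assuming 0 <= low <= high as is the case for array indexes; the class and method names are made up for illustration.

```java
final class SafeMidpoint {
    // Assumes 0 <= low <= high, so (high - low) cannot overflow
    // and the result always lies between low and high.
    static int mid(int low, int high) {
        return low + (high - low) / 2;
    }

    public static void main(String[] args) {
        System.out.println(mid(1_500_000_000, 2_000_000_000));  // 1750000000
        System.out.println(mid(0, Integer.MAX_VALUE));          // 1073741823
    }
}
```

The assumption matters: for arbitrary signed inputs (say, a large negative low and a large positive high), high - low can itself overflow, so this form is a fix for index arithmetic rather than for averaging in general.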
But Java does not support unsigned numbers. Still, Java has an unsigned right shift operator (>>>): it fills the vacated left-most bits with 0 (positive values remain positive, and a negative value shifted by at least one bit becomes positive; also known as 'value preserving'). For the Java right shift operator >>, the vacated bits are filled with the value of the sign bit (negative values remain negative and positive values remain positive; also known as 'sign-preserving'). Just as an aside for C/C++ programmers: C/C++ has only the >> operator, and for signed operands it can be sign or value preserving, depending on the implementation.
So we can use the >>> operator in Java: mid = (low + high) >>> 1. The result of (low + high), when treated as an unsigned value and right-shifted by 1, does not overflow!
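The difference between the two shift operators is easy to see on a sum that has already wrapped around. This is a minimal sketch with hypothetical values and names, assuming 0 <= low <= high; the last line additionally assumes Java 8 or later, where Integer.divideUnsigned is available.

```java
final class UnsignedShiftMidpoint {
    // Assumes 0 <= low <= high; the true sum then fits in 32 bits
    // when the bit pattern is read as an unsigned value.
    static int mid(int low, int high) {
        return (low + high) >>> 1;
    }

    public static void main(String[] args) {
        int low = 1_500_000_000;    // hypothetical index values
        int high = 2_000_000_000;
        int sum = low + high;       // wraps around to a negative int

        System.out.println(sum >> 1);    // sign-preserving shift: still negative
        System.out.println(sum >>> 1);   // unsigned shift: 1750000000
        System.out.println(Integer.divideUnsigned(sum, 2));  // Java 8+: same value
    }
}
```

The wrapped sum still holds the correct 32 bits; only its interpretation as a signed number is wrong, which is why reinterpreting it as unsigned before halving recovers the right midpoint.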
Interestingly, there is another nice 'trick' for finding the average of two numbers: (i & j) + (i ^ j) / 2.
This expression looks strange, doesn't it? How do we get this expression? Hint: It is based on a well-known Boolean equality, for example, as noted in [4]: "(A AND B) + (A OR B) = A + B = (A XOR B) + 2 (A AND B)". A related question: How do you detect overflow when adding two ints? It's a very interesting topic and is the subject for next month's column.
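Returning to the averaging trick, the following sketch compares it against a 64-bit reference computation. The class and method names are made up for illustration, and the sketch uses >> 1 in place of / 2, which is an adaptation: the two agree for non-negative values, while the shift rounds toward negative infinity when i ^ j is negative.

```java
final class BitwiseAverage {
    // From i + j == (i ^ j) + 2 * (i & j): halving both sides gives
    // (i & j) + ((i ^ j) >> 1), and no intermediate value can overflow.
    static int average(int i, int j) {
        return (i & j) + ((i ^ j) >> 1);
    }

    // Reference result computed in 64-bit arithmetic, rounded toward negative infinity.
    static int reference(int i, int j) {
        return (int) Math.floorDiv((long) i + j, 2L);
    }

    public static void main(String[] args) {
        int[][] pairs = {
            {1_500_000_000, 2_000_000_000},
            {Integer.MAX_VALUE, Integer.MAX_VALUE},
            {Integer.MIN_VALUE, Integer.MAX_VALUE},
            {-1, 2}
        };
        for (int[] p : pairs) {
            System.out.println(average(p[0], p[1]) + " vs " + reference(p[0], p[1]));
        }
    }
}
```

Unlike the subtraction and unsigned-shift fixes, this form does not rely on the inputs being non-negative, so it works across the whole int range.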



