details of FDIV bug in Pentium
"taken from comp.arch"
In article 10362 -8 at 8- vitsemi.com, coe -8 at 8- vitsemi.com (Tim Coe)
writes:
Much of the following explanation was previously
posted to comp.sys.intel. A more complete tentative
software model of the Pentium divider that explains
all divide errors that I am aware of is included with
this post.
On a Packard Bell P90 PC I performed the following
calculation using Microsoft Windows Desk Calculator:
(4195835 / 3145727) * 3145727
The result was 4195579.
This represents an error of 256 or one part in ~16000.
ak -8 at 8- ananke.s.bawue.de (Andreas Kaiser) writes
>Usually, the division is correct (what did you expect?). Just a few
>operands are divided wrong. My results (P90) with ~25.000.000.000
>random arguments (within 1..2^46), with even results divided by two
>until odd, to assure unique mantissa patterns (the binary exponent
>doesn't care, of course).
>
> 3221224323
> 12884897291
> 206158356633
> 824633702441
> 1443107810341
> 6597069619549
> 9895574626641
> 13194134824767
> 13194134826115
> 13194134827143
> 13194134827457
> 13194138356107
> 13194139238995
> 26388269649885
> 26388269650425
> 26388269651561
> 26388276711601
> 26388276712811
> 52776539295213
> 52776539301125
> 52776539301653
> 52776539307823
> 52776553426399
>
> Gruss, Andreas
>
>--------------------
>-- Andreas Kaiser -- internet: ak -8 at 8- ananke.s.bawue.de
>-------------------- fidonet: 2:246/8506.9
Analysis of these numbers reveals that all but 2 of them are of
the form:
3*(2^(K+30)) - 1149*(2^(K-(2*J))) - delta*(2^(K-(2*J)))
where J and K are integers greater than or equal to 0,
and delta is a real number that has varying ranges depending
on J but can generally be considered to be between 0 and 1.
The 2*J terms in the above equation leads to the conclusion
that the Pentium divider is an iterative divider that computes
2 bits of quotient per cycle. (This is in agreemnent with
the quoted 39 cycles per extended long division from the
Pentium data book. The technical name for this type of
divider is radix 4)
The extremely low probability of error (1 in 10^10) implies
that the remainder is being held in carry save format. (Carry
save format is where a number is represented as the sum of
two numbers. This format allows next remainder calculation
to occur without propagating carries. The reason that carry
save format is implied by the error probability is that
it is very difficult but not impossible to build up long
coincident sequences of ones in both the sum word and the
carry word.)
I assumed the digit set was -2, -1, 0, 1, and 2. (Having
5 possible digits in a radix 4 divider allows a necessarry
margin for error in next digit selection. When doing long
division by hand the radix 10 and 10 possible digits allow
no margin for error.)
Taking the above into consideration I wrote the tentative
model of Pentium divide hardware included below so that I
might watch what bit patterns developed in the remainder.
After running the numbers that were known to fail and numbers
near them that appeared not to fail I determined the
conditions for failure listed in the program.
Analysis of the precise erroneous results returned on the
bad divides indicates that a bit (or bits) is being subtracted
from the remainder at or near its most significant bit.
A modeling of this process is included in the program.
The program accurately explains all the published
errors and accurately predicted the error listed at the
beginning of the article.
The determination of the quotient from the sequence of digits
is left as an exercise for the reader ;-).
I would like to thank Dr. Nicely for providing this window
into the Pentium architecture.
-Tim Coe coe -8 at 8- vitsemi.com
An example run of the program (using the first reported
error):
---Enter dividend mantissa in hex: 8 <return>
---Enter divisor mantissa in hex: bfffffb829 <return>
---next digit 1
---1111000000000000000000000001000111110101101111111111111111111100
---0000000000000000000000000000000000000000000000000000000000000100
---11110000000000000000000000010001 iteration number 1
---.
---.
---.
---next digit -1
---0011111111100100101011110100110000010111010000000000000000000000
---1101111111111111111110110110010010010000000000000000000000000000
---00011111111001001010101010110000 iteration number 14
---next digit 2
---A bug condition has been detected.
---Enter 0 for correct result or 1 for incorrect result: 1 <return>
---0000000001101101010100001000000111110110011111111111111111111100
---1111111100100101010110100110010010010010000000000000000000000100
---11111111100100101010101011100101 iteration number 15
---next digit 0
---1111110100100000001010111001010110010001111111111111111111100000
---0000000100101010100000000000010010010000000000000000000000100000
---11111110010010101010101110011001 iteration number 16
---.
---.
---.
#include <stdio.h>
main()
{
unsigned r0, r1, r2, r3, r4, r5, r6, s0, s1;
unsigned t0, t1, t2, t3, cycle, f, incorrect;
unsigned thr_m2_m1, thr_m1_0, thr_0_1, thr_1_2, positive, errornum;
char line[30], *linepoint;
r0 = 0x0bffffc0;
r1 = 0;
r2 = 0x0800bf60;
r3 = 0;
printf("Enter dividend mantissa in hex: ");
scanf("%s", line);
linepoint = line;
while (*linepoint != '\0') linepoint++;
while (linepoint < line + 15) *linepoint++ = '0';
*(line+15) = '\0';
sscanf(line+7, "%x", &r3);
*(line+7) = '\0';
sscanf(line, "%x", &r2);
printf("Enter divisor mantissa in hex: ");
scanf("%s", line);
linepoint = line;
while (*linepoint != '\0') linepoint++;
while (linepoint < line + 15) *linepoint++ = '0';
*(line+15) = '\0';
sscanf(line+7, "%x", &r1);
*(line+7) = '\0';
sscanf(line, "%x", &r0);
r4 = 0;
r5 = 0;
/* These thresholds are VERY tentative. */
/* There may be bugs in them. */
t0 = r0 >> 22;
/* Next threshold is strongly indicated */
/* by the failure of 9895574626641 */
if (t0 < 36) thr_0_1 = 3;
/* Next threshold is strongly indicated */
/* by the failure of 824633702441 */
else if (t0 < 48) thr_0_1 = 4;
else if (t0 < 56) thr_0_1 = 5;
else thr_0_1 = 6;
thr_m1_0 = 254 - thr_0_1;
if (t0 < 33) thr_1_2 = 11;
else if (t0 < 34) {
printf("This model does not correctly handle\n");
printf("this divisor. The Pentium divider\n");
printf("undoubtly handles this divisor correctly\n");
printf("by some means that I have no evidence\n");
printf("upon which speculate.\n");
exit();
}
else if (t0 < 36) thr_1_2 = 12;
else if (t0 < 39) thr_1_2 = 13;
/* Next threshold is strongly indicated */
/* by the failure of 1443107810341 */
else if (t0 < 42) thr_1_2 = 14;
else if (t0 < 44) thr_1_2 = 15;
else if (t0 < 48) thr_1_2 = 16;
else if (t0 < 54) thr_1_2 = 18;
else if (t0 < 60) thr_1_2 = 20;
else thr_1_2 = 23;
thr_m2_m1 = 254 - thr_1_2;
/* Further error conditions may exist. */
/* I believe they could be accom- */
/* adated by adding conditions to the */
/* following clause. */
if (t0 == 35) errornum = 22;
else if (t0 == 41) errornum = 26;
else if (t0 == 47) errornum = 30;
else errornum = 128;
incorrect = 0;
cycle = 1;
/* The cycle count was chosen to keep */
/* the errors on my 60 line screen and */
/* would be 33 or 34 for extended long. */
while (cycle < 27) {
t0 = 255 & ((r2 >> 24) + (r4 >> 24));
if ((t0 > thr_m1_0) || (t0 < thr_0_1)) {
s0 = 0;
s1 = 0;
positive = 0;
printf("next digit 0\n");
}
else if (t0 > thr_m2_m1) {
s0 = r0;
s1 = r1;
positive = 0;
printf("next digit -1\n");
}
else if (t0 < thr_1_2) {
s0 = ~r0;
s1 = ~r1;
positive = 4;
printf("next digit 1\n");
}
else if (t0 & 128) {
s0 = (r0 << 1) | (r1 >> 31);
s1 = r1 << 1;
positive = 0;
printf("next digit -2\n");
}
else {
s0 = ~((r0 << 1) | (r1 >> 31));
s1 = ~(r1 << 1);
positive = 4;
printf("next digit 2\n");
if ((t0 == errornum) && (((r2 >> 21) & 7) == 7) &&
(((r4 >> 21) & 7) == 7)) {
printf("A bug condition has been detected.\n");
printf("Enter 0 for correct result or 1 for incorrect result:
");
scanf("%d", &incorrect);
if (incorrect) {
if (errornum == 22) s0 = s0 - (3 << 25);
else s0 = s0 - (4 << 25);
}
}
}
t0 = s0 ^ r2 ^ r4;
t1 = s1 ^ r3 ^ r5;
t2 = (s0 & r2) | (s0 & r4) | (r2 & r4);
t3 = (s1 & r3) | (s1 & r5) | (r3 & r5);
r2 = (t0 << 2) | (t1 >> 30);
r3 = t1 << 2;
r4 = (t2 << 3) | (t3 >> 29);
r5 = (t3 << 3) | positive;
t0 = r2;
f = 32;
while (f--) {
if (t0 & (1 << 31)) putchar('1');
else putchar('0');
t0 = t0 << 1;
}
t0 = r3;
f = 32;
while (f--) {
if (t0 & (1 << 31)) putchar('1');
else putchar('0');
t0 = t0 << 1;
}
putchar('\n');
t0 = r4;
f = 32;
while (f--) {
if (t0 & (1 << 31)) putchar('1');
else putchar('0');
t0 = t0 << 1;
}
t0 = r5;
f = 32;
while (f--) {
if (t0 & (1 << 31)) putchar('1');
else putchar('0');
t0 = t0 << 1;
}
putchar('\n');
t0 = r2 + r4;
f = 32;
while (f--) {
if (t0 & (1 << 31)) putchar('1');
else putchar('0');
t0 = t0 << 1;
}
printf(" iteration number %d\n", cycle++);
}
}
Adrian Cockcroft - adrian.cockcroft -8 at 8- corp.sun.com - Mailstop: UPAL1-431
SMCC Product Marketing Engineering - Phone 415 336 0615 - Fax 415 336 0648
Sun Microsystems, 2550 Garcia Avenue, Mountainview, CA 94043, USA