details of FDIV bug in Pentium



 "taken from comp.arch"
 In article 10362 -8 at 8- vitsemi.com, coe -8 at 8- vitsemi.com (Tim Coe)
 writes:
 Much of the following explanation was previously
 posted to comp.sys.intel.  A more complete tentative
 software model of the Pentium divider that explains
 all divide errors that I am aware of is included with
 this post.
 On a Packard Bell P90 PC I performed the following
 calculation using Microsoft Windows Desk Calculator:
 (4195835 / 3145727) * 3145727
 The result was 4195579.
 This represents an error of 256 or one part in ~16000.
 ak -8 at 8- ananke.s.bawue.de (Andreas Kaiser) writes
 >Usually, the division is correct (what did you expect?). Just a few
 >operands are divided wrong. My results (P90) with ~25.000.000.000
 >random arguments (within 1..2^46), with even results divided by two
 >until odd, to assure unique mantissa patterns (the binary exponent
 >doesn't care, of course).
 >
 >          3221224323
 >         12884897291
 >        206158356633
 >        824633702441
 >       1443107810341
 >       6597069619549
 >       9895574626641
 >      13194134824767
 >      13194134826115
 >      13194134827143
 >      13194134827457
 >      13194138356107
 >      13194139238995
 >      26388269649885
 >      26388269650425
 >      26388269651561
 >      26388276711601
 >      26388276712811
 >      52776539295213
 >      52776539301125
 >      52776539301653
 >      52776539307823
 >      52776553426399
 >
 >      Gruss, Andreas
 >
 >--------------------
 >-- Andreas Kaiser -- internet: ak -8 at 8- ananke.s.bawue.de
 >-------------------- fidonet:  2:246/8506.9
 Analysis of these numbers reveals that all but 2 of them are of
 the form:
 3*(2^(K+30)) - 1149*(2^(K-(2*J))) - delta*(2^(K-(2*J)))
 where J and K are integers greater than or equal to 0,
 and delta is a real number that has varying ranges depending
 on J but can generally be considered to be between 0 and 1.
 The 2*J terms in the above equation leads to the conclusion
 that the Pentium divider is an iterative divider that computes
 2 bits of quotient per cycle.  (This is in agreemnent with
 the quoted 39 cycles per extended long division from the
 Pentium data book.  The technical name for this type of
 divider is radix 4)
 The extremely low probability of error (1 in 10^10) implies
 that the remainder is being held in carry save format.  (Carry
 save format is where a number is represented as the sum of
 two numbers.  This format allows next remainder calculation
 to occur without propagating carries.  The reason that carry
 save format is implied by the error probability is that
 it is very difficult but not impossible to build up long
 coincident sequences of ones in both the sum word and the
 carry word.)
 I assumed the digit set was -2, -1, 0, 1, and 2.  (Having
 5 possible digits in a radix 4 divider allows a necessarry
 margin for error in next digit selection.  When doing long
 division by hand the radix 10 and 10 possible digits allow
 no margin for error.)
 Taking the above into consideration I wrote the tentative
 model of Pentium divide hardware included below so that I
 might watch what bit patterns developed in the remainder.
 After running the numbers that were known to fail and numbers
 near them that appeared not to fail I determined the
 conditions for failure listed in the program.
 Analysis of the precise erroneous results returned on the
 bad divides indicates that a bit (or bits) is being subtracted
 from the remainder at or near its most significant bit.
 A modeling of this process is included in the program.
 The program accurately explains all the published
 errors and accurately predicted the error listed at the
 beginning of the article.
 The determination of the quotient from the sequence of digits
 is left as an exercise for the reader ;-).
 I would like to thank Dr. Nicely for providing this window
 into the Pentium architecture.
 -Tim Coe     coe -8 at 8- vitsemi.com
 An example run of the program (using the first reported
 error):
 ---Enter dividend mantissa in hex: 8 <return>
 ---Enter divisor  mantissa in hex: bfffffb829 <return>
 ---next digit 1
 ---1111000000000000000000000001000111110101101111111111111111111100
 ---0000000000000000000000000000000000000000000000000000000000000100
 ---11110000000000000000000000010001 iteration number 1
 ---.
 ---.
 ---.
 ---next digit -1
 ---0011111111100100101011110100110000010111010000000000000000000000
 ---1101111111111111111110110110010010010000000000000000000000000000
 ---00011111111001001010101010110000 iteration number 14
 ---next digit 2
 ---A bug condition has been detected.
 ---Enter 0 for correct result or 1 for incorrect result: 1 <return>
 ---0000000001101101010100001000000111110110011111111111111111111100
 ---1111111100100101010110100110010010010010000000000000000000000100
 ---11111111100100101010101011100101 iteration number 15
 ---next digit 0
 ---1111110100100000001010111001010110010001111111111111111111100000
 ---0000000100101010100000000000010010010000000000000000000000100000
 ---11111110010010101010101110011001 iteration number 16
 ---.
 ---.
 ---.
 #include <stdio.h>
 main()
 {
 unsigned r0, r1, r2, r3, r4, r5, r6, s0, s1;
 unsigned t0, t1, t2, t3, cycle, f, incorrect;
 unsigned thr_m2_m1, thr_m1_0, thr_0_1, thr_1_2, positive, errornum;
 char line[30], *linepoint;
 r0 = 0x0bffffc0;
 r1 = 0;
 r2 = 0x0800bf60;
 r3 = 0;
 printf("Enter dividend mantissa in hex: ");
 scanf("%s", line);
 linepoint = line;
 while (*linepoint != '\0') linepoint++;
 while (linepoint < line + 15) *linepoint++ = '0';
 *(line+15) = '\0';
 sscanf(line+7, "%x", &r3);
 *(line+7) = '\0';
 sscanf(line, "%x", &r2);
 printf("Enter divisor  mantissa in hex: ");
 scanf("%s", line);
 linepoint = line;
 while (*linepoint != '\0') linepoint++;
 while (linepoint < line + 15) *linepoint++ = '0';
 *(line+15) = '\0';
 sscanf(line+7, "%x", &r1);
 *(line+7) = '\0';
 sscanf(line, "%x", &r0);
 r4 = 0;
 r5 = 0;
     /*  These thresholds are VERY tentative. */
     /*  There may be bugs in them.           */
 t0 = r0 >> 22;
     /*  Next threshold is strongly indicated */
     /*  by the failure of 9895574626641      */
 if (t0 < 36) thr_0_1 = 3;
     /*  Next threshold is strongly indicated */
     /*  by the failure of 824633702441       */
 else if (t0 < 48) thr_0_1 = 4;
 else if (t0 < 56) thr_0_1 = 5;
 else thr_0_1 = 6;
 thr_m1_0 = 254 - thr_0_1;
 if (t0 < 33) thr_1_2 = 11;
 else if (t0 < 34) {
   printf("This model does not correctly handle\n");
   printf("this divisor.  The Pentium divider\n");
   printf("undoubtly handles this divisor correctly\n");
   printf("by some means that I have no evidence\n");
   printf("upon which speculate.\n");
   exit();
   }
 else if (t0 < 36) thr_1_2 = 12;
 else if (t0 < 39) thr_1_2 = 13;
     /*  Next threshold is strongly indicated */
     /*  by the failure of 1443107810341      */
 else if (t0 < 42) thr_1_2 = 14;
 else if (t0 < 44) thr_1_2 = 15;
 else if (t0 < 48) thr_1_2 = 16;
 else if (t0 < 54) thr_1_2 = 18;
 else if (t0 < 60) thr_1_2 = 20;
 else thr_1_2 = 23;
 thr_m2_m1 = 254 - thr_1_2;
     /*  Further error conditions may exist.  */
     /*  I believe they could be accom-       */
     /*  adated by adding conditions to the   */
     /*  following clause.                    */
 if (t0 == 35) errornum = 22;
 else if (t0 == 41) errornum = 26;
 else if (t0 == 47) errornum = 30;
 else errornum = 128;
 incorrect = 0;
 cycle = 1;
     /*  The cycle count was chosen to keep   */
     /*  the errors on my 60 line screen and  */
     /*  would be 33 or 34 for extended long. */
 while (cycle < 27) {
   t0 = 255 & ((r2 >> 24) + (r4 >> 24));
   if ((t0 > thr_m1_0) || (t0 < thr_0_1)) {
     s0 = 0;
     s1 = 0;
     positive = 0;
     printf("next digit 0\n");
     }
   else if (t0 > thr_m2_m1) {
     s0 = r0;
     s1 = r1;
     positive = 0;
     printf("next digit -1\n");
     }
   else if (t0 < thr_1_2) {
     s0 = ~r0;
     s1 = ~r1;
     positive = 4;
     printf("next digit 1\n");
     }
   else if (t0 & 128) {
     s0 = (r0 << 1) | (r1 >> 31);
     s1 = r1 << 1;
     positive = 0;
     printf("next digit -2\n");
     }
   else {
     s0 = ~((r0 << 1) | (r1 >> 31));
     s1 = ~(r1 << 1);
     positive = 4;
     printf("next digit 2\n");
     if ((t0 == errornum) && (((r2 >> 21) & 7) == 7) &&
 (((r4 >> 21) & 7) == 7)) {
       printf("A bug condition has been detected.\n");
       printf("Enter 0 for correct result or 1 for incorrect result:
 ");
       scanf("%d", &incorrect);
       if (incorrect) {
         if (errornum == 22) s0 = s0 - (3 << 25);
         else s0 = s0 - (4 << 25);
         }
       }
     }
   t0 = s0 ^ r2 ^ r4;
   t1 = s1 ^ r3 ^ r5;
   t2 = (s0 & r2) | (s0 & r4) | (r2 & r4);
   t3 = (s1 & r3) | (s1 & r5) | (r3 & r5);
   r2 = (t0 << 2) | (t1 >> 30);
   r3 = t1 << 2;
   r4 = (t2 << 3) | (t3 >> 29);
   r5 = (t3 << 3) | positive;
   t0 = r2;
   f = 32;
   while (f--) {
     if (t0 & (1 << 31)) putchar('1');
     else putchar('0');
     t0 = t0 << 1;
     }
   t0 = r3;
   f = 32;
   while (f--) {
     if (t0 & (1 << 31)) putchar('1');
     else putchar('0');
     t0 = t0 << 1;
     }
   putchar('\n');
   t0 = r4;
   f = 32;
   while (f--) {
     if (t0 & (1 << 31)) putchar('1');
     else putchar('0');
     t0 = t0 << 1;
     }
   t0 = r5;
   f = 32;
   while (f--) {
     if (t0 & (1 << 31)) putchar('1');
     else putchar('0');
     t0 = t0 << 1;
     }
   putchar('\n');
   t0 = r2 + r4;
   f = 32;
   while (f--) {
     if (t0 & (1 << 31)) putchar('1');
     else putchar('0');
     t0 = t0 << 1;
     }
   printf(" iteration number %d\n", cycle++);
   }
 }
 Adrian Cockcroft - adrian.cockcroft -8 at 8- corp.sun.com - Mailstop: UPAL1-431
 SMCC Product Marketing Engineering - Phone 415 336 0615 - Fax 415 336 0648
 Sun Microsystems, 2550 Garcia Avenue, Mountainview, CA 94043, USA