zmatt
Posted on 21-06-12, 01:46 in SPC700 divide logic
While looking at the SPC700 code of higan/bsnes I noticed this in the divide instruction:

    //otherwise, the quotient won't fit into VF + A
    //this emulates the odd behavior of the S-SMP in this case
    A = 255 - (ya - (X << 9)) / (256 - X);
    Y = X + (ya - (X << 9)) % (256 - X);

This nerd-sniped me with the question of what causes this behaviour. After a bit of puzzling I've managed to figure out what the hardware is probably doing, and I thought I might as well share that here in case anyone is interested. Here's a C++ implementation that makes sense from a hardware point of view and reproduces the results of higan's code (tested for all possible inputs) without having any special cases:
    #include <cstdint>

    // 8-bit rotate left with carry
    static inline bool rol( uint8_t &n, bool c )
    {
        bool c_out = (n >> 7);
        n = (n << 1) | c;
        return c_out;
    }

    // spc700 division step (divide 9-bit by 8-bit with 1-bit quotient)
    // inputs:
    //   y  = bits 0-7 of numerator
    //   y8 = bit 8 of numerator
    //   x  = denominator
    // outputs:
    //   y      = remainder
    //   return = quotient
    // correctness requires that numerator < 2*denominator (i.e. quotient < 2)
    static inline bool divstep( uint8_t &y, bool y8, uint8_t x )
    {
        // assume bit 8 of numerator-denominator indicates whether the subtraction underflowed.
        // this assumption is correct if input condition is met.
        if( (y < x) ^ y8 )
            return false;
        y -= x;
        return true;
    }

    // spc700 divide 16-bit by 8-bit with 9-bit quotient
    // inputs:
    //   y = msb of numerator
    //   a = lsb of numerator
    //   x = denominator
    // outputs:
    //   y      = remainder
    //   a      = bits 0-7 of quotient
    //   return = bit 8 of quotient (overflow flag)
    // correctness requires that numerator < 512*denominator (i.e. quotient < 512)
    bool spc700div( uint8_t &y, uint8_t &a, uint8_t x )
    {
        bool y8 = false;  // temporary bit 8 of y
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 16-bit remainder, 1-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 15-bit remainder, 2-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 14-bit remainder, 3-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 13-bit remainder, 4-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 12-bit remainder, 5-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 11-bit remainder, 6-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 10-bit remainder, 7-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  //  9-bit remainder, 8-bit quotient
        return rol( a, divstep( y, y8, x ) );          //  8-bit remainder, 9-bit quotient
    }

This basically does a fixed 9-step long division to yield the 9 quotient bits.
Each long division step examines 9 bits of the remainder (in y8 and y) and normally reduces it to 8 bits while computing 1 bit of the quotient (which is shifted into the bottom of YA). However, if numerator >= 512*denominator then the input condition of the division step is not met, causing the remainder to not be reduced properly and resulting in a nonsense quotient. It does remain true that the returned remainder is numerator - denominator * returned quotient (using all 9 quotient bits), although that value may not even fit in 8 bits (in which case it just wraps).
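For anyone who wants to repeat the "tested for all possible inputs" claim, here is a rough sketch of how such a check could look. It is not the original test code. The names `DivResult`, `div_branchy`, `CheckCounts`, `exhaustive_check` and `worked_example` are made up for illustration, and `spc700div` is written as a loop instead of nine unrolled steps. Only the overflow branch of `div_branchy` comes from the snippet quoted above; the `y < (x << 1)` test and the `y >= x` overflow flag are assumptions about the surrounding higan code that the snippet does not show.

```cpp
#include <cassert>
#include <cstdint>

// 8-bit rotate left through carry, same behaviour as rol() in the post.
static inline bool rol(uint8_t &n, bool c) {
    bool c_out = (n >> 7) != 0;
    n = static_cast<uint8_t>((n << 1) | (c ? 1 : 0));
    return c_out;
}

// One division step, same behaviour as divstep() in the post.
static inline bool divstep(uint8_t &y, bool y8, uint8_t x) {
    if ((y < x) != y8) return false;
    y = static_cast<uint8_t>(y - x);
    return true;
}

// Loop form of the nine steps: eight steps that also shift Y, then a final one.
static bool spc700div(uint8_t &y, uint8_t &a, uint8_t x) {
    bool y8 = false;
    for (int i = 0; i < 8; ++i) {
        bool q = divstep(y, y8, x);
        bool carry = rol(a, q);
        y8 = rol(y, carry);
    }
    return rol(a, divstep(y, y8, x));
}

struct DivResult { uint8_t a, y; bool v; };  // hypothetical helper type

// Two-case reference. The else branch is the quoted snippet; the condition
// and the V flag are ASSUMED, they are not part of the quoted code.
static DivResult div_branchy(uint8_t y, uint8_t a, uint8_t x) {
    const int ya = (y << 8) | a;
    DivResult r{};
    r.v = y >= x;                                   // assumed overflow flag
    if (y < (x << 1)) {                             // assumed branch condition
        r.a = static_cast<uint8_t>(ya / x);         // x > 0 in this branch
        r.y = static_cast<uint8_t>(ya % x);
    } else {
        r.a = static_cast<uint8_t>(255 - (ya - (x << 9)) / (256 - x));
        r.y = static_cast<uint8_t>(x + (ya - (x << 9)) % (256 - x));
    }
    return r;
}

struct CheckCounts { long mismatches; long violations; };

// Runs all 2^24 inputs. Counts disagreements between the two routines and
// cases where remainder != numerator - denominator * quotient (mod 256).
CheckCounts exhaustive_check() {
    CheckCounts c{0, 0};
    for (int x = 0; x < 256; ++x) {
        for (int ya = 0; ya < 65536; ++ya) {
            uint8_t y = static_cast<uint8_t>(ya >> 8);
            uint8_t a = static_cast<uint8_t>(ya & 0xFF);
            const DivResult r = div_branchy(y, a, static_cast<uint8_t>(x));
            const bool v = spc700div(y, a, static_cast<uint8_t>(x));
            if (a != r.a || y != r.y || v != r.v) ++c.mismatches;
            const int q9 = (v ? 256 : 0) | a;       // full 9-bit quotient
            if (static_cast<uint8_t>(ya - x * q9) != y) ++c.violations;
        }
    }
    return c;
}

// Overflow case YA = 0x0755, X = 3 (numerator >= 512 * denominator).
void worked_example() {
    uint8_t y = 0x07, a = 0x55;
    const bool v = spc700div(y, a, 3);
    // Quoted formula: A = 255 - 341/253 = 254, Y = 3 + 341%253 = 91.
    assert(v && a == 0xFE && y == 0x5B);
}
```

Calling `worked_example()` and then `exhaustive_check()` from a `main` should leave both counters at zero if the assumptions about the branchy version hold. In the worked example the 9-bit quotient is 0x1FE = 510, and 0x755 - 3 * 510 = 0x15B, which wraps to the returned remainder 0x5B.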