bboard

zmatt

Posted on 21-06-12, 01:46 in SPC700 divide logic


While looking at the SPC700 code of higan/bsnes I noticed this in the divide instruction:

//otherwise, the quotient won't fit into VF + A
//this emulates the odd behavior of the S-SMP in this case
A = 255 - (ya - (X << 9)) / (256 - X);
Y = X + (ya - (X << 9)) % (256 - X);

This nerd-sniped me with the question of what causes this behaviour. After a bit of puzzling I've managed to figure out what the hardware is probably doing, and I thought I might as well share that here in case anyone is interested. Here's a C++ implementation that makes sense from a hardware point of view and reproduces the results of higan's code (tested for all possible inputs) without having any special cases:

#include <cstdint>

// 8-bit rotate left through carry: c is shifted into bit 0, the old bit 7 is returned
static inline bool rol( uint8_t &n, bool c )
{
    bool c_out = (n >> 7);
    n = (n << 1) | c;
    return c_out;
}

// spc700 division step (divide 9-bit by 8-bit with 1-bit quotient)
// inputs:
//   y  = bits 0-7 of numerator
//   y8 = bit 8 of numerator
//   x  = denominator
// outputs:
//   y      = remainder
//   return = quotient
// correctness requires that numerator < 2*denominator (i.e. quotient < 2)
static inline bool divstep( uint8_t &y, bool y8, uint8_t x )
{
    // assume bit 8 of numerator-denominator indicates whether the subtraction underflowed.
    // this assumption is correct if input condition is met.
    if( (y < x) ^ y8 )
        return false;
    y -= x;
    return true;
}

// spc700 divide 16-bit by 8-bit with 9-bit quotient
// inputs:
//   y = msb of numerator
//   a = lsb of numerator
//   x = denominator
// outputs:
//   y      = remainder
//   a      = bits 0-7 of quotient
//   return = bit 8 of quotient (overflow flag)
// correctness requires that numerator < 512*denominator (i.e. quotient < 512)
bool spc700div( uint8_t &y, uint8_t &a, uint8_t x )
{
    bool y8 = false; // temporary bit 8 of y
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 16-bit remainder, 1-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 15-bit remainder, 2-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 14-bit remainder, 3-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 13-bit remainder, 4-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 12-bit remainder, 5-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 11-bit remainder, 6-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); // 10-bit remainder, 7-bit quotient
    y8 = rol( y, rol( a, divstep( y, y8, x ) ) ); //  9-bit remainder, 8-bit quotient
    return rol( a, divstep( y, y8, x ) );         //  8-bit remainder, 9-bit quotient
}

This basically does a fixed 9-step long division to yield the 9 quotient bits. Each step examines 9 bits of the running remainder (y8 and y) and normally reduces it to 8 bits while computing 1 bit of the quotient, which is shifted into the bottom of YA. However, if numerator >= 512*denominator, the input condition of the division step is not met, so the remainder is not reduced properly and the quotient comes out as nonsense. It does remain true that the returned remainder equals numerator - denominator * returned quotient; that value may not fit in 8 bits, in which case it simply wraps modulo 256.
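
For anyone who wants to repeat the exhaustive test, here is a rough sketch of how such a check could look. It is not the original test code. The three functions are the ones from the post, with the nine unrolled steps rolled into a loop. count_mismatches and spot_check are made-up names for illustration. The sketch also assumes that higan takes its special-case branch exactly when numerator >= 512*denominator, which is how the quoted comment reads.

```cpp
#include <cassert>
#include <cstdint>

// Same logic as rol/divstep/spc700div in the post (the 9 steps are looped here).
static inline bool rol(uint8_t &n, bool c) {
    bool c_out = (n >> 7);
    n = (n << 1) | c;
    return c_out;
}

static inline bool divstep(uint8_t &y, bool y8, uint8_t x) {
    if ((y < x) ^ y8)
        return false;
    y -= x;
    return true;
}

bool spc700div(uint8_t &y, uint8_t &a, uint8_t x) {
    bool y8 = false;
    for (int i = 0; i < 8; ++i)
        y8 = rol(y, rol(a, divstep(y, y8, x)));
    return rol(a, divstep(y, y8, x));
}

// Hypothetical helper: runs all 256 * 65536 inputs and counts violations of
//  1. remainder == numerator - denominator * quotient (mod 256), always
//  2. ordinary quotient/remainder when numerator < 512*denominator
//  3. agreement with higan's two special-case lines otherwise (assumed branch condition)
unsigned long count_mismatches() {
    unsigned long bad = 0;
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned ya = 0; ya < 65536; ++ya) {
            uint8_t y = static_cast<uint8_t>(ya >> 8);
            uint8_t a = static_cast<uint8_t>(ya & 0xFF);
            bool v = spc700div(y, a, static_cast<uint8_t>(x));
            unsigned q = (static_cast<unsigned>(v) << 8) | a; // 9-bit quotient

            // Unsigned wrap-around is harmless here: only the low 8 bits are compared.
            if (((ya - x * q) & 0xFF) != y)
                ++bad;

            if (ya < 512 * x) {
                // x is never 0 in this branch, so the divisions are safe.
                if (q != ya / x || y != ya % x)
                    ++bad;
            } else {
                unsigned d = ya - (x << 9);
                uint8_t ha = static_cast<uint8_t>(255 - d / (256 - x));
                uint8_t hy = static_cast<uint8_t>(x + d % (256 - x));
                if (a != ha || y != hy)
                    ++bad;
            }
        }
    }
    return bad;
}

// Hand-traced overflow case: YA = 64037 (Y = 250, A = 37), X = 100.
void spot_check() {
    uint8_t y = 250, a = 37;
    bool v = spc700div(y, a, 100);
    assert(v && a == 173 && y == 145);
}
```

Calling spot_check() and then asserting count_mismatches() == 0 from a main function covers every input. The modulo-256 identity holds no matter which way each divstep decides, since every step just subtracts either 0 or x before shifting, so only the quotient bits depend on the input condition.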
