Posted on 21-06-12, 01:46 in SPC700 divide logic
While looking at the SPC700 code of higan/bsnes I noticed this in the divide instruction:
//otherwise, the quotient won't fit into VF + A
//this emulates the odd behavior of the S-SMP in this case
A = 255 - (ya - (X << 9)) / (256 - X);
Y = X   + (ya - (X << 9)) % (256 - X);

This nerd-sniped me with the question of what causes this behaviour. After a bit of puzzling I've managed to figure out what the hardware is probably doing, and I thought I might as well share that here in case anyone is interested. Here's a C++ implementation that makes sense from a hardware point of view and reproduces the results of higan's code (tested for all possible inputs) without having any special cases:
#include <cstdint>

// 8-bit rotate left with carry
static inline bool rol( uint8_t &n, bool c )
{
        bool c_out = (n >> 7);
        n = (n << 1) | c;
        return c_out;
}

// spc700 division step (divide 9-bit by 8-bit with 1-bit quotient)
// inputs:
//      y = bits 0-7 of numerator
//      y8 = bit 8 of numerator
//      x = denominator
// outputs:
//      y = remainder
//      return = quotient
// correctness requires that numerator < 2*denominator (i.e. quotient < 2)
static inline bool divstep( uint8_t &y, bool y8, uint8_t x ) {
        // assume bit 8 of numerator-denominator indicates whether the subtraction underflowed.
        // this assumption is correct if input condition is met.
        if( (y < x) ^ y8 )
                return false;
        y -= x;
        return true;
}

// spc700 divide 16-bit by 8-bit with 9-bit quotient
// inputs:
//      y = msb of numerator
//      a = lsb of numerator
//      x = denominator
// outputs:
//      y = remainder
//      a = bits 0-7 of quotient
//      return = bit 8 of quotient (overflow flag)
// correctness requires that numerator < 512*denominator (i.e. quotient < 512)
bool spc700div( uint8_t &y, uint8_t &a, uint8_t x )
{
        bool y8 = false;  // temporary bit 8 of y
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 16-bit remainder, 1-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 15-bit remainder, 2-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 14-bit remainder, 3-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 13-bit remainder, 4-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 12-bit remainder, 5-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 11-bit remainder, 6-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  // 10-bit remainder, 7-bit quotient
        y8 = rol( y, rol( a, divstep( y, y8, x ) ) );  //  9-bit remainder, 8-bit quotient
        return       rol( a, divstep( y, y8, x ) );    //  8-bit remainder, 9-bit quotient
}

This is a fixed 9-step long division that yields the 9 quotient bits. Each step examines 9 bits of the remainder (y8 and y) and normally reduces it to 8 bits while computing 1 bit of the quotient, which is shifted into the bottom of YA. However, if numerator >= 512*denominator, the input condition of the division step is not met, so the remainder is not reduced properly and the quotient comes out as nonsense. It does remain true that the returned remainder equals numerator - denominator * returned quotient, although that value may not fit in 8 bits (in which case it just wraps).
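For anyone who wants to repeat the exhaustive comparison, here is a rough sketch of how it could be done. It restates the 9-step division as a loop and checks it against higan's formulas for every X and YA, then checks the remainder identity modulo 256. The names DivResult, div_model, div_higan, count_mismatches and count_remainder_violations are made up for this sketch. The non-overflow branch of div_higan (plain ya / X and ya % X when Y < 2*X, with the overflow flag taken as Y >= X) is an assumption on my part, since only the overflow branch is quoted above.

```cpp
#include <cassert>
#include <cstdint>

// New Y (remainder), new A (quotient bits 0-7) and bit 8 of the quotient.
struct DivResult {
    uint8_t y;
    uint8_t a;
    bool q8;
};

// Loop form of the 9-step long division described above.
DivResult div_model(uint8_t y, uint8_t a, uint8_t x) {
    bool y8 = false;  // temporary bit 8 of the remainder
    bool q8 = false;
    for (int step = 0; step < 9; ++step) {
        // Division step: bit 8 of (remainder - x) is treated as the borrow.
        bool q = !((y < x) != y8);
        if (q) y = static_cast<uint8_t>(y - x);
        // Shift the new quotient bit into the bottom of A.
        bool a7 = (a >> 7) != 0;
        a = static_cast<uint8_t>((a << 1) | (q ? 1 : 0));
        if (step == 8) { q8 = a7; break; }  // last step: Y is not shifted
        // Shift A's old top bit into Y; Y's old top bit becomes y8.
        y8 = (y >> 7) != 0;
        y = static_cast<uint8_t>((y << 1) | (a7 ? 1 : 0));
    }
    return {y, a, q8};
}

// Reference built from higan's formulas (non-overflow branch is assumed).
DivResult div_higan(uint8_t y, uint8_t a, uint8_t x) {
    int ya = (y << 8) | a;
    bool v = y >= x;  // assumed: overflow flag is set when quotient >= 256
    if (y < (x << 1)) {
        // Quotient fits in 9 bits; x cannot be 0 in this branch.
        return {static_cast<uint8_t>(ya % x), static_cast<uint8_t>(ya / x), v};
    }
    int r = ya - (x << 9);
    int d = 256 - x;
    assert(r >= 0);  // y >= 2*x implies ya >= 512*x
    return {static_cast<uint8_t>(x + r % d), static_cast<uint8_t>(255 - r / d), v};
}

// Number of inputs (out of 2^24) where the two implementations disagree.
long count_mismatches() {
    long bad = 0;
    for (int x = 0; x < 256; ++x) {
        for (int ya = 0; ya < 65536; ++ya) {
            uint8_t y = static_cast<uint8_t>(ya >> 8);
            uint8_t a = static_cast<uint8_t>(ya & 0xFF);
            DivResult m = div_model(y, a, static_cast<uint8_t>(x));
            DivResult h = div_higan(y, a, static_cast<uint8_t>(x));
            if (m.y != h.y || m.a != h.a || m.q8 != h.q8) ++bad;
        }
    }
    return bad;
}

// Number of inputs where remainder != numerator - denominator * quotient (mod 256).
long count_remainder_violations() {
    long bad = 0;
    for (int x = 0; x < 256; ++x) {
        for (int ya = 0; ya < 65536; ++ya) {
            DivResult m = div_model(static_cast<uint8_t>(ya >> 8),
                                    static_cast<uint8_t>(ya & 0xFF),
                                    static_cast<uint8_t>(x));
            int q = (m.q8 ? 256 : 0) + m.a;  // full 9-bit quotient
            if (m.y != static_cast<uint8_t>(ya - x * q)) ++bad;
        }
    }
    return bad;
}
```

Both counters should come back as zero if the model matches. As a spot check of the overflow case, Y=5, A=0x37, X=2 gives Y=59, A=254 with the overflow bit set in both versions.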