Wednesday, 8 October 2008

Byte-Sized Chunks

I mentioned previously that the original code I wrote a couple of weeks ago to emulate the 6502 registers and memory was all Byte-based. It made sense at the time - the 6502 is an 8-bit CPU, and everything it does is in 8-bit units of work (except for the Program Counter register, which is 16 bits wide but is really just a pair of 8-bit registers joined together). Making the emulation use Byte entities to represent all these things seemed logical, particularly as that meant I could let the .NET CLR take care of overflow/underflow scenarios - i.e. if the value in a register (for example) went above 255, it would automatically 'wrap around' (in C#'s default unchecked mode, at least) because a Byte can't hold a value greater than that.
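As a rough illustration of that wrap-around, here is a minimal sketch. The ByteWrapDemo class and its method names are made up for this example and are not the emulator's actual code; the explicit unchecked blocks are there only so the behaviour doesn't depend on compiler settings.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class ByteWrapDemo
{
    // Add one to an 8-bit value; 255 wraps around to 0.
    public static byte Increment(byte value)
    {
        unchecked { value++; }   // in a checked context this would throw instead
        return value;
    }

    // Subtract one from an 8-bit value; 0 wraps around to 255.
    public static byte Decrement(byte value)
    {
        unchecked { value--; }
        return value;
    }
}
```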

However, as the emulator started to take shape and actually do stuff, I noticed an increasing number of situations where I was doing things that required more than 8 bits, and then having to cast back to the underlying Byte of the register (or memory location). A good example of this is where I've got to right now in terms of active progress - the ADC instruction. ADC (Add with Carry) is actually quite a complicated little instruction to emulate, and I was doing various things with Int-sized objects before cramming the result back into the Accumulator register's Byte. That's just one example though - there were several other places where I was doing explicit or implicit casts between Bytes, Ushorts and Ints.
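To show the kind of casting involved, here is a simplified sketch of a Byte-based ADC. It assumes binary mode only (the 6502's decimal mode is ignored), and the ByteCpu class and its field names are hypothetical rather than taken from the real emulator.

```csharp
using System;

// Hypothetical, stripped-down register set for illustration.
class ByteCpu
{
    public byte A;                                  // Accumulator
    public bool Carry, Zero, Negative, Overflow;    // status flags

    // Add with Carry, binary mode only.
    public void Adc(byte operand)
    {
        // Byte + Byte is promoted to Int, so the sum can exceed 255.
        int sum = A + operand + (Carry ? 1 : 0);
        Carry = sum > 0xFF;

        // The explicit cast back down to Byte keeps only the low 8 bits.
        byte result = unchecked((byte)sum);

        // Signed overflow: both inputs have the same sign, the result differs.
        Overflow = ((A ^ result) & (operand ^ result) & 0x80) != 0;

        A = result;
        Zero = A == 0;
        Negative = (A & 0x80) != 0;
    }
}
```

Even in this small method the values hop from Byte to Int and back again, which is the pattern that kept turning up elsewhere.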

Is this a bad thing? Inherently, no - the language is designed to enable this kind of transition from certain types to others, otherwise the cast functionality wouldn't be there. But there's a performance penalty to pay for every cast that occurs, and although it's a tiny cost that normally makes no difference, it all adds up. And when you're emulating a CPU that has to be able to execute a million cycles per second, it can add up to a major performance hit. To get an idea of how fast this code has to run, think about the System.Timers.Timer object - its interval is set in milliseconds, so the fastest rate you can realistically ask for is 1ms (one millisecond), which means a timer running at that rate triggers its 'tick' event a thousand times a second (and in practice Windows often won't deliver ticks even that precisely). Which is pretty fast; but then consider that to emulate the 6502 at its standard 1MHz clock speed, you have to be simulating a thousand CPU cycles on every tick of that 1ms timer.

The standard C# Int is 32 bits wide, which is massive overkill for an 8-bit CPU emulation. But equally, the Intel and AMD silicon we're running the language on these days has a default work-unit width of 32 bits as well (unless you've got a 64-bit processor, of course) and Windows is geared to 32 bits too (unless you're running a 64-bit version on your 64-bit processor, naturally). In other words, 32 bits is the 'comfortable' work-unit that C#, the .NET CLR, Windows, and the hardware underneath it all like to use. A series of performance tests I did with variables of a variety of types from Byte (8 bits) up to Int (32 bits) showed not a vast difference from slowest to fastest, but nevertheless a measurable one - Ints come out fractionally faster (by a matter of milliseconds) when huge numbers of tiny operations (like increments and decrements) are happening in a tight loop.
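A test along those lines could be sketched as follows. This is not the original test code: the WidthBenchmark class, its method names and the choice of Stopwatch are all assumptions made for illustration, and the timings it gives will vary with the machine, the JIT and the build settings.

```csharp
using System;
using System.Diagnostics;

// Hypothetical micro-benchmark, for illustration only.
static class WidthBenchmark
{
    // Time a tight loop of Byte increments; returns elapsed Stopwatch ticks.
    public static long TimeByte(int iterations, out byte final)
    {
        byte b = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            unchecked { b++; }          // wraps at 255 by itself
        }
        sw.Stop();
        final = b;                      // hand the value back so the loop isn't dead code
        return sw.ElapsedTicks;
    }

    // Time the same loop using an Int that is masked back to 8 bits by hand.
    public static long TimeInt(int iterations, out int final)
    {
        int v = 0;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            v = (v + 1) & 0xFF;         // manual wrap-around
        }
        sw.Stop();
        final = v;
        return sw.ElapsedTicks;
    }
}
```

Calling both methods with a large iteration count (tens of millions, say) and comparing the returned tick counts over several runs gives a rough feel for the difference, though a micro-benchmark like this is only ever indicative.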

So right now I've suspended work on instruction implementation to go back and change all my Bytes to Ints. This gives me a marginal performance improvement in itself, but also means I never have to cast down to Byte during instruction execution - another performance gain. The downside is that I have to do the value overflow and underflow handling myself, but this is actually quite a quick operation (we just AND the value with 255 when changing it) and is still faster overall than using Bytes.
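The masking approach can be sketched like this, assuming a register held in an Int whose value is always kept in the range 0 to 255. The IntRegister class and its member names are hypothetical, not the emulator's real ones.

```csharp
using System;

// Hypothetical Int-backed 8-bit register, for illustration only.
class IntRegister
{
    private int value;                  // always kept within 0..255

    public int Value
    {
        get { return value; }
        set { this.value = value & 0xFF; }   // AND with 255 does the wrapping
    }

    public void Increment() { Value = value + 1; }   // 255 + 1 -> 0
    public void Decrement() { Value = value - 1; }   // 0 - 1   -> 255
}
```

The AND works for underflow too: -1 has all of its low bits set, so masking it with 255 leaves 255, which is what the Byte would have wrapped to.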
