I mentioned previously that the code I originally wrote a couple of weeks ago to emulate the 6502 registers and memory was all Byte-based. It made sense at the time - the 6502 is an 8-bit CPU, and everything it does is in 8-bit units of work (except for the Program Counter register, which is 16 bits but is really just a pair of 8-bit registers joined together). Making the emulation use Byte entities to represent all these things seemed logical, particularly as that meant I could let the .NET CLR take care of overflow/underflow scenarios - i.e. if the value in a register (for example) went above 255, it would automatically 'wrap around' because a Byte can't hold a value greater than that.
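To illustrate the wrap-around behaviour described above, here is a minimal sketch (not the emulator's actual code; the class and method names are invented for this example). It assumes the default unchecked arithmetic context, which is spelled out explicitly here so the behaviour doesn't depend on compiler settings.

```csharp
using System;

public static class ByteWrapDemo
{
    // Hypothetical helper: increments a Byte-backed register value.
    // In an unchecked context a byte holding 255 wraps to 0.
    public static byte Increment(byte register)
    {
        unchecked { register++; }
        return register;
    }

    // Hypothetical helper: decrements a Byte-backed register value.
    // In an unchecked context a byte holding 0 wraps to 255.
    public static byte Decrement(byte register)
    {
        unchecked { register--; }
        return register;
    }
}
```

Calling ByteWrapDemo.Increment(255) gives 0 and ByteWrapDemo.Decrement(0) gives 255, with no explicit range handling in the emulator code.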
However, as the emulator started to take shape and actually do stuff, I noticed an increasing number of situations where I was doing things that required more than 8 bits, and then having to cast back to the underlying Byte of the register (or memory location). A good example of this is where I've got to right now in terms of active progress - the ADC instruction. ADC (Add with Carry) is actually quite a complicated little instruction to emulate, and I was doing various things with Int-sized objects before cramming the result back into the Accumulator register's Byte. That's just one example though - there were several other places where I was doing explicit or implicit casts between Bytes, Ushorts and Ints.
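The kind of casting involved might look something like the sketch below. This is an assumption about the general shape of a Byte-based ADC, not the real implementation: it handles binary mode only, ignores decimal mode and the overflow flag, and ByteAdcSketch and its parameters are hypothetical names.

```csharp
using System;

public static class ByteAdcSketch
{
    // Hypothetical Byte-based ADC: returns the new accumulator value
    // and reports the carry flag through carryOut.
    public static byte Adc(byte accumulator, byte operand, bool carryIn, out bool carryOut)
    {
        // C# promotes byte operands to int for arithmetic, so the sum
        // is already a 32-bit value at this point.
        int sum = accumulator + operand + (carryIn ? 1 : 0);

        // Anything above 255 doesn't fit in 8 bits, so it sets the carry.
        carryOut = sum > 255;

        // Explicit cast to cram the result back into a Byte (keeps the low 8 bits).
        return unchecked((byte)sum);
    }
}
```

For example, adding 200 and 100 with the carry set produces an Int sum of 301, which comes back as a Byte value of 45 with the carry flag set.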
Is this a bad thing? Inherently, no - the language is designed to enable this kind of transition from certain types to others, otherwise the cast functionality wouldn't be there. But there's a performance penalty to pay for every cast that occurs, and although it's a tiny cost that normally makes no difference, it all adds up. And when you're emulating a CPU that has to be able to execute a million cycles per second, it can add up to a major performance hit. To get an idea of how fast this code has to run, think about the System.Timers.Timer object - its interval is specified in milliseconds, so the shortest nominal interval is 1ms (one millisecond), which means if you set a timer running at that rate, it'll raise its Elapsed event (the 'tick') up to a thousand times a second. Which is pretty fast; but then consider that to emulate the 6502 at its standard 1MHz clock speed, you have to be simulating a thousand CPU cycles on every tick of that 1ms timer.
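The arithmetic behind that figure can be sketched as follows. The names here (TimingSketch, CyclesPerTick, RunTick) are hypothetical, and the stepOneCycle delegate stands in for whatever the emulator does to advance the CPU by one cycle.

```csharp
using System;

public static class TimingSketch
{
    public const int ClockHz = 1000000;   // 1MHz 6502 clock
    public const int TimerIntervalMs = 1; // nominal 1ms timer interval

    // Number of CPU cycles that must be simulated per timer tick:
    // cycles per second multiplied by the tick length in seconds.
    public static int CyclesPerTick(int clockHz, int intervalMs)
    {
        return (int)((long)clockHz * intervalMs / 1000);
    }

    // Runs one tick's worth of cycles by calling the supplied
    // (hypothetical) single-cycle step routine repeatedly.
    public static int RunTick(Action stepOneCycle)
    {
        int cycles = CyclesPerTick(ClockHz, TimerIntervalMs);
        for (int i = 0; i < cycles; i++)
        {
            stepOneCycle();
        }
        return cycles;
    }
}
```

With a 1MHz clock and a 1ms interval, CyclesPerTick returns 1000, so any per-cycle overhead is paid a thousand times on every tick.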
The standard C# Int is 32 bits wide, which is massively overkill for an 8-bit CPU emulation. But equally, the Intel and AMD silicon we're running the language on these days has a default work-unit width of 32 bits as well (unless you've got a 64-bit processor, of course) and Windows is geared to 32 bits too (unless you're running a 64-bit version on your 64-bit processor, naturally). In other words, 32 bits is the 'comfortable' work-unit that C#, the .NET CLR, Windows, and the hardware underneath it all like to use. A series of performance tests I did with variables of a variety of types from Byte (8 bits) up to Int (32 bits) showed not a vast difference from slowest to fastest, but nevertheless a measurable one - Ints come out fractionally faster (by a matter of milliseconds) when huge numbers of tiny operations (like increments and decrements) are happening in a tight loop.
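A test along these lines could be put together with System.Diagnostics.Stopwatch. This is only a sketch of the approach, not the tests actually used; the class and method names are invented, and the timings will vary with hardware, JIT optimisations and build configuration, so no particular result is claimed.

```csharp
using System;
using System.Diagnostics;

public static class IncrementBenchmarkSketch
{
    // Times a tight loop of increments on a Byte, relying on wrap-around.
    public static long TimeByteIncrements(int iterations, out byte result)
    {
        byte value = 0;
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            unchecked { value++; }
        }
        timer.Stop();
        result = value; // returned so the loop's work is actually used
        return timer.ElapsedMilliseconds;
    }

    // Times the same loop on an Int, wrapping manually with an AND mask.
    public static long TimeIntIncrements(int iterations, out int result)
    {
        int value = 0;
        Stopwatch timer = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            value = (value + 1) & 0xFF;
        }
        timer.Stop();
        result = value;
        return timer.ElapsedMilliseconds;
    }
}
```

Running both methods with the same large iteration count, several times over in a release build, gives a rough comparison; both loops should end on the same final value, which is a handy sanity check that they are doing equivalent work.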
So right now I've suspended work on instruction implementation to go back and change all my Bytes to Ints. This gives me a marginal performance improvement in itself, but also means I never have to cast down to Byte during instruction execution - another performance gain. The downside is that I have to do the value overflow and underflow handling myself, but this is actually quite a quick operation (we just AND the value with 255 when changing it) and is still faster overall than using Bytes.
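The manual wrap-around handling can be sketched like this. As before, this is an illustration under assumptions rather than the emulator's real code: IntRegisterSketch, Wrap8 and the simplified binary-mode Adc are hypothetical names, and the carry is represented as an Int holding 0 or 1.

```csharp
using System;

public static class IntRegisterSketch
{
    // Keeps only the low 8 bits, so 256 becomes 0 and -1 becomes 255.
    public static int Wrap8(int value)
    {
        return value & 0xFF;
    }

    // Hypothetical Int-based ADC (binary mode only, overflow flag ignored).
    // No casts are needed: everything stays an Int from start to finish.
    public static int Adc(int accumulator, int operand, int carryIn, out int carryOut)
    {
        int sum = accumulator + operand + carryIn;
        carryOut = sum > 0xFF ? 1 : 0;
        return sum & 0xFF;
    }
}
```

Here Wrap8(256) gives 0 and Wrap8(-1) gives 255, matching what a Byte would have done automatically, and Adc(200, 100, 1, ...) returns 45 with the carry set, without a single cast.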