I've been refactoring some of the deepest layers of the emulator over the last few days, prompted by an increasing dissatisfaction with the encapsulation of various data structures that the CPU logic needs in order to be able to do anything meaningful. As an example, let's consider the memory emulation object which presents a memory map to the CPU for it to do stuff with.
MemoryMap is a pretty simple class - an array of MemoryCell objects (which describe the type of memory - RAM/ROM/etc - and some other metadata, together with the actual content of the cell) and a variety of methods to get and set these properties. There are also a couple of routines to do things like dump the array into a readable form for debugging, and provide a mechanism for loading data into chunks of cells during initialisation.
The problem I had was that most of this was exposed publicly, because a lot of my work-in-progress testing needed me to be able to get into the object from way up the application in the topmost Main() test area. It all worked, but it was bad practice, and I needed to sort it out. So I tore MemoryMap to pieces and put it back together in a nice encapsulated way, with everything either Private or Internal apart from the accessor methods, and then marked the class as Sealed. And whilst I was doing that, I took the opportunity to rework some of the functionality so that things like formatted output and bulk load initialisation are now done outside the class via calls into it. I also implemented an Indexer on the object, which makes the most common form of access (i.e. to read and write cell contents) a whole lot nicer.
Here's how it used to work: accesses to memory cell contents were done through bespoke methods (called, for geeky amusement, PEEK, POKE, DEEK and DOKE) which took as parameters the location and value (for POKE/DOKE) to set the contents, or just a location (for PEEK/DEEK) to return the contents. PEEK and POKE handled 8-bit byte values, and DEEK and DOKE accepted (or returned) 16-bit byte pairs and then called PEEK/POKE to do the two byte accesses as needed. To access a memory location, the CPU core logic would issue one of these:
_memory.Poke(location, value8);
_memory.Doke(location, value16);
int value8 = _memory.Peek(location);
int value16 = _memory.Deek(location);
All well and good, but the new Indexer makes this a thing of the past. An indexer basically lets the object it belongs to be accessed with array-style index syntax. So accesses to the memory cell contents now look like this:
_memory[location] = value8;
_memory[(long)location] = value16;
int value8 = _memory[location];
int value16 = _memory[(long)location];
I actually have two indexers, one which accepts an INT index, and one which accepts a LONG. Refer to the memory object with an INT and you do a single-byte access; use a LONG instead and you do a double-byte access. The bespoke methods are gone, and naturally this all works a good deal faster. How fast? Well, in order to get a meaningful metric, I had to run a hundred million iterations of a read/write combination of accesses - yes, 100,000,000. Single-byte accesses took 1.25 seconds, and double-byte accesses took 3.12 seconds. Here's what the code looks like:
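For illustration only, and not the emulator's actual source, a pair of indexers of this kind might be sketched as below. The class name `MemoryMapSketch`, the `_cells` array of bare bytes (standing in for the MemoryCell objects) and the low-byte-first order of the double-byte access are all assumptions made for the example.

```csharp
using System;

// Hypothetical stand-in for MemoryMap: each cell is reduced to a bare byte.
internal sealed class MemoryMapSketch
{
    private readonly byte[] _cells = new byte[0x10000]; // 64K address space

    // int index: single-byte access. No validation, as described in the post.
    public int this[int location]
    {
        get { return _cells[location]; }
        set { _cells[location] = (byte)(value & 0xFF); }
    }

    // long index: double-byte access, low byte first (an assumed byte order).
    public int this[long location]
    {
        get { return _cells[location] | (_cells[location + 1] << 8); }
        set
        {
            _cells[location] = (byte)(value & 0xFF);
            _cells[location + 1] = (byte)((value >> 8) & 0xFF);
        }
    }
}
```

With this shape the compiler picks the overload from the static type of the index, so `m[loc]` touches one byte and `m[(long)loc]` touches two. Note that in this sketch a double-byte access at the very last location would run off the end of the array.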

Now whilst I was doing this refactoring work, I was testing performance as I went along to see whether the new code was slower, faster, or the same as the old. It became apparent pretty quickly that it was faster, which was nice, but I noticed something else as well - the validation I was doing on the index value incurred a notable overhead. The principal objective of the validation was to prevent accesses to the underlying MemoryCell array with out-of-bounds values - i.e. given a standard 64K memory map, we would not want location (index) values lower than zero or higher than 65,535. So we need to either filter them out and return some sort of error code, or just have them 'wrap around' so that location 65,536 is interpreted as zero, 65,537 as one, and so on.
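The two treatments of an out-of-bounds location can be sketched as a pair of small helpers. This is only an illustration: `AddressFilter`, `Wrap` and `IsInRange` are hypothetical names, and the masking trick relies on the memory size being a power of two.

```csharp
internal static class AddressFilter
{
    public const int MemorySize = 0x10000; // 65,536 cells in a 64K map

    // Wrap: keep only the low 16 bits, so 65,536 becomes 0 and 65,537 becomes 1.
    public static int Wrap(int location)
    {
        return location & (MemorySize - 1);
    }

    // Reject: report whether the location is usable instead of altering it.
    public static bool IsInRange(int location)
    {
        return location >= 0 && location < MemorySize;
    }
}
```

Under this masking scheme a negative location wraps too, so -1 lands on the last cell, 65,535.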
There are two ways of doing this - we either push all incoming location values through a filter method, to reject or wrap out-of-bounds values, or we allow the access to happen with whatever value we're given but trap any resultant IndexOutOfRangeException that the CLR spits out. Both have merits and drawbacks - you can argue for either approach with an equally strong case - but they share a common issue in that they slow down access to the array. In some performance tests I did, these validation techniques increased array access time by up to 100% depending on the variation I was timing.
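As a rough side-by-side of those two approaches (a sketch under assumptions, not the code that was timed): `GuardedMemorySketch` and its method names are hypothetical, the filter shown is the wrap-around variant, and returning -1 as the error code is an arbitrary choice for the example.

```csharp
using System;

internal sealed class GuardedMemorySketch
{
    private readonly byte[] _cells = new byte[0x10000];

    // Approach 1: filter the location first (here by wrapping it into range).
    public int ReadFiltered(int location)
    {
        return _cells[location & 0xFFFF];
    }

    // Approach 2: attempt the access and trap the exception the CLR throws.
    public int ReadTrapped(int location)
    {
        try
        {
            return _cells[location];
        }
        catch (IndexOutOfRangeException)
        {
            return -1; // error code chosen only for this sketch
        }
    }

    // Writes use the filtered form so the example has something to read back.
    public void Write(int location, int value)
    {
        _cells[location & 0xFFFF] = (byte)(value & 0xFF);
    }
}
```

Either way, extra work sits on every access path, which is where the overhead comes from.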
Which brings us to the elementary question that so often troubles software engineers these days: how much code should I write to create an iron-clad, bullet-proof interface that won't allow nonsense values through to the delicate structures underneath, whilst at the same time maintaining a high level of performance? There are, of course, several schools of thought, and it often depends on the nature of the application (and the specific area of that application) in question:
1. Wrap it in steel. Validation, exception-handling, fail-safe defaults, you name it. Take advantage of whatever you can make use of in your language of choice to make it impossible (or at least very difficult) for a user of your code to accidentally (or deliberately) break something and either pervert the execution of the code, or crash it altogether. Performance is a secondary concern - if the code is fragile, performance will be the least of your worries.
2. Wrap the surface layers in steel, because that's where strangers will be interfacing with your code. Stop them from doing naughty things through that interface, and try to extend the security and safety features all the way down to the deepest levels of the code. Any time something breaks, manage the error in such a way that the software can either recover from it, or exit gracefully and tell someone what went wrong.
3. As point two above, but in specific controlled conditions where you have 99% confidence of the state of the system, and where performance is critical, relax the regulations a bit. So if you have a method deep inside your code that is nowhere near the exposed interface, which takes two values as parameters from another method that has already validated them, and has to do something to those values as fast as possible, then it might be OK to skip further validation or exception-handling because the chances of an error are very slim.
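Option three might look something like the following sketch, where an outer method validates its arguments once and an inner method trusts what it is given. All of the names here (`LayeredAccessSketch`, `Load`, `WriteUnchecked`, `Read`) are hypothetical and chosen only for the example.

```csharp
using System;

internal sealed class LayeredAccessSketch
{
    private readonly byte[] _cells = new byte[0x10000];

    // Outer layer: validates once and fails loudly on bad input.
    public void Load(int start, byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (start < 0 || start > _cells.Length - data.Length)
            throw new ArgumentOutOfRangeException(nameof(start));

        for (int i = 0; i < data.Length; i++)
            WriteUnchecked(start + i, data[i]);
    }

    // Inner layer: adds no validation of its own, because the only caller
    // has already proved the whole range is valid.
    private void WriteUnchecked(int location, byte value)
    {
        _cells[location] = value;
    }

    public int Read(int location)
    {
        return _cells[location];
    }
}
```

The inner method being private is what makes the trust reasonable: nothing outside the class can reach it with an unvalidated value.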
It's a tricky one. In my day job, option one applies unquestionably. Money rests on my software doing its job, and although performance is nice, it's better to be right than fast in most cases. And anyway, 'fast' is a relative term, and most business software can be classed as fast if it delivers the result in a second or two.
But way down in my MemoryMap class, where a millisecond is an eternity, things are different. Here we have to be as fast as we can, and the design of the entire software structure should be able to rely on that speed. The quid pro quo is that the speed is dependent on known states, and that means no passing values that are out of bounds. It's a kind of contract - MemoryMap says it'll process memory accesses as fast as it possibly can, in exchange for nice index values. Play nasty, and MemoryMap will have the rest of the system crashing down before you can say 'BRK'.
The upshot is that in order to get maximum speed out of the class, I've had to make an exception (haha, see what I did there?) to my rule, and forget exception handling and validation. Fortunately, I know that index values for memory accesses will only be coming via the CPU core, and therefore will be through 8-bit or 16-bit register operations, and are consequently guaranteed to be 'in range'. Equally, this class is not public and cannot be inherited, so there's no danger of anything other than the CPU core talking to it.
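One way to picture why register-sized values are safe is to let the type carry the guarantee. The sketch below is an assumption-laden illustration rather than the emulator's design: it uses `ushort` for addresses, and `RegisterAddressedMemorySketch` and `RegisterMath` are hypothetical names.

```csharp
internal sealed class RegisterAddressedMemorySketch
{
    private readonly byte[] _cells = new byte[0x10000]; // indices 0..65,535

    // A ushort can only hold 0..65,535, so every possible value is a valid index.
    public byte Read(ushort address)
    {
        return _cells[address];
    }

    public void Write(ushort address, byte value)
    {
        _cells[address] = value;
    }
}

internal static class RegisterMath
{
    // 16-bit register increment: wraps from 0xFFFF back to 0x0000.
    public static ushort Increment(ushort register)
    {
        return unchecked((ushort)(register + 1));
    }
}
```

With addresses typed this way there is no value a caller can pass that falls outside the array, so no explicit range check is needed for single-byte accesses.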
It's a risk, but a calculated one.

1 comment:
Another interesting post about the development of your emulator, and really great to see a code snippet in there as well. Really fascinating stuff.