After I wrote that earlier post, I spent the rest of the day in an Arthur Dent sort of mode - things kept wandering around in my head looking for other things to connect with. Mainly to do with the timing metrics I was talking about, and the rather nice improvement of the emulated clock speed from 2.4MHz to 7MHz. As is so often the case, having made such a gain, I wasn't satisfied. I kept thinking that although 7MHz was pretty good, it still on reflection seemed a little sluggish for what is a fairly simple linear C# program running on a dual-core multi-gigahertz PC. Maybe my timing mechanism was a bit inaccurate...?
I thought about what it was I was measuring: a simple counter named coreCycles, incrementing once each pass through the main loop. Seems pretty straightforward - every 100ms I stash the counter and reset it, and then at the end of the run I average out the last 10 stashed values, divide by a cool million, and thus we have millions of cycles per second, i.e. MHz. Doing that, I'm now coming out at 6.977MHz, or just shy of 7MHz.
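For the curious, here's a rough sketch of that arithmetic. It's in Python rather than the emulator's C#, purely for brevity, and the names (estimate_mhz, SAMPLES_PER_SECOND, WINDOW) are made up for illustration rather than lifted from the emulator. It also assumes the 100ms average gets scaled up to a full second before the divide-by-a-million step.

```python
SAMPLES_PER_SECOND = 10   # one sample every 100 ms (assumed sampling rate)
WINDOW = 10               # number of most recent samples to average


def estimate_mhz(samples, samples_per_second=SAMPLES_PER_SECOND, window=WINDOW):
    """Average the most recent counter samples and express the rate in MHz.

    `samples` holds the counter values stashed at each sampling interval.
    """
    recent = list(samples)[-window:]
    if not recent:
        return 0.0
    per_interval = sum(recent) / len(recent)        # counts per 100 ms slice
    per_second = per_interval * samples_per_second  # scale up to one second
    return per_second / 1_000_000                   # "a cool million" -> MHz
```

Feed it ten invented samples of 697,700 counts each and it comes back with roughly 6.977.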
But hold on - I also mentioned previously that the future intent is to account for the actual number of cycles each instruction really needs to execute. Which means that right now, what I'm effectively counting is just one cycle per instruction - the fetch cycle in which we pull the opcode from memory, for example. So rather than counting cycles per second, what I'm actually counting is instructions per second! Looking at it from that angle, 7 million instructions per second isn't bad.
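To make the distinction concrete, here's a toy sketch (again in Python for brevity, not the emulator's C#). The mnemonics and cycle costs in CYCLE_COSTS are invented for illustration and don't belong to any particular CPU. The only point is that a once-per-loop counter and a cycle-aware counter drift apart as soon as instructions cost more than one cycle each.

```python
# Hypothetical per-instruction cycle costs; names and numbers are made up.
CYCLE_COSTS = {"LOAD": 2, "STORE": 4, "JUMP": 3, "NOP": 2}


def run(program):
    """Walk a list of mnemonics and return (instructions, cycles)."""
    instructions = 0
    cycles = 0
    for op in program:
        instructions += 1          # what a once-per-loop counter measures
        cycles += CYCLE_COSTS[op]  # what a cycle-aware counter measures
    return instructions, cycles


instructions, cycles = run(["LOAD", "STORE", "JUMP", "NOP"])
print(instructions, cycles)  # 4 instructions, 11 cycles
```

Same loop, same wall-clock time, but the second counter ticks several times faster than the first.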
And when I plugged in the actual clock cycles per instruction, things looked even more interesting. Now that I'm accounting for the right number of cycles per instruction (minus the oddities like branches needing one more), the counter is incrementing properly. And after a few executions of the emulator to get an average, our actual, proper, accurate tally of cycles per second gives us an emulated clock speed of...
.
.
40MHz
..
.
...I'm happy with that. It'll do. ;)

3 comments:
So if 'woot' was the word last time, when you got '7MHz', what is it now?
I think we'll definitely have to upgrade to 'l33t' or something... ;)
awesome :D