I've been working on $CONSULTING_GIG (ex-$WORK_GIG)'s compiler today, digging into a performance issue that it's got. It doesn't show in too many places -- most of the forms are really snappy -- but when it does... ouch.
The problem, in this case, is one of constants.
Now, constants are one of those facts of life, and that's fine. I rather like them, and given a bit of a chance there are all sorts of interesting things a compiler can do with them. Other than some simple constant folding, I didn't do any of those things, mostly for time reasons.
The way I implemented constants was pretty straightforward, since that's easiest to debug. Because all the binary operations had to be overloaded, and I didn't want an explosion of overload functions, the compiler generates PMCs of the appropriate language type for the constants. That's certainly quite common -- just because the source says:
a + 4
you don't necessarily want a low-level integer for that 4, since you may have to allow people to overload operations in the integer class and have that affect integer constants. (You can do this in Ruby, for example.) This is no big deal.
What I'd done was teach the compiler to emit the appropriate code to create a new PMC and give it a value before each use, so the above statement would read in PIR something like:
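I16 = find_type "Int32"   # look up the type number for Int32
P16 = new I16             # fresh PMC for the constant...
P16 = 4                   # ...which then gets its value
P17 = global 'a'          # fetch the variable a by name
P18 = new I16             # fresh PMC to hold the result
P18 = P16 + P17           # do the (overloadable) add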
and go on from there. No big deal. Except... my victim program of the moment was taking something like two seconds a line item to pull up and display data, which was nasty. Yeah, sure, that was accessing a Postgres database hosted on a Windows box over an 802.11b wireless link, but still, nasty. (It was 5 seconds a line item before I optimized the generated SQL. Postgres has a hard time using an index for "field LIKE $1", since the query optimizer gets the query to optimize but not the values of the placeholder variables, so things get messy. But that's a separate issue.)
Since nothing big and obvious was taking all the time, it was time to start in on the hopeful stuff and see where we went. In this case, constant creation seemed an obvious spot to look. Besides the cost of creating all the new PMCs every time, there's the disposal cost to allow for as well, since all those temp values need to be cleaned up by the garbage collector. Which it does, and reasonably well, but it's a lot faster to not do something than to do it.
The change was to create each constant PMC once, stash it in the global namespace, and have the generated code fetch it by name at each use. The worry was that creation would be cheap enough to cost less than looking the constants up in the global namespace, which is a definite worry as Parrot's hashing code has been sub-swell in the past. Still, it only took a day to make the changes and test everything out to make sure it worked.
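To make the trade-off concrete, here's a rough model of the two strategies in Python. It is not PIR and it is not what the compiler actually emits: the Int32 class, the GLOBALS dictionary, and the "const_4" name are all made up for the sketch. All it shows is where the allocations go.

```python
# Toy model only: Int32 and GLOBALS are stand-ins, not Parrot APIs.
class Int32:
    created = 0  # counts allocations so the two strategies can be compared

    def __init__(self, value):
        Int32.created += 1
        self.value = value

    def __add__(self, other):
        # The result is a fresh object under either strategy.
        return Int32(self.value + other.value)


GLOBALS = {"a": Int32(10)}


def add_fresh_constant():
    """Original scheme: build a new constant object on every use."""
    four = Int32(4)
    return GLOBALS["a"] + four


def add_shared_constant():
    """Reworked scheme: the constant was created once; fetch it by name."""
    return GLOBALS["a"] + GLOBALS["const_4"]


def allocations(fn, times):
    """Run fn `times` times and report how many Int32 objects it created."""
    before = Int32.created
    for _ in range(times):
        fn()
    return Int32.created - before


GLOBALS["const_4"] = Int32(4)  # one-time setup, paid once up front

fresh = allocations(add_fresh_constant, 1000)
shared = allocations(add_shared_constant, 1000)
```

In this model the shared version creates half as many objects per evaluation, at the price of one extra by-name lookup. Whether that's a win in a real runtime depends on how the lookup cost compares with allocation plus garbage collection, which is exactly the thing that had to be measured.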
Net result? Reusing constants rather than recreating them cut the time the victim program took to load up its line items from 17 seconds down to 11. Not bad, all things considered, and definitely surprising. I'd expected to shave off a second or so, maybe, which still would've been worth the effort, but not 6. That was nice.
So, net thing learned? Pre-generating constants is often worth it. (Now I'd love to find out what I could shave off the run time by changing all those by-name lookups to by-offset lookups...)
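For what it's worth, the by-name versus by-offset difference looks something like this, again as a Python sketch rather than anything Parrot-specific. The names and the slot layout are invented for illustration.

```python
# Hypothetical illustration: names and slot layout are invented for this sketch.
names = ["a", "const_4"]

# By-name: every access hashes the string and probes a table at run time.
by_name = {"a": 10, "const_4": 4}

# By-offset: the compiler resolves each name to a slot index once, up front,
# and the generated code only ever indexes into a flat list.
slot_of = {name: i for i, name in enumerate(names)}
slots = [by_name[name] for name in names]

A, CONST_4 = slot_of["a"], slot_of["const_4"]  # baked into the generated code

result_by_name = by_name["a"] + by_name["const_4"]
result_by_offset = slots[A] + slots[CONST_4]
```

Both give the same answer; the second just moves the hashing work from every use to a single resolution step.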