So, I was trying to get a wxPerl wrapper around the wxMozilla widget, which is a Wx wrapper for an embedded Mozilla rendering engine. Should be simple, right?
Hah, I wish.
I got lucky, in a way, starting the module on my Linux box. It turns out that Linux is the only place that the Mozilla embedding interface actually works. Everywhere else (or at least on OS X and Windows) it doesn't. At all. Apparently the Mozilla interface kept changing, leaving the wxMozilla folks dancing to try and keep up, and it just wasn't worth it. (I found this when I tried getting it all working on OS X -- the embedding interface had a bunch of changes between Firefox 1.0.x and 1.5. wxMozilla worked with the 1.0.x version of the embedding interface, but Firefox 1.0.x won't compile at all on OS X 10.4.)
Anyway. Wx::Mozilla, the Perl module to let you use Mozilla's HTML rendering widget in a wxPerl program, is up to version 0.05, and on CPAN. It only works on Linux, unless by some miracle someone hacks up the wxMozilla code to build on OS X or Windows. (Yes, I know, there's a Windows installer for wxMozilla -- it doesn't work.)
Wx::WebKit's on CPAN too, for folks that want to play with OS X's native WebKit HTML engine from within wxPerl programs.
Nah, this has nothing at all to do with compilers or virtual machines or anything, but my $CONSULT_JOB needs a good cross-platform HTML widget with JavaScript capabilities to use in a wxPerl-based app. There isn't one, of course, so I went ahead and wrapped the wxMozilla widget for use in Perl with the Wx toolkit. The result, Wx::Mozilla 0.03, is on CPAN if you need that sort of thing.
I think it works -- I can instantiate the widget, sling things at it, get data back, and generally fiddle with it, but I'm far from a Wx hacker, so I'm not actually 100% sure it's working. (I have no idea how to set the size of a widget contained in another widget, for example. Oh, the shame.... :) If it's the sort of thing you might need, or if you do Wx perl hacking for fun (or money) and want to grab it and give it a go, I'd certainly appreciate it.
And no, I've not got it working on Windows, just on my Debian Linux system. Haven't yet figured out how to get the Windows Wx build thingie to find the DLLs and header files to make it work. I do Windows even less than I do Wx... (So if you hack that up, I'd love ya for it too.)
Y'know, there are days I realize I should just give up trying to avoid clever things I don't want to do and just do them and get it over with.
This is one of those days. I'm working basic math into the Tornado engine this afternoon. Nothing fancy, scalars only, but I've got four basic data types: 4 and 8 byte integers, 8 byte floats, and bignums. (Currently bignums are punted to the GNU MP library, but that'll go away in a while because of licensing issues.) The result of an operation takes the larger of the two operand types, so an Int4+Int8 produces an Int8, and an Int4+Float8 gives a Float8. (Yes, I know, there are issues of dropping bits with floats so it's not quite the same, but if you've got one float in the math already you've pretty much guaranteed things are fuzzy.)
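A minimal sketch of that "largest type wins" rule, in Python purely for illustration (it is not the engine's code). The names `NumType` and `result_type` are made up here, and ranking bignums above floats is an assumption: the only cases actually stated above are Int4+Int8 and Int4+Float8.

```python
from enum import IntEnum


class NumType(IntEnum):
    """Numeric types ordered from 'smallest' to 'largest'.

    Assumption: BIGNUM ranks above FLOAT8. Only the INT4/INT8/FLOAT8
    ordering is implied by the examples in the text.
    """
    INT4 = 1
    INT8 = 2
    FLOAT8 = 3
    BIGNUM = 4


def result_type(left: NumType, right: NumType) -> NumType:
    """Return the type a binary math op should produce: the larger operand type."""
    return max(left, right)
```

Because the ordering lives in one place, changing where bignums sit (or adding a fifth type) is a one-line edit rather than a hunt through every operator.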
So I'm poking around in the files here and start throwing the code in. Data types all have to be figured out at runtime, since I don't want to annotate the bytecode enough to make sure that compilers aren't lying about the code that's produced (and I'm not comfortable designing an annotation system that's secure in the face of people trying to be evil), which means... well, it means switch statements. Lots of 'em. Or if/else ladders, but that's pretty much the same difference. And that's just nasty.
I really didn't want to put in MMD (multimethod dispatch, where the code that runs is picked based on the types of all the operands). I'd decided not to, actually, because of the basic type system that Tornado has. (Not even any objects here.) But even in the face of just a handful of types, it's easier to bite the bullet and go MMD. Which I didn't want to do.
Dammit, this sucks. (Even if it is ultimately handy.) It'll probably be a lot more interesting when it comes time to deal with the vectors, though.
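For anyone who hasn't run into MMD before, here's a rough sketch of the idea: instead of a switch inside a switch, the implementation is looked up in a table keyed on both operand types. This is Python for illustration only, and every name in it (`Int4`, `Int8`, `Float8`, `ADD_TABLE`, `add_impl`, `add`) is hypothetical, not something from the engine. Overflow handling is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int4:
    value: int


@dataclass(frozen=True)
class Int8:
    value: int


@dataclass(frozen=True)
class Float8:
    value: float


# Dispatch table: (left type, right type) -> function implementing addition.
ADD_TABLE = {}


def add_impl(left_type, right_type):
    """Decorator that registers an addition routine for one pair of types."""
    def register(func):
        ADD_TABLE[(left_type, right_type)] = func
        return func
    return register


@add_impl(Int4, Int4)
def _add_i4_i4(a, b):
    return Int4(a.value + b.value)  # overflow ignored in this sketch


@add_impl(Int4, Int8)
def _add_i4_i8(a, b):
    return Int8(a.value + b.value)


@add_impl(Int8, Int4)
def _add_i8_i4(a, b):
    return Int8(a.value + b.value)


@add_impl(Int4, Float8)
def _add_i4_f8(a, b):
    return Float8(a.value + b.value)


def add(a, b):
    """Pick the routine from the runtime types of BOTH operands, then call it."""
    try:
        impl = ADD_TABLE[(type(a), type(b))]
    except KeyError:
        raise TypeError(
            f"no add for {type(a).__name__} + {type(b).__name__}"
        ) from None
    return impl(a, b)
```

Adding a new type means registering new table entries, not editing every existing switch, which is most of the appeal even with only a handful of types.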
One thing that always caused me no end of headaches was dealing with data and threads in Parrot. Parrot was designed to be a mostly single-threaded system with the option to spawn off multiple threads, mostly because the languages that Parrot was dealing with were, for the most part, single-threaded, and where they were multithreaded they were multithreaded really badly. Tornado's different -- it's supposed to be massively multithreaded, which definitely changes the way you want to look at data.
In a mostly single-threaded case you want mutable data, since that's fastest. Having to create a new X every time you modify the old X gets slow. Lots of time's spent allocating data, copying data, and cleaning up the old copies of data. Bleah. (The Java folks are well aware of this.) The downside there is that if you've data that multiple threads share you need to lock that data even if you're not going to modify it, since you can't be sure that another thread isn't going to modify it.
In the multithreaded case, you want immutable data, at least as much as possible. Multiple threads can access the data safely with no synchronization, since there's no way the data can change, and that's fine.
The choice of mutable vs immutable data really is one of relative threadedness -- the more multithreaded you are, the more the synchronization costs outweigh the copying and GC costs. Basically you've got to take an educated guess about how things are going to work (or trot out your particular dogma) and choose. Then you get to deal with the repercussions of that choice.
For Tornado, we're going with basically immutable data. Machine operations generate new data elements, rather than reusing old ones, and variable actions are all explicitly stateless. (That is, you fetch data out of the variable store, but once you do, the data you've fetched has no attachment to the variable name any more. Another thread could store something different into that variable and you'd not know, which is different than what Parrot did.) That simplifies the engine design a lot, and removes many potential race conditions, which is a Good Thing.
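Here's a small sketch of what "immutable data, stateless variables" looks like in practice. It's Python for illustration, not the engine's code, and `Value`, `VariableStore`, and `add_values` are invented names. The lock around the name-to-value table is also an assumption of this sketch: the point above is only that the data elements themselves need no synchronization.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """An immutable data element; operations build new ones, never mutate."""
    payload: int


class VariableStore:
    """Maps names to values. A fetch hands back the element and nothing more."""

    def __init__(self):
        self._slots = {}
        # Assumption: only the name -> element binding is guarded.
        self._lock = threading.Lock()

    def store(self, name, value):
        with self._lock:
            self._slots[name] = value

    def fetch(self, name):
        with self._lock:
            return self._slots[name]


def add_values(a, b):
    """A 'machine operation': produces a brand new element from its inputs."""
    return Value(a.payload + b.payload)
```

Once a thread has fetched a `Value`, a later `store` under the same name by some other thread rebinds the name but can't touch the element already in hand, so the fetching thread can keep using it without any lock.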
Interestingly, this also means that the engine's pretty inherently SSA (static single assignment), which makes a number of bytecode optimizations possible that weren't with Parrot, because Parrot had such a massive number of potential side-effects. That's a separate topic, though.
Tornado's also going with a generic by-reference data scheme. That is, all data is accessed by reference, and a data element has a type and a payload component. That makes things a little simpler, since there's only one 'real' data type, a pointer to a data element. Going for variable-sized data elements (4 byte int, 8 byte int, float, string, whatever) would allow things to be a smidge faster, but we're constrained there by our guarantees of safety as an embedded engine. That means we've got to be a little paranoid for now: we can't guarantee that a bytecode program is correctly generated, and we don't have the tools that'd let us run code unfettered and trap illegal operations. (Damn that lack of MMU and kernel mode access!)
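To make the "type plus payload, and be paranoid" idea concrete, here's a hedged sketch in Python. None of these names (`Tag`, `DataElement`, `IllegalOperation`, `op_concat`) come from the engine, and a string concatenation op is just a convenient example. The point is that an operation checks the type tag itself before it touches the payload, rather than trusting the bytecode to have gotten it right.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Tag(Enum):
    INT4 = auto()
    INT8 = auto()
    FLOAT8 = auto()
    STRING = auto()


@dataclass(frozen=True)
class DataElement:
    """The one 'real' data type: everything is a reference to one of these."""
    tag: Tag
    payload: object


class IllegalOperation(Exception):
    """Raised when an op is handed operands of a type it can't work on."""


def op_concat(left: DataElement, right: DataElement) -> DataElement:
    """Concatenate two strings, verifying the tags at runtime first."""
    for operand in (left, right):
        if operand.tag is not Tag.STRING:
            raise IllegalOperation(
                f"concat needs STRING operands, got {operand.tag.name}"
            )
    return DataElement(Tag.STRING, left.payload + right.payload)
```

The check costs a little on every operation, which is the "smidge faster" being given up, but a badly generated or malicious program gets a clean error instead of having its payload misread.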
That's OK, we can still manage.