Parrot was supposed to have a fully layered, asynchronous IO system, with completion callbacks and layers, something akin to a cross between the SysV (or Tcl) streams system and VMS's asynchronous IO system.
That is, when your program posts a read or write to a filehandle, the IO request is put into the filehandle's queue and control returns to your program right away; the system services the request behind the scenes and lets you know when it's done.
Why build in an async IO system from the beginning? Heck, why do async IO in the first place?
The reason to do async IO is simple: performance. You'll get significantly better throughput from a properly written async-IO-capable program than from one that does synchronous IO. ("Significantly better" here being a factor of three or four in some cases; it depends on the hardware to some extent) The nice thing is that, unless you've got a horrifically CPU-intensive program, you'll get a performance win. (More detail's really the subject of a "what the heck" post of its own, so I'll stop here)
There are two reasons to build one into parrot in the first place. The first is that while it's pretty straightforward to build a synchronous IO system on top of an async one, it's a big pain to build an async one on top of a synchronous system. (And yes, given how ubiquitous async IO systems aren't, I know that parrot would've had to do this on some platforms)
The second reason's a practical one -- if one wasn't built we stood a good chance of people building three or four (or five, or six, or...) async systems, all of which were incompatible. You can see this in perl now (and in other languages I expect) with event loops -- every GUI's got one, there are usually a few different generic event loops, and an async IO system or two thrown into the mix. It's all messy. (Async IO and event loops don't have to go together, but they can, and usually do, even if it's not obvious in the programming interface that's exposed) Having one standard system would've meant that anyone who needed some sort of asynchronous system (for IO, events, or whatever) could just tie into the standard one, rather than write their own. Easier for the people writing the interface, since there's less code to write, and easier for people using it, since things would mix and match better.
The basic IO system was going to be a standard completion callback with marker system. That is, you could attach a sub to each IO request, and the system would call the sub when the IO request was complete. Each async IO request would also have a marker you'd get back that you could use to figure out what state the request was in -- was it pending, or complete, or in the process of having its completion routine run, or whatever. It was going to tie into the event system under the hood, but that's a post for another day.
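The shape of that completion-callback-with-marker scheme can be sketched out. This is a minimal Python illustration of the idea, not parrot's actual API (which was C); all the names here -- `IORequest`, `AsyncFileHandle`, `pump` -- are made up for the example.

```python
# Possible states an IO request marker can be in.
PENDING, IN_CALLBACK, COMPLETE = "pending", "in-callback", "complete"

class IORequest:
    """The marker you get back when you post an async IO request."""
    def __init__(self, callback):
        self.callback = callback
        self.state = PENDING
        self.result = None

class AsyncFileHandle:
    """Each filehandle keeps its own queue of outstanding requests."""
    def __init__(self, data):
        self.data = data
        self.queue = []

    def post_read(self, nbytes, callback):
        req = IORequest(callback)
        self.queue.append((req, nbytes))
        return req  # caller can poll req.state while doing other work

    def pump(self):
        """Service queued requests; a real system would do this in the background."""
        while self.queue:
            req, nbytes = self.queue.pop(0)
            req.result, self.data = self.data[:nbytes], self.data[nbytes:]
            req.state = IN_CALLBACK
            req.callback(req)  # the completion callback fires here
            req.state = COMPLETE
```

The point of the marker is the bit the callback alone doesn't give you: code that posted the request can check `req.state` at any time and decide whether to wait, carry on, or cancel.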
Pity, that. Would've been cool, and could've made for some screaming throughput. Ah, well, such is life.
This was one of the things I really wanted to get into parrot. Granted, mainly to support playing Zork, but as a side-effect we would've gotten the capability to load in JVM and .NET bytecode, along with python bytecode, and for the really adventurous platform-native executables. (Or even better, executables for other platforms)
What I'm talking about here is giving the bytecode loading system of parrot the capability to have special-purpose bytecode loading libraries which can be loaded at runtime, as well as the capability of detecting what type of bytecode is being loaded and handing it off to the right loader. Parrot does have a version of this built in, but it's relatively rudimentary.
What I wanted to do was have a general-purpose mechanism in place to allow registering a loader and the conditions under which it would fire, and then have parrot walk the list of loaders every time a file was loaded up. This was already sort of necessary, and sort of implemented, to dispatch based on the extension of the file loaded in -- that's how parrot manages to handle bytecode, pasm, and pir files transparently. It's a bit hardcoded, though, and I'd rather it wasn't.
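A registry like that doesn't need much machinery: a list of (condition, loader) pairs that gets walked for every file loaded. Here's a hedged Python sketch of the general mechanism; the predicates and loaders are illustrative stand-ins, not parrot's real dispatch code. (The `0xCAFEBABE` magic number is the real JVM class-file signature, included just to show dispatch on content rather than extension.)

```python
class LoaderRegistry:
    """Walk a list of (predicate, loader) pairs for each file loaded."""
    def __init__(self):
        self.loaders = []

    def register(self, predicate, loader):
        self.loaders.append((predicate, loader))

    def load(self, filename, data):
        for predicate, loader in self.loaders:
            if predicate(filename, data):
                return loader(filename, data)
        raise ValueError("no loader for %s" % filename)

registry = LoaderRegistry()
# Dispatch on extension, the way parrot handled .pasm and .pir files...
registry.register(lambda name, data: name.endswith(".pir"),
                  lambda name, data: ("pir", data))
# ...or on magic numbers in the data itself, for foreign bytecode.
registry.register(lambda name, data: data[:4] == b"\xca\xfe\xba\xbe",
                  lambda name, data: ("jvm", data))
```

Runtime additions then fall out for free: loading a new bytecode format is just one more `register` call.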
Why allow runtime additions to the bytecode loading system?
My personal favorite, all jokes aside, is the z-code loader. I fully expect that nobody sane will deploy it in a production environment, but I personally think it'd be really cool to be able to do:
parrot lurkinghorror.dat
and find myself on the campus of good old George Underwood University.
That aside, there are a lot of different bytecode engines out there, and there's no real reason not to be able to do a transform from one to another. Combined with the loadable opcode library facilities that were supposed to go into parrot, there's no reason parrot shouldn't be able to handle other engines' bytecode -- the simplest way is to have a library of opcode functions that exactly match the functionality of the original bytecode and do a transform from the original bytecode to parrot bytecode, something that'll likely be mostly just an 8->32 bit word transform with a little bit of opnumber munging. This is something that's pretty easy for parrot and much less easy for most other VMs, since we have such a huge range of opcodes. Doing it on the JVM or .NET engine would require a more complex transform of the inbound bytecode. (Which isn't a bad thing, of course. It's just a thing)
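To make "an 8->32 bit word transform with a little bit of opnumber munging" concrete, here's a deliberately oversimplified Python sketch. The opcode-number table is invented for the example, and real foreign bytecode would interleave operands with opcodes, which this ignores entirely; the point is only the widen-and-remap shape.

```python
# Hypothetical mapping from a foreign engine's 8-bit opcode numbers to
# parrot opcode numbers (a real table would come with the op library).
OPNUMBER_MAP = {0x01: 1001, 0x02: 1002, 0x0f: 1015}

def transform(foreign_bytecode):
    """Widen 8-bit ops to 32-bit words, remapping known opcode numbers."""
    words = []
    for byte in foreign_bytecode:
        # Munge opnumbers we know about; pass anything else through as-is.
        words.append(OPNUMBER_MAP.get(byte, byte))
    return words
```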
More usefully, if you consider source code just an odd and somewhat densely packed bytecode, it means that all you need for parrot to properly dispatch source to a compiler is a bytecode loader that takes the source, compiles it, and then executes it. Want to handle ruby? Have a bytecode loader that dispatches all the .rb files to the ruby compiler and runs them. Tcl? Same thing. Heck, do it for all the languages that have registered compilers. (Though this would argue for a deferred, just-in-time library loader so you don't pay to load in all the language compilers and bytecode translation modules every time parrot's started, but that's not a big deal)
One of the things that was on the list 'o things to be put into parrot was fully annotatable bytecode. This would allow you, if you were wondering, to associate any number of sections of metadata to your bytecode, and within those metadata segments associate any data you like to individual positions in the bytecode.
Or, more simply, your code can say "given I'm at offset X from the start of my bytecode segment, what data of type Y is associated with it?"
This facility was going to go in to support proper error message handling, so compilers could associate source line numbers and source line text with the bytecode generated for that source -- this way you could have code like:
getmeta P1, [segment_offset,] "line_number"
getmeta P2, [segment_offset,] "filename"
getmeta P3, [segment_offset,] "source_text"
getmeta P4, [segment_offset,] "column"
to get the line number, filename, source text, and column offset for either the current position or the position at the given offset (if it's passed in). Assuming, of course, that there are line_number, filename, source_text, and column metadata segments attached to the current bytecode. (This actually makes me think that exceptions should send along as part of themselves a thingie that represents the point in the bytecode that code can query for metadata. Ought to be able to do it for any continuation as well, for walking back up the call stack)
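The underlying data structure is simple enough to sketch: named metadata segments, each mapping bytecode offsets to values. This is an illustrative Python model of the idea (the class and method names are made up, not parrot's), answering exactly the "offset X, data of type Y" question above.

```python
class BytecodeSegment:
    """Bytecode plus any number of named metadata segments keyed by offset."""
    def __init__(self, code):
        self.code = code
        self.metadata = {}  # segment name -> {offset: value}

    def set_meta(self, offset, name, value):
        self.metadata.setdefault(name, {})[offset] = value

    def get_meta(self, offset, name):
        # "Given I'm at offset X, what data of type Y is associated with it?"
        return self.metadata.get(name, {}).get(offset)

# A compiler would annotate as it emitted code:
seg = BytecodeSegment(code=[])
seg.set_meta(0, "line_number", 12)
seg.set_meta(0, "filename", "foo.pl")
```

Note that a query for a segment that was never attached just comes back empty, which matches the "assuming, of course, that there are ... metadata segments attached" caveat above.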
This is one of those things that I'd really prefer to be an object method thing, since I'm not picturing this ever being particularly speed-critical. (Yes, hell has frozen over -- I'm actually advocating an OO solution) The problem here is that when we need this information we likely don't have an object, or are operating at a level below the object system, so we're stuck welding the capabilities into the interpreter's bytecode system itself.
There was also going to be a corresponding setmetadata op to allow annotating a bytecode segment, though that would generally not be used by anything but a compiler module. (And odds are, given what we were seeing with folks writing compilers, that would mean adding directives to the assembler or PIR compiler, and defined annotations to the AST)
Would've been nicely useful for attaching all that info that compilers want to attach, and that runtimes like to have, without fluffing out the actual executed bytecode.
One of the things planned for parrot was a full notification system. That is, Parrot would support being able to register code that could be called when a PMC was read from, written to, destroyed, when a class instantiated a new PMC, or when an element was inserted into or removed from an aggregate. (Along with a few other things -- methods being added or removed from a class, the MMD tables changing, a PMC being invoked...) Once you register a notification for an action on a thing or class of thing, your notification code gets called whenever that action happens.
The whole point of notifications is to allow you a chance to get your code into the guts of parrot and get access to events you might not otherwise see, and do things in response to those events. There were three big drivers for this one.
First, there's some perl history. One of the common requests in perl 5 (well, not entirely uncommon, at least) is to be able to make the symbol table a tied variable. While there are a number of reasons people want to do this, the biggest is for debugging and monitoring -- they want to know when variables are added or accessed. This is not an unreasonable thing to want to do. Unfortunately it can't be done in perl 5.
Second, there's ruby. Ruby has a number of hooks wedged into it that allow you to install callback functions (well, methods) into a variety of places. These callbacks aren't special-cased -- they're just spots in the various classes where someone decided "hey, it'd be nice to be notified if X happens" and left a method you could override. Useful things, and I wanted more.
Finally, there's the issue of instance variables (or slot variables, or attributes, or properties, depending on your language of choice). You know, those things that every object of a particular class has which every language calls by the name some other language uses for something completely different? Right, them. The problem with those is that, to be efficient, you really want to allocate them all as one big wad at the time your object is created, so you can access the instance variables as an array. That's great, until you find that you've added a new instance variable to a class with instantiated objects. With fully static, compiled languages that just can't happen, but with perl, python, and ruby, well... you can do all sorts of things to classes. When code adds (or removes) an instance variable from a class, you need to stop the world and rejig all the instantiated objects.
So... three general cases wherein something happens in the engine and some sort of watching code needs to do something. Three separate systems is awfully wasteful, and leaving people to bodge up an ad-hoc solution (or, more likely, a dozen or so ad-hoc solutions, if history is any guide) goes against one of Parrot's basic philosophies: "If everyone's going to reinvent a wheel, we might as well just provide the damn wheel as part of the stock system"
Hence the notification system. One global system that all this funnels through. If you do X, for any one of a myriad of values of X, then all the watching functions for X get called. The nice thing here is that you can have one unified system with a single interface so you don't need to call functions to register some callbacks, subclass classes for other callbacks, and set global symbols for other callbacks. Instead, one big system, one way to deal with it, fewer hassles to worry about.
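The core of such a system is just a registry keyed by (action, target) pairs. Here's a small Python sketch of the shape; the `Notifier` class and its method names are invented for illustration, and a real implementation would route through the event queue rather than calling watchers directly.

```python
from collections import defaultdict

class Notifier:
    """One global registry: watchers subscribe to (action, target) pairs."""
    def __init__(self):
        self.watchers = defaultdict(list)

    def watch(self, action, target, callback):
        """Register callback to fire whenever action happens to target."""
        self.watchers[(action, id(target))].append(callback)

    def notify(self, action, target, *details):
        """Called from the engine's guts when action happens to target."""
        for callback in self.watchers[(action, id(target))]:
            callback(target, *details)

notifier = Notifier()
```

One interface for everything: watching a variable for writes, a class for method additions, or an aggregate for insertions all go through the same `watch` call, which is the whole point.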
Since we're integrating this all together, I'll add that this was all to be done, on the callback side of things, with parrot's event handling system. That is, when a notification happened it just fired off an event (potentially a very high priority event), probably put into the event queue for later processing. Some notifications would be handled immediately, either because they were very high priority (altering object structures, for example), because they could be refused (I can't think of anything in particular here, but there's no reason a callback couldn't decide that some internal action wasn't allowed), or because the action had to wait until the notification was processed (if you were monitoring an object for destruction, say, the object couldn't actually go away until your callback had run) -- so notifications aren't just a set of internal events, but they're pretty darned close.
On the end of the world doing the monitoring there were going to be a number of different means of monitoring, depending on what was getting watched. A lot of it would be done with vtable method overrides, some with a set of permanent monitoring queues, and a handful with special-purpose checking code.
It would've been pretty darned swell. Ah, well, maybe next time.
Extensibility (to an extreme, perhaps) had always been one of the design goals of Parrot. This was on purpose -- if we learned anything from history, it's that people will take whatever you've got, break out the mutagens and gamma ray projectors, and have at it, because there's just no way you can anticipate everyone's needs in the future. So, rather than try and do that (we just looked at their needs in the present) we left a bunch of really big "WEDGE CLEVER THINGS IN HERE" spots in parrot.
Loadable opcode libraries are one of those spots.
A loadable opcode library is, basically, a library of opcode functions which are not built into parrot. The intention was that you could have a bunch of these sitting on disk as part of parrot's library, and load them on demand. (Either explicitly with code or, more likely, have parrot automatically load them for you based on metadata in the bytecode files) This dovetails nicely with the view that most of the opcode functions are just library functions with fast-path calling conventions. It also makes it possible to keep parrot's in-memory footprint as small as possible -- if you don't need the math or networking libraries, for example, you won't load them in, and don't pay the startup cost for them. (And yes, even if they're already in memory for other processes, there is a cost associated with loading them into your process)
What would you use them for?
Well, there were three big use cases.
First, there's the 'ancillary opcode/runtime library' case. The transcendental math ops were in this case -- they'd look like they were in the base set, but they'd only get loaded in if your bytecode used them, otherwise they wouldn't.
Second was the 'extras for languages/alternate bytecode loaders' case. That is, if you were a language compiler writer and you found that there were some operations you needed that were essentially fundamental, you could package them up into an opcode library and make sure code you emitted loaded the library up. (Again, probably using the metadata embedded in the bytecode files) This does require that the libraries be available to whoever gets your bytecode files, but that's not really a big deal -- this isn't going to be too common, and I don't think it's particularly onerous to have to install the, say, Prolog runtime libraries to run programs that have Prolog components. The same thing goes if you're writing an alternate bytecode loader -- it may well be a lot easier for the JVM/.NET/Z-machine bytecode loader to have a full library of JVM/.NET/Z-machine ops in an op library and just use those, instead of recompiling from the source bytecode to parrot's bytecode.
Third was the fast extension function case. This is one where your extension module explicitly declared that a number of its functions were actually opcode functions rather than traditional parrot functions. This was supposed to be fully supported. It wasn't supposed to be the general case, of course, since in general there's too much uncertainty around perl / python / ruby programs to do this as a regular optimization, but if you explicitly declare that a function is fixed at load time and can't be changed, well... that's OK.
Additionally, and very importantly, the list of opcodes was supposed to be per-sub. That is, rather than having one big table of opcode functions, mapping opcode number to function pointers, each subroutine would have its own table. (A table that might be shared, of course -- there's no point in having separate tables if they're all the same) This is a requirement for precompiled bytecode libraries to work, since it'd be really bad if you had a global opcode table but two separate bytecode libraries that each used separate extra opcode function libraries that mapped to the same opcode numbers. (That could be avoided by rewriting the bytecode when we load it, which we don't want to do, or by having a global opfunc registry, which we don't want either. Doing it this way is safest and easiest overall)
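The per-sub table idea is easy to see in miniature. In this hedged Python sketch (all names invented for the example), two op libraries both claim opcode number 0, and nothing collides, because each sub resolves opcode numbers through its own table rather than a global one:

```python
class Sub:
    """Each sub carries its own opcode table (possibly shared)."""
    def __init__(self, bytecode, optable):
        self.bytecode = bytecode  # list of (opnum, operand) pairs
        self.optable = optable    # opnum -> op function

def run(sub):
    """Toy dispatch loop: thread an accumulator through the sub's ops."""
    acc = 0
    for opnum, operand in sub.bytecode:
        acc = sub.optable[opnum](acc, operand)
    return acc

# Two separate op libraries map the same opcode number to different
# functions -- fine, since each sub looks ops up in its own table.
lib_a = {0: lambda acc, x: acc + x}
lib_b = {0: lambda acc, x: acc * 10 + x}
sub_a = Sub([(0, 1), (0, 2)], lib_a)
sub_b = Sub([(0, 1), (0, 2)], lib_b)
```

Identical bytecode, different tables, different results -- which is exactly why a single global table won't do for precompiled bytecode libraries.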
One of the things originally planned for parrot was the capability of overriding the functions attached to most of the opcodes at runtime, lexically scoped. That is, for any particular block (or, more likely, subroutine or method) your code could change the definition of, say, the tangent opcode, or the read opcode.
That sounds silly, doesn't it? I mean, to be able to change, at runtime, the basic components of the interpreter. That's insane!
Or not. First, because you're not allowed to override them all. The basics (any of the ops that can be JITted) are fixed, so you can't go changing how bsr or exit works. Second, remember, as far as parrot is concerned, an opcode is just a low level function with a fixed signature and fast calling scheme. That's it. Nothing at all fancy. "Opcodes" are just a combination of core engine functions and a basic (but extendable) runtime library. (Since you do, after all, want your low-level runtime library functions to be as fast to call as absolutely possible)
Sure, you could, if you want, use the more generic function call scheme to call those functions instead of making them callable using the opcode function mechanism, but that just means that the function calls are slower. (As parrot doesn't have a faster call scheme than the one opcode functions use. Even if you chopped bits out of the current calling conventions there's still more overhead, and you can't get rid of it) Somehow just doesn't make sense to me...
Anyway, on top of private opcode definitions (in those cases where you want to have your own ops) this allows for a lot of instrumentation to be applied. With the exception of that core set that you can't change, everything else is potentially up for grabs, and that means that if you do want to get in the way of how code executes (lexically!) you can. While this certainly isn't something you'd want to do a lot, actually being able to do it can come in handy. (Granted, in those situations you hope you're never in. Alas those are the ones you inevitably end up dealing with)
The cost is that you can't JIT those ops, nor can you inline their bodies in the switch or computed goto cores. This is generally an acceptable cost, since the ops which you do this with are ones that you don't execute often enough for the performance penalty to be offset by the potential utility of overriding the ops. (Especially since this will most often be done with runtime library functions, in which case they're probably not JITtable anyway, and even with the slowdown from the indirect low-level function call it's still faster than a parrot function call)
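Mechanically, a lexically scoped override is just a scope-local op table that shadows the shared one. Here's an illustrative Python sketch -- the names (`with_overrides`, `traced_tan`) and the string-keyed table are inventions for the example, not parrot's machinery -- showing the instrumentation use case: wrapping the tangent op in one scope without touching anyone else's.

```python
import math

# The shared op table everyone starts from.
base_ops = {"tangent": math.tan}

def with_overrides(base, overrides):
    """Build a scope-local op table: the base table plus this scope's overrides."""
    table = dict(base)
    table.update(overrides)
    return table

# Instrument 'tangent' lexically: record every call, then delegate.
calls = []
def traced_tan(x):
    calls.append(x)
    return math.tan(x)

scoped_ops = with_overrides(base_ops, {"tangent": traced_tan})
```

Code dispatching through `scoped_ops` gets the traced version; code using `base_ops` never knows the difference, which is the "lexically!" part.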