[TransWarp] Requirements for the future peak.messaging package

Mon Aug 5 09:30:39 EDT 2002

At 10:39 AM 8/5/02 +0200, Ulrich Eck wrote:
>Hi Phillip,
>
>I've been working on a Messaging system for our Application that tries
>to implement Inter-Object-Messaging described in the book "Building
>Business Objects" aka Serial-Reusability.
>
>I used Pyro as Transport, but the implementation does not depend on
>it.
>
>There are synchronous and asynchronous messages, that are sent
>to a local queue-manager. the qm resolves the receiver via some
>kind of namingservice-interface (should be peak.naming ... but we're
>still working with Transwarp till the first peak release shows up)
>and sends the messages to the process where the receiving object is hosted.
>the receiving queue-manager receives the message into its incoming
>queue and spawns a thread for message-handling.

Note that for peak.messaging, handling of messages is not its job.  It will 
be the responsibility of the receiver to request/extract messages from a 
queue on a poll/pull-type basis.  This is because the primary purpose of 
message queues in PEAK is to allow "temporal load balancing" by spreading 
out potentially "bursty" processing loads over longer periods of 
time.  (You'll notice that low-latency request/response messaging was down 
at the "helpful" priority.)
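
A rough sketch of the pull model, assuming an in-memory deque as a stand-in for a real queue.  The name drain() is hypothetical and is not part of any PEAK API; the point is only that the receiver, not the queue, decides when and how much to process:

```python
from collections import deque


def drain(queue, batch_size):
    """Pull at most batch_size messages; the receiver chooses when to call this."""
    batch = []
    while queue and len(batch) < batch_size:
        batch.append(queue.popleft())
    return batch


# A burst of ten messages arrives at once...
backlog = deque(range(10))

# ...but the receiver works through it in bounded batches at its own pace.
first_batch = drain(backlog, 4)
```

Calling drain() on a timer or from an idle loop spreads a burst over as many batches as it takes, which is the "temporal load balancing" idea in miniature.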

>I'd like to share thoughts and work on this, because we need this
>feature for our work asap.
>
>Here are a few questions:
>- Have you thought about ObjectNaming .. e.g. what does an object key
>need to have and how is it resolved ??

Er, that depends on what you need.  peak.naming is designed to let you 
create your own naming providers, along a standardized interface, with 
mechanisms for translating objects to states and vice versa.  I'm not sure 
what you mean here by "object key", either.

>- Are all messages considered as remote .. or do you want to
>distinguish between local and remote invocation ??

Define "remote".  :)  But if you mean, will there be any optimizations for 
in-process messaging, then no, I don't see any at present.  The API will 
not distinguish between message destinations on the same machine or a 
different machine; it'll still be the same API.

> >* CRITICAL: Asynchronous "fire-and-forget" messaging; programs that produce
> >messages in volume are implicitly very busy and should not be made to wait
> >around for the messaging system to connect, etc.  This can be accomplished
> >either via socket to a local daemon, or via filesystem queues processed by
> >a daemon or thread.
>
>there needs to be queue persistence to ensure message-delivery on
>process failure i think.

Yep.  The default implementation of this will be filesystem queue 
injection, similar to Qmail "maildirs".  We already have an implementation 
of this in some of our older libraries that works well and is transactional.
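
For illustration, a minimal sketch of maildir-style injection, assuming a queue directory with tmp/ and new/ subdirectories.  This is not the older library code mentioned above; make_queue(), inject() and pending() are hypothetical names:

```python
import os
import tempfile
import uuid


def make_queue(root=None):
    """Create tmp/ and new/ under root (a scratch directory if root is None)."""
    if root is None:
        root = tempfile.mkdtemp()
    for sub in ("tmp", "new"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    return root


def inject(root, payload):
    """Write payload (bytes) into tmp/, then rename it into new/."""
    name = uuid.uuid4().hex  # unique name, so writers never collide
    tmp_path = os.path.join(root, "tmp", name)
    new_path = os.path.join(root, "new", name)
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before publishing
    # On POSIX, rename within one filesystem is atomic: readers of new/
    # see either the whole message or nothing at all.
    os.rename(tmp_path, new_path)
    return name


def pending(root):
    """Names of fully written messages waiting for a receiver."""
    return sorted(os.listdir(os.path.join(root, "new")))
```

The sender returns as soon as the rename completes; a daemon or thread can pick the file up from new/ later, and a crash in mid-write leaves only a stray file in tmp/.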

>synchronous messaging should be possible as well i think .. then you
>can replace all remote invocation with only one communication-layer
>for inter-object/process/host communication.

Synchronous messaging is a non-requirement for the framework, 
IMO.  Synchronous just means that you wait for your response.

In general, our needs for "remote invocation" are low, which is why 
request/response messaging is low on our own priority list for actual 
implementations.  But we should be able to design the APIs such that you 
can incorporate your mechanisms for such messaging into the framework.  We 
just don't plan to implement synchronous or request/response mechanisms at 
first; our needs for asynchrony and transactions are much more important.

> >* CRITICAL: Transaction integration for injecting or removing messages from
> >a queue.  When a transaction is committed, all messages that were
> >provisionally sent should be "really" sent, and all messages that were
> >retrieved should be actually removed from the queue they came from.
>
>could be tricky to implement ...

Not at all.  We do this now in other code.  If you're using a filesystem or 
SQL-based message queue, as we do, it's almost trivial.
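
A sketch of the commit/abort semantics described in the requirement, using an in-memory list as a stand-in for the filesystem or SQL storage.  The class and method names are hypothetical, and a real implementation would take part in the framework's transaction machinery rather than expose commit() itself:

```python
class TransactionalQueue:
    """In-memory stand-in for a filesystem or SQL queue (hypothetical API)."""

    def __init__(self):
        self._committed = []  # messages visible to receivers
        self._outgoing = []   # sent in this transaction, not yet visible
        self._taken = []      # retrieved in this transaction, not yet removed

    def send(self, msg):
        """Provisionally send msg; it becomes visible only on commit()."""
        self._outgoing.append(msg)

    def receive(self):
        """Provisionally take the oldest committed message, or None."""
        if not self._committed:
            return None
        msg = self._committed.pop(0)
        self._taken.append(msg)
        return msg

    def commit(self):
        """Really send what was sent; really remove what was retrieved."""
        self._committed.extend(self._outgoing)
        self._outgoing.clear()
        self._taken.clear()

    def abort(self):
        """Drop provisional sends and put retrieved messages back in order."""
        self._committed[:0] = self._taken
        self._outgoing.clear()
        self._taken.clear()
```

With an SQL-backed queue the same effect falls out of the database transaction; with a filesystem queue, commit amounts to a handful of renames and deletes.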

> >* CRITICAL: Strong multi-processing support; queueing mechanisms should
> >support simultaneous injection or retrieval by multiple processes on
> >multiple machines.
>
>this sounds like an event-service ..

This is just the requirement that queues be designed for multi-user 
access.  If I have three web server processes injecting messages and five 
processing daemons pulling messages out, they shouldn't stomp on each other 
or access uncommitted data, etc.  That's all I meant.
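
One way this could work for a filesystem queue, sketched under the assumption that messages are files in a new/ directory and that each worker claims one by renaming it into a work directory.  claim_next() is a hypothetical name, and queues shared between machines over a network filesystem would need more care than this shows:

```python
import os


def claim_next(new_dir, work_dir):
    """Try to claim one message file; return its claimed path, or None."""
    for name in sorted(os.listdir(new_dir)):
        src = os.path.join(new_dir, name)
        dst = os.path.join(work_dir, f"{name}.{os.getpid()}")
        try:
            # Only one process can win this rename; the rest see the
            # source file vanish and move on to the next candidate.
            os.rename(src, dst)
        except FileNotFoundError:
            continue
        return dst
    return None
```

Because a message only appears in new/ once it is fully written, and leaves it in a single rename, workers never see half-written data and never process the same message twice.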

> >* CRITICAL: Messages must be secure from being read in transit by
> >unauthorized parties, and it should not be possible for unauthorized
> >parties to inject forged messages into the system.  (For filesystem-based
> >transit mechanisms, it suffices for adequate permissions to exist, and
> >private-key encryption and message digest mechanisms should suffice for
> >network transit.)
>
>we need this feature as well .. and we developed a ssl-transport for
>pyro that is included in v3.0b1. there is a connection-checker, that
>only accepts connections, that use signed certificates from a well
>known issuer.

That might be handy.  The requirement, however, explicitly doesn't need SSL 
or other public-key crypto.  Private-key crypto is sufficient as long as 
you have a secure way to share keys.  For example, rsync-over-ssh to 
deposit a file of private keys on servers, or keys shared in an SQL 
database that has a secure connection protocol.  It's crude, but adequate 
for clustering application servers and associated processes.  I don't think 
I'd want to use it for looser federations or inter-business messaging, but 
more advanced things can always be implemented later within the same framework.
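
A sketch of the message-digest half of this, assuming both ends already hold the same secret key (distributed by whatever secure means, as above).  It uses HMAC-SHA256 from the Python standard library; sign() and verify() are hypothetical names.  It guards against forged or altered messages only; keeping the contents secret in transit would need a cipher on top, which is not shown here:

```python
import hashlib
import hmac

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign(key, payload):
    """Prefix payload (bytes) with an HMAC-SHA256 tag computed under key."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify(key, blob):
    """Return the payload if its tag checks out, else raise ValueError."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through timing differences
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message failed authentication")
    return payload
```

A receiver that calls verify() before doing anything else with a message will reject anything injected by a party that does not hold the shared key.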

Note that the requirements I'm laying out are mostly intended to 
select/limit what initial messaging implementations will be included in 
PEAK, based on the business-related needs Ty and I have for the framework.

>what are your plans to go ahead ?? .. have you thought about a basic
>layout yet ??

Nope; just exploring the territory at this point.  I'd like to have fewer 
objects to manage than JMS does to accomplish these things, but it's not 
yet clear if we will.

One thought that has occurred to me is that it may be better to implement 
a transactional, Linda-style tuplespace (or Jini-style "JavaSpace") API 
instead of a messaging API as such.  Everything I've laid out as 
requirements can be cleanly accomplished within that concept, and our 
existing implementations of filesystem- and SQL-based queues could be 
extended to such an API in a straightforward manner.  But the model allows 
for much more powerful distributed/multi-process algorithms than a pure 
"messaging" model does.
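
To make the tuplespace idea concrete, here is a toy in-memory sketch.  The class and its put/read/take methods are hypothetical names loosely following Linda's out/rd/in operations; it is non-blocking, single-process and has no transaction support, so it shows only the matching model, not a design:

```python
class TupleSpace:
    """Toy tuplespace: None in a pattern matches any value in that position."""

    def __init__(self):
        self._tuples = []

    def put(self, tup):
        """Add a tuple to the space."""
        self._tuples.append(tuple(tup))

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def read(self, pattern):
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def take(self, pattern):
        """Remove and return the first matching tuple, or None."""
        tup = self.read(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup
```

A plain message queue is then just the special case where every receiver calls take() with a pattern naming its own destination, e.g. take(("billing", None)).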

>what do you think about using pyro for this issue ??

It's been a while since I looked at it last; lack of security bothered me, 
but if it now has SSL, it might be worth looking at again.

>I have searched the net and looked at most implementations of
>messaging-systems that are usable from python ... there are only
>a few pure-python things that are mostly not up to date.

I presume you're talking about distributed-object systems; apart from Elvin 
and Jabber, I really haven't seen *anything* for Python that I would call a 
"messaging system", at least not that I can recall at the moment.

If you're looking for pure-Python distributed objects, Fnorb might be worth 
checking out, however; it's now open-source.  I don't think it includes 
transport security, though.

>spread and elvin don't support pickling of objects and need compiled
>libraries. other messaging systems exist mostly written in java ..

Elvin's license doesn't allow commercial use, so it's pretty much worthless 
to me.  Spread is of great interest, however, because its ability to 
guarantee a message ordering among participants makes it perfect for 
implementing things like distributed lock managers.  This weekend I wrote a 
prototype distributed lock manager using the Spread module for Python; it 
was very straightforward to do so.  It would probably be equally useful for 
implementing a distributed tuplespace.  The lack of pickling isn't very 
important, though, as I intend for peak.messaging to support some kind of 
utility lookups for an encoding/decoding stack, to do things like zlib-ing, 
encryption, base64 encoding, or whatever else is needed to translate 
between message objects and byte streams for transport.
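
A sketch of what such an encoding/decoding stack might look like, assuming each layer is simply an (encode, decode) pair of functions.  How the layers would be looked up as utilities is left out, and the names below are hypothetical:

```python
import base64
import pickle
import zlib

# Each layer is an (encode, decode) pair.  The first layer turns an object
# into bytes; the remaining layers map bytes to bytes.
PICKLE = (pickle.dumps, pickle.loads)
ZLIB = (zlib.compress, zlib.decompress)
BASE64 = (base64.b64encode, base64.b64decode)


def encode(stack, obj):
    """Apply each layer's encoder in order, first layer first."""
    for enc, _dec in stack:
        obj = enc(obj)
    return obj


def decode(stack, data):
    """Apply each layer's decoder in reverse order, last layer first."""
    for _enc, dec in reversed(stack):
        data = dec(data)
    return data


stack = [PICKLE, ZLIB, BASE64]
wire = encode(stack, {"to": "billing", "body": [1, 2, 3]})
```

With this arrangement a transport like Spread only ever sees byte strings, and pickling becomes one optional layer among several.  Unpickling data from an untrusted sender is unsafe, so an authentication layer would need to come off before the pickle layer is reached.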

>p.s. I've read the first pages of your tutorial ... sounds good (and
>looks like you're coming ahead on your way to a peak release) ...

Yeah, finishing the tutorial is really the only thing holding up a release 
at this point.  Of course, parts of the API or implementation may still 
change as I write the tutorial, if I find that something is hard to explain 
and should be made clearer by changing the code.