[PEAK] DM refactoring and asynchrony (was Re: PEAK and Twisted, revisited)
Phillip J. Eby
pje at telecommunity.com
Wed Apr 21 23:27:09 EDT 2004
At 02:57 AM 4/20/04 -0400, Stephen Waterbury wrote:
>The OO/DM view vs. the FO/EC view seems to me a classical
>dualism -- each of them valid/optimal for different use-cases,
>operating on different views of the same underlying shared
>data/model/state. I'm guessing you agree -- am I right?
Maybe. :)
>Your description of "FO" (of which I confess my ignorance :)
That's because I coined the term. :) I'm using it to refer to the common
elements of ORM (Object Role Modelling, not Object-Relational Mapping) and
FCO-IM (Fully Communication-Oriented Information Modelling).
>both here and further on reminds me strongly of the formalism
>of Description Logics, which provides the rigorous foundation
>for the W3C Semantic Web (SW) experiments, RDFS, OWL, triples,
>etc. This connection is of great interest to me, and I
>think some of the recent work in those areas provide some
>potential ingredients for defining and managing FO domains.
Formally, I would say that FO is roughly equivalent to the "domain
relational calculus", except that certain constraints defined by ORM and
FCO-IM require recursion, which I don't believe is technically possible in
the domain relational calculus.
Anyway, this is the same sort of first-order predicate logic that is the
basis of relational database theory, and logic languages like Prolog.
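To make the recursion point concrete: checking something like an
acyclicity ("ring") constraint over a binary fact type means computing a
transitive closure, i.e. iterating to a fixpoint. A rough Python
illustration (the sets-of-tuples fact representation is just for this
sketch, not anything in PEAK):

    # Deriving "ancestor of" facts from "parent of" facts needs a fixpoint
    # (recursive) step; a single domain-relational-calculus expression
    # can't express this.  Plain sets of tuples stand in for fact
    # populations here.

    def transitive_closure(pairs):
        """Return the transitive closure of a set of (x, y) binary facts."""
        closure = set(pairs)
        while True:
            new = set(
                (x, z)
                for (x, y1) in closure
                for (y2, z) in closure
                if y1 == y2
            )
            if new <= closure:
                return closure
            closure |= new

    parent_of = set([("Alice", "Bob"), ("Bob", "Carol")])
    print(transitive_closure(parent_of))
    # includes ('Alice', 'Carol'), derived transitively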
>I'd like to help work on synthesizing the OO and FO approaches
>at the MOF level -- of course, that too will require a lot of
>work to make it really useful, and it's best to do it while
>prototyping more concrete implementations at the level I think
>you're talking about, and which I am doing somewhat more
>primitively in my app.
FYI, there's a group doing a business rules submission to OMG that's based
on a FO metamodel, and it includes a self-hosting metametamodel. I haven't
read enough of it to really understand it, though. Personally, I'm not
sure how relevant a direct MOF mapping will be. (Note that FO encompasses
n-ary relationships, whereas MOF (at least in the 1.x series) encompasses
only binary ones. Also, FO requires a "reference scheme" for object types,
which would have to be represented as tagged values in the MOF. Indeed,
there are probably quite a few things that would need to be expressed as
tagged values in a MOF mapping.)
>I agree. In my application, I'm including a "Triples" (a la OWL
>and SW -- read "facts" ;) table in which domain objects known
>to my application participate in facts (triples) with
>properties that could be defined either locally or remotely,
>and at all levels, and Triples of which either the subject or
>predicate is a domain object could be joined with domain object
>tables to provide facts about the domain objects.
Keep in mind that FO isn't limited to triples; indeed it's both possible
and useful to have unary and binary fact types as well, or even quaternary
fact types.
Of course, technically one can emulate all those things with ternary or
even binary fact types, as long as one "objectifies nested fact types", but
this isn't the best way to deal with things at the conceptual level if
humans don't think of the nested fact type as an object.
For example, if you have a reservations system, you might reserve a
resource at a time, for a person. You can model this as a single ternary
fact type, "Resource is reserved for Person at Time", *or* as three binary
fact types, e.g.: "Reservation reserves Resource", "Reservation is for
Person", and "Reservation is at time". Which way is best at the
*conceptual* level depends entirely on whether people want to be able to
have a reservation number and track it that way. That is, whether the
reservation is really an object in their mind.
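To make the two shapes concrete, here's a rough sketch with plain tuples
standing in for fact instances (nothing here is PEAK API):

    # Option 1: one ternary fact type,
    #   "Resource is reserved for Person at Time"
    reserved = set([("Car #27", "Customer #55", "12:15pm tomorrow")])

    # Option 2: objectify the reservation and use three binary fact types:
    #   "Reservation reserves Resource", "Reservation is for Person",
    #   "Reservation is at Time"
    reserves = set([("Reservation #1001", "Car #27")])
    is_for   = set([("Reservation #1001", "Customer #55")])
    is_at    = set([("Reservation #1001", "12:15pm tomorrow")])

    # Both record the same information; option 2 only earns its keep if
    # users actually think of "Reservation #1001" as a thing they can
    # refer to, e.g. a reservation number they can quote over the phone.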
>I haven't done an implementation yet, but my idea is to
>allow the Triples to span any number of metalevels,
>as I think you are also proposing.
I don't think so. What I'm doing is statically typed, and the "object
holes" in a fact type are for values such as strings, numbers, etc, which
identify objects of specific types (albeit polymorphically).
>Of course, this means
>that not all sets of facts would be "decidable" in a formal
>sense, but my application wouldn't be trying to find *all*
>"entailments" (inferences), anyway. Also, I suppose it might
>be possible to define meta-layers within which views would be
>decidable, if desired ... ;)
I'm not anywhere near that ambitious. Think "relational DB with a more
convenient interface" and you'll be closer to my goals. :)
>BTW, I think your "'lingua franca' between different levels of a
>system" is the same concept as the triples/facts across any number
>of meta-levels to which I alluded above. Right?
No. I mainly meant an intermediate format between an OO perspective
and a relational one. Or rather, an authoritative conceptual schema from
which the relational and OO perspectives are mapped.
>>The hard part of the design, that I'm chewing on in my spare time, is how
>>to create a Pythonic FO notation, that won't end up looking like Prolog
>>or Lisp! ...
>
>You may have already seen this, but I like "N3" notation as a
>clean lexical format for triples (sort of the triples "rst". ;)
>( See http://www.w3.org/DesignIssues/Notation3.html )
>There are some Python N3 parsers, including Dan Connolly's cwm.
>N3 might not be the basis for the ideal Pythonic FO notation,
>but it might provide some ideas. A common Python API for the
>extent of the shared semantics between FO and N3/OWL would be
>very desirable from my point of view.
It's not a notation for facts that I need, but a notation for describing
fact types, constraints, derivation rules, and so on. In any event, fact
types may be unary or binary, with most of an application's fact types
usually being binary.
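Purely as a strawman of what I mean by "describing fact types and
constraints, not facts" (none of these names exist anywhere; the stubs
are defined inline just so the sketch runs):

    # Strawman only: declaring *fact types* and constraints, not facts.
    # The classes below are throwaway stubs invented for this sketch.

    class ObjectType:
        def __init__(self, name, reference_scheme):
            self.name = name
            self.reference_scheme = reference_scheme

    class FactType:
        def __init__(self, verbalization, *roles):
            self.verbalization = verbalization
            self.roles = roles
            self.constraints = []

        def unique_over(self, *roles):
            # "at most one fact per combination of these roles"
            self.constraints.append(("unique", roles))
            return self

    Car    = ObjectType("Car", reference_scheme="CarNr")
    Person = ObjectType("Person", reference_scheme="PersonNr")

    # Unary fact type: "Car is available"
    is_available = FactType("%s is available", Car)

    # Binary fact type: "Car is booked to Person"; each Car booked at most once
    is_booked_to = FactType("%s is booked to %s", Car, Person).unique_over(Car)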
>>But, unlike the peak.query of today, fact retrieval will be able to be
>>asynchronous. That is, you'll be able to "subscribe" to a query, or
>>yield a task's execution until a new fact is asserted. Even if your
>>application isn't doing event-driven I/O or using a reactor loop, you
>>could use these subscriptions to e.g. automatically raise an error when a
>>constraint is violated. (In practice, the EditingContext will do this
>>when you ask that it commit your changes to its parent EditingContext, if
>>any.) If you're writing a non-web GUI, you'd more likely subscribe to
>>such events in order to display status bar text or highlight an input error.
>
>Sure, but this stuff is just "framework sugar".
No, it's actually pretty critical. To display meaningful messages to an
end user, it's not sufficient to wait for the relational DB to throw an
integrity error. The issue needs to be presentable to the end user,
whether it's a GUI or a web app. And, many applications need other
"rule-generated messages" to appear in an advisory capacity (i.e. just
warn, not prevent commit) as well.
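As a toy illustration of what "rule-generated messages" means in
practice (the registry below is invented for the example; it is not the
planned EditingContext or peak.query interface):

    # Toy sketch: callers subscribe for rule-generated messages, and a
    # rule produces something presentable instead of a raw DB integrity
    # error.  This registry is invented for illustration only.

    class RuleMessages:
        def __init__(self):
            self.subscribers = []

        def subscribe(self, callback):
            self.subscribers.append(callback)

        def publish(self, severity, message):
            for callback in self.subscribers:
                callback(severity, message)

    messages = RuleMessages()

    # A GUI might highlight a field; a web app might render the text inline.
    def show(severity, message):
        print("[%s] %s" % (severity, message))

    messages.subscribe(show)

    # Advisory rule: warn, but don't prevent the commit
    messages.publish("warning", "Reservation overlaps an existing booking")

    # Hard constraint: surfaced the same way, before the DB ever sees it
    messages.publish("error", "Customer #55 has no payment method on file")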
>Well, they *are* meta in a DL sense, but it's only an important distinction
>if you are concerned about the "decidability" of collections of facts,
>which I think is only a problem if you're doing some aggressive
>inferencing that's trying to compute some significant fraction of "all"
>entailments of the set, as opposed to just trying to verify some rules.
Right, as long as there are no subscription/derivation cycles, decidability
(aka algorithm termination) is not an issue. I may try to make it
physically impossible to create such a cycle.
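One cheap way to do that would be to check the derivation dependency
graph whenever a rule is registered; a sketch of the idea (not PEAK
code):

    # Refuse cycles up front: when a new "derived from" edge is
    # registered, reject it if the source can already be reached from the
    # target.  Illustration only.

    class DerivationGraph:
        def __init__(self):
            self.depends_on = {}   # fact type name -> set of names it derives from

        def _reaches(self, start, goal):
            stack, seen = [start], set()
            while stack:
                node = stack.pop()
                if node == goal:
                    return True
                if node not in seen:
                    seen.add(node)
                    stack.extend(self.depends_on.get(node, ()))
            return False

        def add_rule(self, derived, source):
            if self._reaches(source, derived):
                raise ValueError(
                    "%r <- %r would create a derivation cycle" % (derived, source))
            self.depends_on.setdefault(derived, set()).add(source)

    g = DerivationGraph()
    g.add_rule("overdue_invoice", "invoice")
    g.add_rule("credit_hold", "overdue_invoice")
    # g.add_rule("invoice", "credit_hold")  # would raise: cycle back to "invoice"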
>With facts represented as triples, the join would be on subject
>or object id. In OWL, that would be a URI, but I plan to generalize
>that to include any suitable name. (Some of them would only have
>local significance.)
Again, this doesn't have anything to do with triples. Triples as you know
them can be modelled in this space, and in principle you can implement an
FO model with only triples, but in practice that would be a hideous
misuse of a database and would provide no mapping flexibility.
A given FO fact type can be mapped to any of a large number of relational
representations, which is important because such representations may need
to be used, either for speed or for backward-compatibility with existing
applications. However, the "conceptual facts" expressed do not change, and
thus program code is protected from knowing such implementation details.
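For a concrete (if simplified) picture of that flexibility: the same
binary fact type "Customer has EmailAddress" could be stored as a
nullable column on the customer table, or as a separate two-column
table, while code keeps asking the same conceptual question. The table
names and sqlite usage below are invented for the example:

    # One conceptual fact type, "Customer has EmailAddress", mapped to
    # two different relational layouts.  Application code asks the same
    # conceptual question either way; only the mapping differs.

    import sqlite3

    # Layout A: a nullable column on the customer table
    layout_a = """
        CREATE TABLE customer (customer_nr INTEGER PRIMARY KEY, email TEXT);
        INSERT INTO customer VALUES (55, 'c55@example.com');
    """
    query_a = "SELECT email FROM customer WHERE customer_nr = ?"

    # Layout B: a separate two-column table (say, a legacy schema)
    layout_b = """
        CREATE TABLE customer (customer_nr INTEGER PRIMARY KEY);
        CREATE TABLE customer_email (customer_nr INTEGER, email TEXT);
        INSERT INTO customer VALUES (55);
        INSERT INTO customer_email VALUES (55, 'c55@example.com');
    """
    query_b = "SELECT email FROM customer_email WHERE customer_nr = ?"

    def email_of(customer_nr, schema, query):
        """The conceptual question: which EmailAddress does this Customer have?"""
        db = sqlite3.connect(":memory:")
        db.executescript(schema)
        return db.execute(query, (customer_nr,)).fetchone()

    print(email_of(55, layout_a, query_a))   # ('c55@example.com',)
    print(email_of(55, layout_b, query_b))   # ('c55@example.com',)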
>>... Hm. This is sounding simpler than I thought it would be, at least in
>>principle. But I still need to:
>>* Work out the metamodel for fact types, how to map from constraints to
>>derived fact types that signal the violation of those constraints, and a
>>Pythonic notation for defining a fact-oriented domain model. (For
>>example, to define a constraint on a fact type or set of fact types, one
>>must have a way to reference those types in code.)
>
>I would think at the MOF level facts and their types would be
>structurally isomorphic to objects.
No. "Car #27 is booked to Customer #55 at 12:15pm tomorrow" is a
fact. "Car #27", "Customer #55", and "12:15pm tomorrow" are all references
to objects. In FO, objects do not have attributes. Rather, we record
facts *about objects*.
The correct mapping for a binary fact type in a MOF model is either an
association or an attribute (e.g., when one of the object types is
conceptually a value, such as a timestamp, count, or name). Classes
would also have an attribute designated as their reference scheme (e.g.
CarNr, CustomerNr). Unary fact types would map to boolean attributes.
Ternary and higher-arity fact types don't have a MOF mapping without
extending to higher-arity associations.
(You'll see here that the model of n-ary fact types is thus considerably
more uniform and general than MOF, since it takes so many different MOF
concepts to represent what's just an arity in the FO view. That's what
makes it so interesting to me, along with the very comprehensive model of
what kinds of constraints are useful in real-world modelling, and
techniques that come from the ORM and FCO-IM methodologies.)
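For contrast, here is the same small domain expressed the OO way; what
FO treats uniformly as fact types of various arities turns into a mix of
constructs. The classes are invented purely for illustration:

    # What FO models uniformly as fact types of arity 1, 2, 3... becomes
    # a mixture of constructs: a reference-scheme attribute, a boolean
    # attribute (unary fact type), an association end (binary fact type).

    class Customer:
        def __init__(self, customer_nr):
            self.customer_nr = customer_nr    # reference scheme: CustomerNr

    class Car:
        def __init__(self, car_nr):
            self.car_nr = car_nr              # reference scheme: CarNr
            self.is_available = True          # unary fact type -> boolean attribute
            self.booked_to = None             # binary fact type -> association end

    car = Car(27)
    car.booked_to = Customer(55)              # "Car 27 is booked to Customer 55"

    # A ternary fact type ("Car is reserved for Customer at Time") has no
    # direct home here; it would have to be objectified into a Reservation
    # class -- exactly the extra, non-uniform step the FO view avoids.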
One thought that does occur to me, however, is that if I simply added n-ary
associations to peak.model, I might be able to have my object cake and eat
facts too. :) The main thing I'm concerned about is that the current
peak.model implementation would need a major overhaul to do something like
that. Probably I'll have to start the "new style" model in another
package, and then replace the old one later, in sort of the way that Zope 3
technology is continually being backported into Zope 2.