[PEAK] DM refactoring and asynchrony (was Re: PEAK and Twisted, revisited)

Thu Apr 22 03:48:32 EDT 2004

Phillip J. Eby wrote:
> At 02:57 AM 4/20/04 -0400, Stephen Waterbury wrote:
> 
>> The OO/DM view vs. the FO/EC view seems to me a classical
>> dualism -- each of them valid/optimal for different use-cases,
>> operating on different views of the same underlying shared
>> data/model/state.  I'm guessing you agree -- am I right?
> 
> Maybe.  :)

Well, it made for a fun conversation anyway.  ;)

>> Your description of "FO" ... reminds me strongly of the formalism
>> of Description Logics ... [blah blah] ...
> 
> Formally, I would say that FO is roughly equivalent to the "domain 
> relational calculus", except that certain constraints defined by ORM and 
> FCO-IM require recursion, which I don't believe is technically possible 
> in the domain relational calculus.

That's intriguing -- I have no idea.

> Anyway, this is the same sort of first-order predicate logic that is the 
> basis of relational database theory, and logic languages like Prolog.

And also the basis of Description Logics.  A google on domain
relational calculus and description logics turned up this
paper:  http://www.upriss.org.uk/papers/dl99.pdf  ... which (upon a
quick browsing) looks like it has some interesting insights on
their relationship.  I'll have to read it when I'm awake.  :)

>> I'd like to help work on synthesizing the OO and FO approaches
>> at the MOF level ....
> 
> FYI, there's a group doing a business rules submission to OMG that's 
> based on a FO metamodel, and it includes a self-hosting metametamodel.

I'll have to look at that.  Wonder if it's related to the
"RuleML" stuff.

> I haven't read enough of it to really understand it, though.  
> Personally, I'm not sure how relevant a direct MOF mapping will be.

Me neither.  Actually, when I say "MOF", I confess to using the
term rather loosely -- I really just mean some kind of Super
Meta Facility (probably more general than the OMG's MOF ;).
[warning:  danger of high winds due to excessive hand-waving ...]

> (Note that FO encompasses n-ary relationships, whereas MOF (at least in 
> the 1.x series) encompasses only binary ones.  Also, FO requires a 
> "reference scheme" for object types, which would have to be represented 
> as tagged values in the MOF.  Indeed, there are probably quite a few 
> things that would need to be expressed as tagged values in a MOF mapping.)

Full disclosure:  I'm still learning about MOF, whereas you've
implemented it.  And my experiments in that area will be more
hackish, I guarantee.  :)

>> ...  In my application, I'm including a "Triples" (a la OWL
>> and SW -- read "facts" ;) table in which domain objects known
>> to my application participate in facts (triples) with
>> properties that could be defined either locally or remotely,
>> and at all levels, and Triples of which either the subject or
>> predicate is a domain object could be joined with domain object
>> tables to provide facts about the domain objects.
> 
> Keep in mind that FO isn't limited to triples; indeed it's both possible 
> and useful to have unary and binary fact types as well, or even 
> quaternary fact types.

My understanding of "triples" (which are essentially just propositions)
is that they can be used either singly or in combinations to represent
any fact (but not rules, unless one or more of the elements in the
triple is/are allowed to be variable[s]).

> Of course, technically one can emulate all those things with ternary or 
> even binary fact types, as long as one "objectifies nested fact types", 
> but this isn't the best way to deal with things at the conceptual level 
> if humans don't think of the nested fact type as an object.

True.  I wouldn't suggest the triples representation is always
the most human-comprehensible representation, and certainly not
the most computationally optimal one either -- I regard it as
the extreme of flexibility at the cost of performance.  In
general, very verbose compared to an n-ary representation.

> For example, if you have a reservations system, you might reserve a 
> resource at a time, for a person.  You can model this as a single 
> ternary fact type, "Resource is reserved for Person at Time", *or* as 
> three binary fact types, e.g.:  "Reservation reserves Resource", 
> "Reservation is for Person", and "Reservation is at time".

Actually, this confirms what I had suspected:  what
you are calling binary fact types are what in semantic web
parlance are referred to as triples (subject-verb-object,
aka subject-predicate-object, etc.), although the particulars
of how they map depend on how you choose your verbs and
nouns.  Some subtle flavorings occur in translation of
natural language to formal logic.  :)
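
A minimal sketch of the two modelings in the reservation example,
to make the mapping concrete.  The names (Reserved, objectify, nest,
the "reserves"/"is_for"/"is_at" predicates, the "reservation:1" id)
are all invented for illustration and are not PEAK or OWL API:

```python
from __future__ import annotations
from typing import NamedTuple

# Ternary fact type: "Resource is reserved for Person at Time".
class Reserved(NamedTuple):
    resource: str
    person: str
    time: str

def objectify(fact: Reserved, reservation_id: str) -> set[tuple[str, str, str]]:
    """Split one ternary fact into three binary facts (triples) about
    a reservation object named by reservation_id (a hypothetical id)."""
    return {
        (reservation_id, "reserves", fact.resource),
        (reservation_id, "is_for", fact.person),
        (reservation_id, "is_at", fact.time),
    }

def nest(triples, reservation_id: str) -> Reserved:
    """Rebuild the ternary fact from the binary facts about one reservation."""
    by_pred = {p: o for s, p, o in triples if s == reservation_id}
    return Reserved(by_pred["reserves"], by_pred["is_for"], by_pred["is_at"])

fact = Reserved("Room 101", "Steve", "2004-04-22T10:00")
triples = objectify(fact, "reservation:1")
assert nest(triples, "reservation:1") == fact  # nothing is lost either way
```

The round trip only works because the reservation was given an
identity of its own, which is the "objectifying" step described above.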

In the classical OWL ontology example, the triples defining a
vineyards ontology ("vin") are (in N3 notation):

vin:ProductionArea rdf:type rdfs:Class.
vin:Country rdfs:subClassOf vin:ProductionArea.
vin:Region rdfs:subClassOf vin:ProductionArea.
vin:Vineyard rdfs:subClassOf vin:ProductionArea.

... which I think are what you would call binary facts.
(Well, meta-facts in this case, but the same structure
applies at the instance level.)
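
As a rough illustration, the four triples above can be held as plain
tuples, and a triple containing a variable can be matched against
them.  This is only the pattern-matching half of a rule, and Var,
match and query are hypothetical helpers, not part of any library:

```python
class Var:
    """A named placeholder that may stand for any term in a triple."""
    def __init__(self, name):
        self.name = name

FACTS = [
    ("vin:ProductionArea", "rdf:type", "rdfs:Class"),
    ("vin:Country", "rdfs:subClassOf", "vin:ProductionArea"),
    ("vin:Region", "rdfs:subClassOf", "vin:ProductionArea"),
    ("vin:Vineyard", "rdfs:subClassOf", "vin:ProductionArea"),
]

def match(pattern, fact):
    """Return variable bindings if `fact` fits `pattern`, else None."""
    bindings = {}
    for want, have in zip(pattern, fact):
        if isinstance(want, Var):
            # A variable used twice must bind to the same term both times.
            if bindings.setdefault(want.name, have) != have:
                return None
        elif want != have:
            return None
    return bindings

def query(pattern, facts=FACTS):
    """Collect the bindings for every fact that fits the pattern."""
    return [b for b in (match(pattern, f) for f in facts) if b is not None]

# "Which x are subclasses of vin:ProductionArea?"
subclasses = query((Var("x"), "rdfs:subClassOf", "vin:ProductionArea"))
print([b["x"] for b in subclasses])
```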

> Which way 
> is best at the *conceptual* level depends entirely on whether people 
> want to be able to have a reservation number and track it that way.  
> That is, whether the reservation is really an object in their mind.

Exactly.

>> Of course, this means
>> that not all sets of facts would be "decidable" in a formal
>> sense, but my application wouldn't be trying to find *all*
>> "entailments" (inferences), anyway.  Also, I suppose it might
>> be possible to define meta-layers within which views would be
>> decidable, if desired ... ;)
> 
> I'm not anywhere near that ambitious.  Think "relational DB with a more 
> convenient interface" and you'll be closer to my goals.  :)

Well I *did* get carried away there.  ;)  As I say, decidability
issues are out of scope for my current application.  But I am
interested in using triples and meta-triples to extend my domain
objects and enable "late binding" support for the extensions in
the database without altering its SQL schema.  Of course, that may
be quite insane.  Time will tell.  :)
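
Here is a small sketch of what I mean, using sqlite3 from the Python
standard library.  The table and column names ("part", "triples",
"ext:finish") are made up for the example; the point is only that a
new property lands in the triples table with no ALTER TABLE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT);
""")
conn.execute("INSERT INTO part VALUES ('part:1', 'bracket')")
# A late-bound property: recorded as a triple, the schema is untouched.
conn.execute("INSERT INTO triples VALUES ('part:1', 'ext:finish', 'anodized')")

def facts_about(conn, part_id):
    """Return the fixed columns plus any extension facts for one part."""
    rows = conn.execute(
        """SELECT p.name, t.predicate, t.object
           FROM part AS p
           LEFT JOIN triples AS t ON t.subject = p.id
           WHERE p.id = ?""",
        (part_id,),
    ).fetchall()
    if not rows:
        return None
    # The LEFT JOIN yields a NULL predicate when a part has no extensions.
    extensions = {pred: obj for _, pred, obj in rows if pred is not None}
    return {"name": rows[0][0], **extensions}

print(facts_about(conn, "part:1"))
```

The join is on the subject id here; joining on the object column
instead would pick up facts in which the part is the thing referred to.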

>> BTW, I think your "'lingua franca' between different levels of a
>> system" is the same concept as the triples/facts across any number
>> of meta-levels to which I alluded above.  Right?
> 
> No.  I just meant mainly an intermediate format between an OO 
> perspective and a relational one.  Or rather, an authoritative 
> conceptual schema, from which the relational and OO perspectives are 
> mapped.

Now see?  That's a lot more than just "relational DB with a more
convenient interface"!  (At least it seems more to me. :)

>> You may have already seen this, but I like "N3" notation ...
> 
> It's not a notation for facts that I need, but a notation for describing 
> fact types, constraints, derivation rules, and so on.  In any event, 
> fact types may be unary or binary, with most of an application's fact 
> types usually being binary.

As I noted above, triples == binary facts (maybe unary, too, if
"context" is made explicit ...).  Fact types seem meta to me.
(A "fact ontology", perhaps?)  Constraints to me sounds like rules
(triples with "variables").  Not sure about derivation rules --
maybe just another rule, or meta-rule?  Ever the minimalist.  :)

Not that it wouldn't be useful to have a "fact language", but its
atoms could be modeled explicitly in a fact ontology.

>>> But, unlike the peak.query of today, fact retrieval will be able to 
>>> be asynchronous.  That is, you'll be able to "subscribe" to a query, 
>>> or yield a task's execution until a new fact is asserted.  [and other
>>> nice examples ...]
>>
>> Sure, but this stuff is just "framework sugar".
> 
> No, it's actually pretty critical.  To display meaningful messages to an 
> end user, it's not sufficient to wait for the relational DB to throw an 
> integrity error.  The issue needs to be presentable to the end user, 
> whether it's a GUI or a web app.  And, many applications need other 
> "rule-generated messages" to appear in an advisory capacity (i.e. just 
> warn, not prevent commit) as well.

Yes, I agree with that.  I was being flippant about "framework
sugar".  The whole point of a framework is to provide standardized,
re-usable, pre-built ways of implementing such aspects -- very
useful.
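
To make the "subscribe to a query" idea concrete for myself, here is
a toy sketch.  It is NOT the peak.query or peak.events API (I am
assuming nothing about either), the callbacks run synchronously
rather than as yielded tasks, and FactStore and the office-hours rule
are invented.  It does show an advisory message being generated when
a fact is asserted, without blocking the assertion:

```python
class FactStore:
    """Toy fact store: callbacks fire when a matching fact is asserted."""

    def __init__(self):
        self.facts = set()
        self.subscribers = []  # list of (matches, callback) pairs

    def subscribe(self, matches, callback):
        """Call `callback(fact)` for each new fact where matches(fact) is true."""
        self.subscribers.append((matches, callback))

    def assert_fact(self, fact):
        if fact in self.facts:
            return  # already known, so nothing new to announce
        self.facts.add(fact)
        for matches, callback in self.subscribers:
            if matches(fact):
                callback(fact)

warnings = []
store = FactStore()
# Advisory rule: warn about 3 AM reservations, but do not reject them.
store.subscribe(
    lambda f: f[1] == "is_at" and f[2].endswith("T03:00"),
    lambda f: warnings.append(f"{f[0]} is scheduled outside office hours"),
)
store.assert_fact(("reservation:1", "is_at", "2004-04-22T03:00"))
store.assert_fact(("reservation:2", "is_at", "2004-04-22T10:00"))
print(warnings)
```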

> ... as long as there are no subscription/derivation cycles, 
> decidability (aka algorithm termination) is not an issue.  I may try to 
> make it physically impossible to create such a cycle.

That definitely won't be an issue if you don't use any
meta-facts (which is exactly what OWL-DL does for that
same reason).
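
One way such a cycle could be refused up front, sketched with the
graphlib module from the Python standard library (3.9 or later).
The add_derivation helper and the fact type names are hypothetical,
and this is just one possible check, not what PEAK does:

```python
from graphlib import CycleError, TopologicalSorter

def add_derivation(graph, derived, sources):
    """Record that fact type `derived` is computed from `sources`,
    refusing the change if it would create a derivation cycle."""
    candidate = {node: set(deps) for node, deps in graph.items()}
    candidate.setdefault(derived, set()).update(sources)
    try:
        # prepare() raises CycleError if the graph contains a cycle.
        TopologicalSorter(candidate).prepare()
    except CycleError as exc:
        raise ValueError(f"derivation cycle: {exc.args[1]}") from exc
    # Only commit the new edge once the graph is known to be acyclic.
    graph.clear()
    graph.update(candidate)

graph = {}
add_derivation(graph, "overbooked", ["reserved"])
try:
    add_derivation(graph, "reserved", ["overbooked"])
except ValueError as err:
    print("refused:", err)
```

With every derivation edge checked this way, the graph stays acyclic
and each derivation can be evaluated in a finite number of steps.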

>> With facts represented as triples, the join would be on subject
>> or object id.  In OWL, that would be a URI, but I plan to generalize
>> that to include any suitable name.  (Some of them would only have
>> local significance.)
> 
> Again, this doesn't have anything to do with triples.  Triples as you 
> know them can be modelled in this space, and in principle you can 
> implement an FO model with only triples, but in practice it would be 
> both a hideous misuse of a database, and it would not provide any 
> mapping flexibility.

Yes, it would be rather hideous.  I only intend to use triples
for extensions and "late-bound" facts.  So it's important for my
app to have a good base ontology that doesn't need to be extended
very much!  :)

> A given FO fact type can be mapped to any of a large number of 
> relational representations, which is important because such 
> representations may need to be used, either for speed or for 
> backward-compatibility with existing applications.  However, the 
> "conceptual facts" expressed do not change, and thus program code is 
> protected from knowing such implementation details.

This may be at the root of our difference in perspective:
my application doesn't need a large number of relational [DB]
representations, but it needs a great deal of flexibility in
dealing with many flavors of non-database, complex, highly
normalized (CAD/CAM/CAE data tends to be insanely normalized and
RDBMS-hostile) models, some portions of which will be mapped
into a combination of more denormalized, DB-friendly, n-ary
relational structures and triples.

> The correct mapping for a binary fact type in a MOF model is either an 
> association or an attribute (such as when one of the object types was 
> conceptually a value, such as a timestamp, count, name, etc.).  Classes 
> would also have an attribute designated as their reference scheme (e.g. 
> CarNr, CustomerNr).  Unary fact types would map to boolean attributes.

Again, that depends on the translation of the natural language facts
into formal representations.  An object-attribute relationship is
equivalent to a triple in Semantic Web world.
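
A rough sketch of that equivalence, with plain dataclasses standing
in for a MOF-style model.  Customer, Car, REFERENCE_SCHEME and
as_triples are illustrative names only, not peak.model API:

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Optional

@dataclass
class Customer:
    customer_nr: int            # reference scheme (CustomerNr)
    name: str = ""              # binary fact with a value type -> attribute
    is_preferred: bool = False  # unary fact type -> boolean attribute

@dataclass
class Car:
    car_nr: str                        # reference scheme (CarNr)
    owner: Optional[Customer] = None   # binary fact between objects -> association

# Which attribute identifies an instance of each class.
REFERENCE_SCHEME = {Customer: "customer_nr", Car: "car_nr"}

def ref(obj):
    """Return the identifying value of an object."""
    return getattr(obj, REFERENCE_SCHEME[type(obj)])

def as_triples(obj):
    """Yield one (subject, predicate, object) triple per non-identifying attribute."""
    for f in fields(obj):
        if f.name == REFERENCE_SCHEME[type(obj)]:
            continue
        value = getattr(obj, f.name)
        if value is None:
            continue  # an unset association asserts no fact
        if is_dataclass(value):
            value = ref(value)  # refer to the other object by its identifier
        yield (ref(obj), f.name, value)

ann = Customer(42, "Ann", True)
car = Car("ABC-123", owner=ann)
print(list(as_triples(ann)))
print(list(as_triples(car)))
```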

> Ternary and higher-arity fact types don't have a MOF mapping without 
> extending to higher-arity associations.

I'll have to study MOF more before I can intelligently discuss
that, but there has been some work done on mapping between UML
and OWL by some people I work with in the "CAX" data standards
world.

> One thought that does occur to me ... is that if I simply added 
> n-ary associations to peak.model, I might be able to have my object cake 
> and eat facts too.  :)  The main thing I'm concerned about is that the 
> current peak.model implementation would need a major overhaul to do 
> something like that.  Probably I'll have to start the "new style" model 
> in another package, and then replace the old one later, in sort of the 
> way that Zope 3 technology is continually being backported into Zope 2.

That sounds like an approach worthy of experimenting with.
Thanks for the discussion; very thought-provoking.

Cheers,
Steve