[PEAK] fact orientation
Phillip J. Eby
pje at telecommunity.com
Sat Oct 15 19:49:58 EDT 2005
At 11:31 AM 10/15/2005 -0700, Jay Parlar wrote:
> > It's a big part of it. Dynamic variables, fact orientation, and monkey
> > typing are too...
>
>Interesting. I'm going to have to look into fact orientation, having
>never seen that term before. Do you have any good links to it? All the
>ones I find on Google seem pretty heavy with "market speak"; I can't
>figure out what they really mean.
If you don't mind wading through some fairly deep reading, try this:
http://www.fcoim.nl/Literature_Article_FCO.html
It's an elaboration on the fundamental principles of FCO-IM: Fully
Communication-Oriented Information Modelling.
In brief, it states that the purpose of data modelling is to
represent *human communication about the world*, NOT to model the world itself.
This is quite a radical, even revolutionary, idea with respect to
software design as it is mostly practiced today. If you think through the
ramifications, it means that object orientation as we currently conceive it
is broken, because it's trying to solve the wrong problem. Computers are
useful for performing computation and queries regarding *communicable facts
about objects*, which is not the same thing as representing *actual objects*.
In the FCO-IM view of the world, then, there exist only:
1. lexical types (symbolic values like numbers, strings, dates, etc.)
2. non-lexical "nominalized" types (conceptual entities, like a Person)
3. dimensional types (unit values like feet, seconds, kilograms, etc.)
What's more, the "nominalized" types do not exist as values. You can't
just refer to a Person; you have to say, in effect, "The Person named Jay"
or "The Person with SS#123-45-6789".
That doesn't mean you can't refer to a Person object in code, mind you, by
abstracting out what that reference mode or key is. I'm just pointing out
that fact orientation doesn't try to model the *implementation of a
Person*. It models *facts about them* -- just like a relational
database. Indeed, I'd say the reason that relational databases are still
with us, and object databases mostly haven't panned out, is precisely
that relational databases are fact-oriented, and therefore more
flexible and extensible in this way.
So, fact orientation is far more flexible because you can always add
more kinds of facts about a person, but in the OO paradigm a class is
closed, with a fixed set of behaviors and characteristics. If you combine
generic functions (which can extend classes with new behaviors) with fact
orientation (which can extend concepts with new kinds of facts), you have a
completely open-ended system with regard to extensibility.
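Here's a rough sketch of that combination (invented names, not PEAK's
actual API; a real generic function implementation does much more):

    # Both new *facts* and new *behavior* can be added later, without
    # ever reopening a class.
    facts = {("email", "p1"): "jay@example.com"}   # facts about key "p1"

    # Any module can add a new kind of fact about a person...
    facts[("phone", "p1")] = "555-0123"            # no Person class to edit

    # ...and any module can add new behavior via a (very simplified)
    # generic function: an extensible rule registry, not a fixed method.
    contact_rules = []

    def contact(key):
        for rule in contact_rules:
            result = rule(key)
            if result is not None:
                return result

    contact_rules.append(lambda key: facts.get(("email", key)))
    print(contact("p1"))                           # -> jay@example.com

The point is that the fact store and the rule registry are both open for
extension by any module, at any time.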
In short, present-day OO techniques are far less flexible than functional
decomposition and relational logic are, despite the fact that OO is
supposed to be their successor. Nice, eh? :)
But what OO *does* offer is programmer convenience, a hierarchical
shorthand for expressing commonalities, and a more concise notation for
obtaining or manipulating certain kinds of facts. The problem with both
fact orientation and generic functions in their "raw" form is that you end
up with a giant global namespace and the need to use function syntax (à la
Lisp) to get at anything. Instead of saying 'somePerson.foo', you would
have to say 'get_foo(somePerson)'. And then, six libraries might have a
'foo', so you really need 'somelibrary1.get_foo(somePerson)'. Ugh.
So, this is where my "monkey typing" concept comes into play: mapping these
more flexible models back into the syntax and patterns we know and
love. We simply use ISomeLibrary(somePerson).foo for setting, getting,
deleting, and method calls. These adapters are of course just a collection
of descriptors that return bound methods wrapping the appropriate generic
functions, or perhaps the results of calling them.
Ideally, monkey typing would be able to piggyback on Guido's proposal for
implementing type declarations, so that it wouldn't be necessary to deal
with these matters inline most of the time. Monkey-typing adapters are by
nature safe for re-adaptation, in that switching to another interface just
unwraps and re-wraps the underlying object.
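As a sketch of the mechanism in modern Python (class and attribute names
invented here; the real adapters wrap generic functions rather than a
bare dictionary), a descriptor can route attribute access on the adapter
to the shared fact store, and re-adaptation just unwraps the key:

    facts = {}

    class FactProperty:
        # Descriptor: adapter.attr <-> facts[(attr_name, key)].
        def __set_name__(self, owner, name):
            self.name = name
        def __get__(self, adapter, owner=None):
            if adapter is None:
                return self
            return facts[(self.name, adapter.subject)]
        def __set__(self, adapter, value):
            facts[(self.name, adapter.subject)] = value

    class ITeacher:
        certifications = FactProperty()
        def __init__(self, subject):
            # Re-adaptation: unwrap another adapter back to its key.
            self.subject = getattr(subject, "subject", subject)

    mary = ITeacher("mary")                 # adapt the key "mary"
    mary.certifications = ["gym"]
    print(ITeacher(mary).certifications)    # re-adapting sees same data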
Anyway, with my recent implementation of the schema.Annotation class for
Chandler, I've realized how to do monkey-typing for fact-oriented systems
using the same approach. In Chandler, you can now do things like the
following, which is a snippet from the schema API doctest:
>>> class Teacher(schema.Annotation):
...     schema.kindInfo(annotates=Person)  # annotate the "Person" type
...     certifications = schema.Sequence()
...     supervisor = schema.One(Person)
>>> class TeachingCertificate(schema.Item):
...     subject = schema.One(schema.Text)
...     certified_teachers = schema.Sequence(
...         Teacher, inverse=Teacher.certifications
...     )
>>> ProfMary = Teacher(Mary) # Adapt Person to Teacher
>>> gym = TeachingCertificate("gym", subject=u"Physical Education")
>>> ProfMary.certifications = [gym]
>>> list(ProfMary.certifications)
[<TeachingCertificate ... gym ...>]
>>> list(gym.certified_teachers)
[Mary Quite Contrary]
The extra state isn't stored in the adapters, though; it's part of the
underlying database. Which means you can adapt as many times as you like
and still get the same data:
>>> list(Teacher(Mary).certifications)
[<TeachingCertificate ... gym ...>]
Thus, the data model for "Person" is *open ended*. Any number of Chandler
plugins can define their own additional data to be kept, and those
additional attribute names don't clash with those defined by "Person"
itself. This gets Chandler out of the OO rut in a way that plain OODBs
can't handle. (Of course, I'm sort of faking it because Chandler is
actually built on an OODB, not a relational one. So in truth you could
pull the same trick on top of other Python OODBs as well.)
Up until now, I'd always had this idea as a general concept of what I
wanted to do with the "SOAR" project (Simple Objects Accessed
Relationally). But one of the conceptual stumbling blocks for me with SOAR
was that I always got down to the problem of how to determine what a
database object's "type" was, and how to determine whether it implemented a
particular "data interface". (If you look at the old TransWarp code for
"records", you'll see a lot of this stuff there.)
But what I've realized from working with Chandler is that you don't *need*
objects to have a "type", in the sense that they can have one and only one
type. If you're stuck in the OO paradigm, it seems this way, because how
else can you determine what method implementations will be used to respond
to a particular message? But in a facts+functions world, this is
silly. You just define things' behavior with generic functions, which are
perfectly capable of determining behavior based on *whatever facts you'd
like*.
So, when you view this in the context of "modelling human communication
about things", you quickly realize that determining a thing's "type" is
just a kludge. Business applications are usually all about enforcing rules
based on facts, anyway. The "is-a" relationship is just another kind of
fact, and doesn't require you to boil every object down to just one
type. In a sense, you can have multiple-inheritance on a *per instance*
basis, if you like.
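A toy illustration in plain Python (invented here): dispatch consults
facts rather than isinstance(), and "is-a" is recorded per instance as
just another fact:

    facts = {
        ("is-a", "p1"): {"Person", "Teacher"},   # per-instance "types"
        ("is-a", "p2"): {"Person"},
        ("name", "p1"): "Mary",
        ("name", "p2"): "Jay",
    }

    # A (crude) generic function: (predicate, implementation) pairs,
    # tried in order; the predicates look at facts, never at classes.
    rules = []

    def greet(key):
        for applies, impl in rules:
            if applies(key):
                return impl(key)

    rules.append((lambda k: "Teacher" in facts[("is-a", k)],
                  lambda k: "Good morning, Professor " + facts[("name", k)]))
    rules.append((lambda k: True,
                  lambda k: "Hi, " + facts[("name", k)]))

    print(greet("p1"))  # -> Good morning, Professor Mary
    print(greet("p2"))  # -> Hi, Jay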
The problem this was causing with trying to implement polymorphic database
schemas in PEAK and TransWarp was the implicit assumption that an object
was always of just one type, and thus that we needed to know what class to
make the ghost be when we loaded an item.
But a fully fact-oriented model using generic functions doesn't have to
care, because *the object itself has neither behavior nor data*. The
object is simply a key to access a collection of underlying facts. Once
you've realized that, then there's no "problem" to solve any more, except
that you can't really go around using isinstance() on things unless you
generate classes on-the-fly to match an individual object's behavioral
signature. (Which we could possibly do, but it seems easier to me to just
deal with things in monkeytyping terms, with no "real" object.)
In Chandler, then, an individual object really *does* have a single type,
but we use annotations to widen selected types with "third-party"
attributes. But in the monkey/facts/functions (MFF?) model, this won't be
the case. There will be no "real" objects at all, in the old sense, or if
there are for efficiency's sake, it's just a coincidence. All you'll ever
see are "interface instances", never the "real object".
Doing this stuff in Chandler would be too much of a rework for no immediate
gains, because Chandler doesn't have generic functions and the underlying
storage mechanism is still married to the O-O model. So I don't have any
plans to push for implementing the full MFF model there.
For my own stuff, though, I'd like to be able to avoid there ever being
"real" objects, but that may not exactly be how it works out. There are
still places where it's useful to have traditional, non-queryable objects
in an application, but these are usually also the same objects for which a
schema isn't really necessary or useful to begin with.
Which brings us to an interesting point: the primary usefulness of
single-type, message-passing, closed class O-O is in creating
*solution-domain* abstractions like GUI toolkits, event frameworks,
service components, etc. That is, OO as we know it today is really a
low-level toolkit useful mainly for solving programmers' problems, not
users' problems. :)