[TransWarp] TableDM proof of concept, request for comments

Thu Oct 9 19:39:34 EDT 2003

At 02:13 PM 10/9/03 -0700, John Landahl wrote:
>On Thu, Oct 09, 2003 at 04:14:13PM -0400, Phillip J. Eby wrote:
>
> > * A PropertyName of 'db' is likely to collide; you should probably either
> > use 'TableDM.db', or better yet, simply Obtain(storage.ISQLConnection).
>
>I wasn't aware of the the last option (or didn't think of it), that's a
>really nice solution.

It works well if you group all your DM's as child components of a 
"database" service, e.g. fooDB.somethingDM, since then the 'fooDB' can 
simply offerAs=[storage.ISQLConnection] for its connection binding.

> > * Using 'None' as a return flag from 'writable()' is a bad idea if you 
> want
> > to support using NULL in the underlying database.  Usually, you'd map 
> NULLs
> > to/from None.  Perhaps NOT_FOUND or NOT_GIVEN would be a better choice
>
>I thought of the None/NULL problem, and considered adding a constant of
>some sort, but thought that any constant might conceivably collide with
>a real value.  How does NOT_FOUND/NOT_GIVEN avoid this?

Because no database I know has any sensible mapping for NOT_FOUND or 
NOT_GIVEN.  :)  Technically, I'm not aware of any that take None->NULL on 
query parameters, but most seem to convert NULL->None in output rows, so 
I'd prefer not to use None as a sentinel value.

> > * The code assumes '%s'-style placeholders are used by the underlying 
> RDBMS.
>
>This seems to hold for PostgreSQL and SQLite.  What are some
>alternatives I should be aware of?

See the Python DBAPI PEP for the list of official "paramstyle" 
values.  There are about five or six different styles, I think.  For 
example 'qmark' style is used by Gadfly, which consists of '?' as a 
placeholder.  You just put one ? for each positional parameter.

> > If I were going to make something like this part of PEAK, I'd probably 
> want
> > to define a formal interface for field mappers, and I'd probably want to
> > have a mechanism I could use to generate SQL templates ahead of time.  For
> > example, your code generates SELECT, INSERT, and UPDATE query templates
> > that are the *same* every time they're used.  Why not create bindings for
> > say, 'selectTemplate', 'updateTemplate', and 'insertTemplate'?  You could
> > then simply call 'self.db(self.whateverTemplate, values)' to perform the
> > operations.
>
>That's a nice idea, and I could have the bindings default to the values
>that are currently generated, only built one time at assembly.

Yep, and to determine the right parameter placeholders, you could use a 
binding that gets the paramstyle from the db connection's API module.  And 
this way, you only pay the cost of making that decision once for each 
DM.  (That's assuming we don't add some kind of parameter-generating method 
to ISQLConnection to take care of this aspect, of course.)

> > The other thing I'd probably do is make the mappings also implement
> > binding.IRecipe, so that you could say, e.g.:
>
>..
>
> > Of course, for this to work, these things would actually be factories
> > rather than the "real" field-mapping objects.  For example, the ones that
> > use DM's would need to accept a component key, instead of the actual
> > DM.  Then, when they were invoked by binding.Make they'd look up the DM
> > they were supposed to connect to.
>
>I'll have to think this one through when I have more than a minute...

It just means that for each mapping, you'd have an XYZMapping class and an 
XYZMappingImpl class.  The latter is basically the same as what you have 
now, while the former is just a factory that makes the latter.

> > And last, but very far from least, I'd want a way to allow
> > object-level queries to be mapped down to SQL queries.  But that's
> > probably a much bigger project in itself.  :)
>
>Hmm!  Doing something like Castor's OQL-to-SQL translator?
>http://castor.exolab.org/oql.html
>
>Something like this could be a very interesting addtion to PEAK.

I don't care about OQL per se.  Think of it more as QBE...  e.g.  "find me 
all bulletins with a category whose kind is 'foo'" might be expressible as:

Bulletin.where( category = Category.where(kind='foo') )

Or, find me people at least 18 years old...

Person.where( age = GE(18) )

...who have at least one dog:

Person.where( age = GE(18), pets = EXISTS( Pet.where(kind=PetKind.DOG) )

The idea is that as long as you have an object graph without one-way links, 
you can express any desired retrieval as a graph where edges express 
constraints.  If you have this capability, you don't really need OQL as 
such.  These sort of "selectors" (as I call them), don't really address 
more complex forms of data manipulation, but they cover nearly all of the 
input selection criteria needed for common business applications.  (Btw, I 
left out the boolean ops like AND(), OR(), and lots of other such 
items.  See peak.model.queries for a list of things I'd likely want to 
implement, although in a radically different way than they are done there, 
of course.)

The "hard" part is transforming these object-level graphs into relational 
structures.  In principle, mappings of the sort you've developed, should be 
capable of expressing transforms for queries, just as easily as they do for 
the actual data.  For example, your EntityMapping knows what the other DM 
is, so in principle it could convert the 'category' field in the 
Bulletin.where() example into a query on the CategoryDM to get a list of 
categories, and then apply the search to itself.  (Note, by the way, that 
you could in principle use just a single DM to manage all object types, 
since "selectors" include type information, and adaptation could be used to 
declare the mappings between those types and table information.)

But, the devil is in the details, or really, the optimization.  If you know 
the CategoryDM stores data in the same DB, you could get the DB to do the 
join for you.  But how do you know that?  At this point, I'm not clear on 
the full set of abstractions needed to do this cleanly.  For example, there 
needs to be some idea of a "database instance" that implements a "data 
model" or "schema", in which "table" objects exist.  Again, for very 
specific circumstances, one can create reasonable automatic SQL generation, 
but it seems difficult in the general case, without having objects that 
model relational concepts like joins.  Also, as in Castor, there would need 
to be DB-specific SQL generation based on that abstract relational model.

It's been a long time since I looked at Castor's implementation, and 
probably should do so again to get some ideas on what our SQL abstraction 
could/should look like.