[TransWarp] Basic "storage jar" design
Roché Compaan
roche at upfrontsystems.co.za
Sun Jun 30 17:05:12 EDT 2002
On Sun, 2002-06-30 at 16:43, Phillip J. Eby wrote:
> At 04:11 PM 6/30/02 +0200, Roché Compaan wrote:
> >Hi Phillip
> >
> >I didn't understand enough of your previous post "PEAK persistence based
> >on ZODB4, continued" because my brain exploded every second paragraph. I
> >wasn't too concerned because it seemed that you and yourself first
> >needed to talk through it :)
>
> Yes, I've started using letters to the mailing list as a substitute for
> talking to Ty to work out my ideas, when he's not readily available. :)
>
> Not too long ago, I found out there's actually a name for the way I do my
> thinking; it's called "Image Streaming". The idea is that you dump out the
> contents of your brain to another human being with the intent of having
> them understand the ideas you're putting forth, and it frees you from
> having to hold on tightly to any one idea as you go. It also creates a
> kind of feedback loop that helps you refine and clarify the initially vague
> intuitive concepts that come to mind. Anyway, I've been doing it for many
> many years without having a name for it. It's only been in the last month
> or so, however, that I've realized I can do a form of it by writing down
> the ideas in the form of a letter or proposal or whatever to someone else. :)
I won't try that technique publicly just yet because I will probably
"stream" a lot of white space. For now I'll use my limited mental
bandwidth for incoming streams, but don't be surprised if I stream a
whole lotta question marks back :)
Seriously though, I use a similar technique when trying to understand
other people's ideas. I just dump any questions that come to mind and
quite often the answer lies in the question itself. Or I just try to
formulate the original idea in my own words.
> >What will a query jar do? I assume they will remember query results to
> >prevent re-querying the underlying database?
>
> They do several things, none of which I really ever explained thoroughly. :)
>
> Think about a two-way association between objects - say your
> person/department example. If the person table has a foreign key
> reference [1->1] to department, then department has an implicit [1->n]
> relationship to person. A query jar could be used to represent this
> inverse relationship, so that when a department object's state is loaded, a
> "ghost" from the query jar (with the department ID as its oid) is placed as
> the "people" attribute of the loaded department object. Any attempt to
> *use* this people attribute will cause its state to be loaded from the
> query jar - a list of ghosts of person objects, retrieved by a query
> against the persons table. Of course, since you're querying the persons
> table, you may as well pass that state through to 'preloadState()' on the
> person jar, so the person jar won't reload that data when you access one of
> the ghosts. (Of course, if the state is loaded they won't be ghosts, but
> anyway...)
Awesome! Awesome! Awesome!
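
To make the query jar idea concrete, here is a minimal sketch of how I read it. Everything in it is an assumption for illustration only: the Ghost, PersonJar and PeopleQueryJar classes, the load() method and the in-memory PERSON_ROWS "table" are hypothetical and not PEAK or ZODB code. Only the name preloadState() comes from the discussion above.

```python
# Hypothetical sketch only: none of these classes exist in PEAK or ZODB.
PERSON_ROWS = [
    {"id": 1, "name": "Ann", "dept_id": 10},
    {"id": 2, "name": "Bob", "dept_id": 10},
    {"id": 3, "name": "Cy", "dept_id": 20},
]


class Ghost:
    """Placeholder whose state is loaded from its jar on first use."""

    def __init__(self, jar, oid):
        self._p_jar = jar
        self._p_oid = oid
        self._state = None  # None means "still a ghost"

    def state(self):
        if self._state is None:
            self._state = self._p_jar.load(self._p_oid)
        return self._state


class PersonJar:
    def __init__(self, rows):
        self.rows = rows
        self.cache = {}
        self.loads = 0  # number of single-row fetches from the "table"

    def __getitem__(self, oid):
        # Return the cached object, or a ghost if it is not cached yet.
        if oid not in self.cache:
            self.cache[oid] = Ghost(self, oid)
        return self.cache[oid]

    def preloadState(self, oid, state):
        # Like __getitem__, but hands over state that is already known.
        ob = self[oid]
        if ob._state is None:
            ob._state = state
        return ob

    def load(self, oid):
        self.loads += 1
        return next(r for r in self.rows if r["id"] == oid)


class PeopleQueryJar:
    """The oid is a department id; the state is a list of person objects."""

    def __init__(self, rows, person_jar):
        self.rows = rows
        self.person_jar = person_jar
        self.queries = 0

    def __getitem__(self, dept_id):
        return Ghost(self, dept_id)

    def load(self, dept_id):
        self.queries += 1
        return [self.person_jar.preloadState(r["id"], r)
                for r in self.rows if r["dept_id"] == dept_id]


person_jar = PersonJar(PERSON_ROWS)
query_jar = PeopleQueryJar(PERSON_ROWS, person_jar)

people = query_jar[10]  # a ghost: nothing has been queried yet
names = [p.state()["name"] for p in people.state()]
```

In this sketch the one query against the persons table also fills in the person objects, so reading their names afterwards does not hit the person jar's load() at all.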
> > > * oidFor(ob) -- Called by save() operations of other jars to get foreign
> > > key values for objects referenced in their states. Implementation: if
> > > ob._p_jar is self, return ob._p_oid, unless _p_oid is None, in which case
> > > save the object using oid = ob._p_oid = self.new(ob), and return the
> > > oid. If the _p_jar is NOT self, return self.thunk(ob) to try to translate
> > > the reference or create a stub.
> >
> >So if I need to save an instance of "Person" which references an
> >instance of "Department" I can call "oidFor(ADepartment)" on the
> >DepartmentJar to get the department's id. When will _p_jar not be self?
> >Won't all objects returned by the DepartmentJar have their _p_jar set to
> >the DepartmentJar?
>
> Yes, *but* it is not necessarily the case that you'll be putting a
> department object from *that* department jar there. Suppose you were
> working in an RDBMS, but the source of department existence was an LDAP
> directory. You might set aPerson.department =
> aDepartmentFromAnLDAPJar. When saving aPerson, you ask the
> SQLDepartmentJar for an oid, and it has to create a thunk or stub reference
> in the SQL database that is referenceable as a department key, but has some
> kind of linkage to the LDAP-based department info. That's what the thunk()
> method is for. As I noted, it's not something you'll support often, but Ty
> and I have multiple apps which do this sort of cross-DB referencing for one
> or two object types.
I thought it had something to do with cross-DB referencing. But is the
SQLDepartmentJar really necessary? Can't PersonJar (SQL-based) just ask
DepartmentJar (LDAP-based) for an oid? PersonJar doesn't really have to
know that DepartmentJar is LDAP-based, it is only asking DepartmentJar
for an oid.
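
For reference, the oidFor() rule quoted above could be sketched roughly like this. It is only an illustration under stated assumptions: the Jar and Persistent classes, the "stub-N" key format and the stubs dictionary are made up here, and the real thunk() would have to write some kind of stub record to the SQL database.

```python
# Hypothetical sketch: Jar, Persistent and the stub keys are illustrative only.
class Persistent:
    _p_jar = None
    _p_oid = None


class Jar:
    def __init__(self, name):
        self.name = name
        self.saved = {}   # oid -> object stored in this jar
        self.stubs = {}   # (foreign jar name, foreign oid) -> local stub key
        self._next = 1

    def new(self, ob):
        # Save a new object and return the oid generated for it.
        oid = self._next
        self._next += 1
        self.saved[oid] = ob
        return oid

    def thunk(self, ob):
        # Translate a reference to an object owned by another jar into
        # a key that is usable inside this jar (assumed stub scheme).
        foreign = (ob._p_jar.name, ob._p_oid)
        if foreign not in self.stubs:
            self.stubs[foreign] = "stub-%d" % (len(self.stubs) + 1)
        return self.stubs[foreign]

    def oidFor(self, ob):
        if ob._p_jar is self:
            if ob._p_oid is None:
                ob._p_oid = self.new(ob)  # save on demand
            return ob._p_oid
        return self.thunk(ob)


sql_departments = Jar("sql")
ldap_departments = Jar("ldap")

local = Persistent()
local._p_jar = sql_departments          # new, not yet saved

remote = Persistent()
remote._p_jar = ldap_departments
remote._p_oid = "ou=sales"              # made-up LDAP-style key

local_oid = sql_departments.oidFor(local)    # saved, then its oid is returned
remote_oid = sql_departments.oidFor(remote)  # translated into a stub key
```

The remote department keeps its own oid in its own jar; the SQL side only gets a local key that points at it.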
> >So if an object's state is set to "loaded" by __setstate__ you still
> >have an empty instance. The only difference being that its state is
> >set. When does data retrieval happen for this instance, especially
> >since its "loaded" state will prevent it? What am I missing?
>
> If the state is loaded, it's not a ghost, and it has everything it needs.
> >"__getitem__" returns an object from the cache or a ghost if it's not in
> >the cache.
>
> Yes. preloadState() is similar, except that it *may* return a non-ghost,
> fully loaded object.
Then I think one can actually drop the method "ghost" and just call
preloadState(oid, state="ghost").
> > > * new(ob) -- save new object 'ob' and return its oid (by generating it or
> > > extracting it from state)
> >
> >What about foreign key constraints in the underlying db? Not that I
> >really use them - I think it is the application's responsibility to
> >govern relationships between objects.
>
> I presume you're talking about ensuring that the referenced object exists
> before it's referred to? That's actually handled by way of
> 'oidFor()'. Think about it. When you save the state for 'aPerson', it has
> to get the 'oidFor()' of all its foreign key references before it can do an
> SQL "UPDATE" to save them. If any of them need new ID's, oidFor() will
> cause them to be created and saved *before* the update can point the
> foreign key to them. Thus, relational integrity is guaranteed by the
> normal operation of the framework, which is just beautiful, IMHO. :)
It is. I've seen very few persistence frameworks that don't get bitched
around by foreign keys - that's why I try to avoid them.
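
The ordering argument can be shown with a small sketch. It assumes plain dicts instead of persistent objects and a log list instead of a real database; the DepartmentJar and PersonJar classes and the statement strings are hypothetical.

```python
# Hypothetical sketch: dicts stand in for objects, "log" for the database.
log = []


class DepartmentJar:
    def __init__(self):
        self._next = 1

    def oidFor(self, dept):
        # A department without an oid is saved first, so it exists
        # before anything refers to it.
        if dept.get("oid") is None:
            dept["oid"] = self._next
            self._next += 1
            log.append("INSERT department %d" % dept["oid"])
        return dept["oid"]


class PersonJar:
    def __init__(self, departments):
        self.departments = departments

    def save(self, person):
        # Resolve every foreign key before writing the person's own row.
        dept_id = self.departments.oidFor(person["department"])
        log.append("UPDATE person %d SET dept_id=%d" % (person["oid"], dept_id))


departments = DepartmentJar()
persons = PersonJar(departments)

new_dept = {"oid": None, "name": "Sales"}
persons.save({"oid": 7, "name": "Ann", "department": new_dept})
```

Because save() has to call oidFor() to get the key, the department row is always written before the person row that points to it.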
> >For those who don't know, "Jar" comes straight from your fridge. When
> >you want to preserve food, you pickle it and put it in a Jar. The same
> >goes for objects that you want to persist: you pickle it and put it in a
> >Jar. Sometimes it helps to explain what was obvious once and has since
> >been forgotten.
>
> Actually, "storage jars" for me is a reference to a Monty Python
> sketch! But I did start with the term "jar" since the ZODB persistence
> framework has the _p_jar concept, which does come from "pickle jars" as
> used by Jim Fulton, which came from Python pickle, which I think came from
> some other language's notion of pickling. The politically correct term for
> a jar is now a "persistent data manager", as expressed by the
> IPersistentDataManager interface and lots of references to "dm's" and "data
> manager" in the C and Python code of ZODB 4.
I'm sure between Star Trek and Monty Python we will find names galore :)
Btw, where can I look if I want to have a look at the ZODB 4 code? I
have a Zope3 checkout and I noticed the interfaces you mentioned are in
there. Is there any other place I should be looking as well?
> But I like "storage jars" better, at least as a working term. I'm not sure
> it really belongs in the businesslike terminology of PEAK, and we might
> actually be better off calling them "Racks", as they are very close in
> concept and function to the Racks in ZPatterns. The main difference is
> that there were no "alternate key" racks or "query" racks in ZPatterns, at
> least as a promoted concept. Which isn't to say that nobody ever
> implemented query or alternate key racks; I'm sure they did. There just
> weren't names for the concepts.
Well maybe "DataManager" is not loaded metaphorically but it is
intuitively understandable (which was a big barrier for ZPatterns).
> Anyway, final terminology can wait a bit, since there's no code as yet.
But let's not neglect it :)
--
Roché Compaan
Upfront Systems http://www.upfrontsystems.co.za