[PEAK] The path of model and storage

Wed Jul 28 14:44:32 EDT 2004

At 11:18 AM 7/28/04 -0700, Robert Brewer wrote:
>I wrote:
> > And if, in the process, you decided to make a cache container which:
> >
> > 1. Accepts any Python object (i.e. - don't have to subclass from
> > peak.something),
> > 2. Is thread-safe: avoiding "dict mutated while iterating" (probably
> > with a page-locking btree),
> > 3. Indexes on arbitrary keys, which are simple attributes of
> > the cached objects (both unique and non-), and
> > 4. Is no more than 4 times as slow as a native dict,
>
>and Philip J. Eby replied:
> > #2 just ain't gonna happen.  Workspaces will not be shareable across
> > threads.  (Or more precisely, workspaces and the objects
> > provided by them will not include any protection against
> > simultaneous access by multiple threads.)
>
>*boggle* Are you just forcing the mutexing down to consumer code then?

Not if you don't need to share the objects across threads, no.  Keep in 
mind that "share the objects across threads" means that two threads need 
access to the *real* same object, as opposed to manipulating clones of the 
object.  If two threads each open their own workspace, backed by the same 
physical database, they'll just each see a different copy of that object, 
and conflicts will be resolved via the physical database's locking facilities.
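
As a rough illustration of the "one workspace per thread, one shared physical database" idea, here is a minimal sketch. It is not PEAK's API: the Workspace class, the customer table, and demo() are hypothetical names invented for the example, and SQLite merely stands in for whatever physical database is actually in use.

```python
import os
import sqlite3
import tempfile
import threading


class Workspace:
    """Hypothetical workspace: one private connection and one private cache."""

    def __init__(self, db_path):
        self.conn = sqlite3.connect(db_path)
        self.cache = {}  # identity map, private to this workspace

    def get(self, key):
        # Load once per workspace; later calls return the same in-memory copy.
        if key not in self.cache:
            row = self.conn.execute(
                "SELECT name FROM customer WHERE id = ?", (key,)
            ).fetchone()
            self.cache[key] = {"id": key, "name": row[0]}
        return self.cache[key]

    def close(self):
        self.conn.close()


def demo():
    # Set up a throwaway database file that both threads will open.
    fd, db_path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    setup = sqlite3.connect(db_path)
    setup.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    setup.execute("INSERT INTO customer VALUES (1, 'Alice')")
    setup.commit()
    setup.close()

    loaded = {}

    def worker(label):
        ws = Workspace(db_path)  # each thread opens its own workspace
        loaded[label] = ws.get(1)
        ws.close()

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    os.remove(db_path)
    return loaded
```

Calling demo() should give two objects with equal contents but different identities: each thread manipulated its own clone, and neither the workspace nor the calling code needed a mutex. Any write conflicts would be left to the database's own locking.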

So, there's absolutely no need to duplicate that logic, either at the 
consumer code level or in the framework, unless for some reason (that I 
can't presently fathom) you *want* to.

(Note that even if what you want is an in-memory database of some kind, you 
still just implement a workspace wrapper for it, and each thread's access 
to it is mediated by that thread's private workspace instance.  Conflict 
resolution is managed by the workspace and the DB, not the client.)
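
The in-memory case might look something like the sketch below. Everything here is an assumption made for illustration: MemoryStore, Workspace, and ConflictError are hypothetical names, and the version-number check is just one simple (optimistic) way a store could detect conflicts, not a description of how PEAK does it. The point is only that the lock lives inside the store and the client code never sees it.

```python
import copy
import threading


class ConflictError(Exception):
    """Raised when another workspace committed the same record first."""


class MemoryStore:
    """Hypothetical shared in-memory database; the only place that locks."""

    def __init__(self):
        self._lock = threading.Lock()
        self._records = {}  # key -> (version, value)

    def read(self, key):
        with self._lock:
            version, value = self._records.get(key, (0, None))
            return version, copy.deepcopy(value)  # hand out a private copy

    def write(self, key, expected_version, value):
        with self._lock:
            current_version, _ = self._records.get(key, (0, None))
            if current_version != expected_version:
                raise ConflictError(key)
            self._records[key] = (current_version + 1, copy.deepcopy(value))


class Workspace:
    """Hypothetical per-thread wrapper around the shared store."""

    def __init__(self, store):
        self._store = store
        self._versions = {}
        self._objects = {}

    def get(self, key):
        if key not in self._objects:
            self._versions[key], self._objects[key] = self._store.read(key)
        return self._objects[key]

    def set(self, key, value):
        self.get(key)  # remember which version this change is based on
        self._objects[key] = value

    def commit(self, key):
        self._store.write(key, self._versions[key], self._objects[key])
        self._versions[key] += 1
```

If two workspaces both change the same key, the first commit succeeds and the second raises ConflictError, so the conflict surfaces at the workspace/store boundary rather than as a race in client code.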

> I can understand not shareable across processes, but *threads* -- wow.
>Bring on the ad-hockery. :O

Eh?  This is standard practice for persistence management systems in 
Python.  More to the point, it's standard practice for PEAK that *no* 
components are shared across threads in the general case.  It's a rare 
business application that has a use case for actual concurrency (as opposed 
to asynchrony).  IOW, if you need multiple threads to access data, you'll 
instantiate a workspace for each one.
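
One common way to arrange "a workspace for each thread" in plain Python is threading.local from the standard library. The sketch below assumes a hypothetical Workspace class and a hypothetical current_workspace() helper; neither name comes from PEAK.

```python
import threading


class Workspace:
    """Hypothetical stand-in for a workspace; holds unshared, unlocked state."""

    def __init__(self):
        self.objects = {}


# Attributes set on a threading.local instance are visible only to the
# thread that set them.
_local = threading.local()


def current_workspace():
    """Return the calling thread's workspace, creating it on first use."""
    ws = getattr(_local, "workspace", None)
    if ws is None:
        ws = _local.workspace = Workspace()
    return ws
```

Within one thread, repeated calls to current_workspace() return the same instance; a different thread gets its own, so nothing inside Workspace needs locking.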

PEAK only seeks to implement thread-safety for objects that *must* be 
shared across threads, like interfaces and classes.  Only applications that 
absolutely have to have concurrent access to an object should pay the price 
for it.  For most applications, it's sufficient to have thread-specific 
workspaces.