[TransWarp] making progress - examples in cvs
Phillip J. Eby
pje at telecommunity.com
Fri Feb 21 22:06:49 EST 2003
At 11:38 PM 2/21/03 +0100, Ulrich Eck wrote:
>hi out there,
>
>for any interested people an update on our work:
>
>in our mission to build a scriptable user-management
>(ldap-accounts + imap-mailboxes) we make good progress using
>the latest and greatest peak-capabilities.
>
>you can have a look at our code at:
>http://cvs.net-labs.de -> apps -> acmgr
Wow. Fantastic! I'm enthused at what you've accomplished, but I have a
few comments about some things that I'd like to "un-recommend" to anyone
who is looking at the code for examples, because they should not be
emulated. I also have some things I want to single out for praise as good
examples; these are mixed here in no real order...
* model.Package is deprecated; I'd suggest getting rid of the IMAPModel and
just putting all its classes into the module, and using the module itself
in place of the IMAPModel class.
* The acmgr.connection.imapmodel.MessageHeaderField class has some
issues. First, you're breaking persistence notification by modifying a
stored mutable, unless the mutable itself is Persistent. Second, the
proper place to perform format conversions of this sort is in the DM
_load() method, either by performing the computation or inserting a
LazyLoader. Third, you're breaking validation/observation capabilities by
bypassing _link()/_unlink() - this might be okay if you know that the
'data' attribute is implementing all the validation and observation you
need. Fourth, if you *really* need to redefine how an object stores its
bindings, you can always override '_setBinding()' in the element
class. All in all, the MessageHeaderField class is very bad for separation
of implementation issues from the domain model.
* Most of your features don't declare a 'referencedType'. *This will break
soon*. I expect to add a validation hook to the StructuralFeature._link()
method soon, that will expect to call a 'mdl_normalize()' class method on
the referenced type, to normalize the value of a supplied input
object. This will let you do things like:
    class OddNumber(Integer):

        def mdl_normalize(klass, value):
            if (value % 2):    # nonzero remainder means the value is odd
                return value
            raise ValueError("%d isn't odd" % value)

        mdl_normalize = classmethod(mdl_normalize)
and then declare an attribute to have a type of 'OddNumber'. The catch is
that each feature *must* have a type object, or else define its own
normalize() method. So just an early warning.
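
To make the idea of the hook a bit more concrete, here is a rough stand-alone sketch in plain Python. It does not use PEAK at all: 'Integer', 'Feature', 'link()' and 'Thing' are made-up stand-ins, and the real StructuralFeature._link() may well end up looking different. It only shows a feature asking its 'referencedType' to normalize (or reject) a value before storing it.

```python
class Integer:
    """Hypothetical stand-in for a model type; NOT PEAK's Integer."""

    @classmethod
    def mdl_normalize(cls, value):
        return int(value)


class OddNumber(Integer):

    @classmethod
    def mdl_normalize(cls, value):
        value = super().mdl_normalize(value)
        if value % 2:
            return value
        raise ValueError("%d isn't odd" % value)


class Feature:
    """Hypothetical feature that validates through its referencedType."""

    def __init__(self, name, referencedType):
        self.name = name
        self.referencedType = referencedType

    def link(self, element, value):
        # The validation hook: the type normalizes or rejects the value.
        value = self.referencedType.mdl_normalize(value)
        element.__dict__[self.name] = value
        return value


class Thing:
    """Hypothetical element that the feature stores values on."""


count = Feature("count", OddNumber)
thing = Thing()
count.link(thing, "7")      # normalized to the integer 7 before storing
```

A feature without a 'referencedType' would have nothing to call here, which is why each feature needs either a type or its own normalize().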
* The 'parts' collection attribute is pretty cool; a nice use of verb
exporting. I would note, however, that it is not necessary to use the
'%(initCap)s' format for the verbs if you don't want to. In fact, since
here you are only using the verbs for one feature, you could do:
    newVerbs = Items(newText='addTextPart',
                     newImage='addImagePart',
                     newAudio='addAudioPart',
                     newBase='addBasePart',
                    )
Or, if you wanted to reuse this for other MIME-part collections, you might
use '%(singular.initCap)s', and then define 'singular = "part"' in
the feature. This would result in you getting the same method names as
shown above. (Using '%(singular.upper)s' would result in 'addTextPART',
etc. See peak.model.method_exporter.nameMapping for details.)
Anyway, all of this is really only useful if you need to reuse the method
templates for another collection. In your app, the effort is wasted except
as a learning experiment in using MethodExporter techniques. Note that you
could have, with less effort, simply written 'addTextPart()',
'addImagePart()', etc. methods in the body of the element class, using
plain ol' Python and having them call 'self.addParts(part)' to do the
actual addition, or 'self.__class__.parts.add(part)' if you wanted to
bypass the exported 'addParts()' method for some reason.
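
For comparison, the "plain ol' Python" route might look roughly like the sketch below. The 'Part' and 'Message' classes are invented for illustration, and 'addParts()' here is an ordinary method standing in for the verb that the 'parts' feature would export.

```python
class Part:
    """Hypothetical MIME part; 'kind' is just a label for this sketch."""

    def __init__(self, kind, payload):
        self.kind = kind
        self.payload = payload


class Message:
    """Hypothetical element class with hand-written 'add' verbs."""

    def __init__(self):
        self.parts = []

    def addParts(self, part):
        # Stand-in for the exported 'addParts()' verb of the 'parts' feature.
        self.parts.append(part)

    def addTextPart(self, text):
        part = Part("text", text)
        self.addParts(part)
        return part

    def addImagePart(self, data):
        part = Part("image", data)
        self.addParts(part)
        return part


msg = Message()
msg.addTextPart("hello")
msg.addImagePart(b"\x89PNG")
```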
* I recommend the idiom 'isChangeable = False' for clarity, in place of
using '0' for false. I'm also curious whether the status fields should
really be represented as derived attributes or methods instead of as
read-only attributes, but I don't know/remember enough about IMAP to give
an informed opinion.
* Cool example of the use of the EntityDM.preloadState() in
acmgr.connection.imap.IMAPFolderACLQuery! For those of you who don't
know what that is/what it's for, 'preloadState()' allows you to speed
loading of objects when a QueryDM is loading data that could be used as the
object states. In this case, it's being used to ensure that the data an
ACL query loads is used to create and cache the ACL objects in the ACL DM.
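
Here is a stripped-down illustration of that idea in plain Python, not the acmgr or PEAK code. 'ACL', 'ACLDM', 'ACLQuery', the 'loads' counter and the dict used as a fake server are all assumptions of the sketch, and a real EntityDM hands out ghosts and loads them lazily instead of loading inside __getitem__.

```python
class ACL:
    def __init__(self, oid, state):
        self.oid = oid
        self.state = state


class ACLDM:
    """Hypothetical entity DM: caches objects by oid."""

    def __init__(self, backend):
        self.backend = backend      # dict standing in for the IMAP server
        self.cache = {}
        self.loads = 0              # counts per-object round trips

    def __getitem__(self, oid):
        ob = self.cache.get(oid)
        if ob is None:
            self.loads += 1         # one round trip per uncached object
            ob = self.cache[oid] = ACL(oid, self.backend[oid])
        return ob

    def preloadState(self, oid, state):
        # Reuse state that a query already fetched; no round trip needed.
        ob = self.cache.get(oid)
        if ob is None:
            ob = self.cache[oid] = ACL(oid, state)
        return ob


class ACLQuery:
    """Hypothetical query DM that hands its results to the entity DM."""

    def __init__(self, dm):
        self.dm = dm

    def __getitem__(self, folder):
        # One bulk "query" against the fake server...
        rows = [(oid, state) for oid, state in self.dm.backend.items()
                if oid[0] == folder]
        # ...whose rows become cached objects in the entity DM.
        return [self.dm.preloadState(oid, state) for oid, state in rows]


backend = {("INBOX", "alice"): "lrs",
           ("INBOX", "bob"): "lrswi",
           ("Sent", "alice"): "lr"}
dm = ACLDM(backend)
acls = ACLQuery(dm)["INBOX"]
```

After the query runs, asking the DM for one of those ACLs returns the cached object without another trip to the server.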
* I would like to caution against the excessive use of '../foo/bar' type
name paths. I prefer to keep these infrequent for components that are
potentially more separable and reusable, as most of the connection.imap
module is. I'd suggest that instead you use 'connection =
binding.bindTo(IIMAPConnection)', for example, so that you simply connect
to the nearest available IMAPConnection. In this way, somebody who wants
to reuse your component can simply drop an instance of it into their
context, without having to hard-wire/override the 'connection' attribute of
your component. The object that owns the connection simply does (for example):
    somethingIdontCareWhatItsCalled = binding.bindTo('somethingThatGetsIMAP',
        provides=IIMAPConnection)
Notice that the contained component doesn't have to care whether the
connection is named 'connection' or not. This principle can and probably
should be applied to most of the other components you are linking to with
'bindToParent()' and/or 'bindTo("../something")'. If interfaces are too
much trouble to define because you don't need them for anything else, you
can always use property names instead, e.g.:
    # client
    dm = binding.bindToProperty('acmgr.IMAP.FolderDM')

    # provider
    myFolderDM = binding.New(IMAPFolderDM,
        provides=PropertyName('acmgr.IMAP.FolderDM'))
And finally, you can always use implicit acquisition
(bindTo('someNameWithoutDotsAtTheFront')), as long as you choose relatively
unique names. Hardwired relative paths are really only appropriate for
very tightly coupled components. Because if you use them, your components
*will* be coupled rather tightly whether you meant them to be or not.
Also, using interfaces or property names is much clearer as to intention
than most other binding types. If you bind to an interface, a reader then
*knows* that you intend the attribute should implement the interface. He
also knows that he need only have something declared as that interface
within the context of the component, in order to use it.
You may wonder how to deal with something like your
'../folderACLDM/listACLs' binding; I would do this:
    class IFolderACLDM(storage.IDataManager):
        #...other stuff
        listACLs = Interface.Attribute("an IIMAPFolderACLQuery instance")

    ACLs = binding.bindTo(IFolderACLDM)
and then simply refer to 'self.ACLs.listACLs[folder]', which is close
enough to the Law of Demeter for me. If you want to be particular, or
really can't afford that extra attribute lookup on every call, you can just
add:
    listFolderACLs = binding.bindTo('ACLs/listACLs')
and then do as you have done from there. Notice here that the path is
local to the object, so a reader can understand this code
locally. Probably it should really be './ACLs/listACLs' to make it crystal
clear that it's a local reference. I'm usually just too lazy to type those
extra two keystrokes when the reference is within a few lines of the
binding for the named attribute...
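
If it helps to picture what "connect to the nearest available IMAPConnection" means, here is a toy model of the lookup in plain Python. It is not how the binding package is implemented; 'Component', 'provide()' and 'find()' are invented for the sketch, and the interface is just an empty marker class.

```python
class IIMAPConnection:
    """Hypothetical marker standing in for an interface."""


class Component:
    """Hypothetical component that knows its parent and what it provides."""

    def __init__(self, parent=None):
        self.parent = parent
        self.provided = {}          # interface -> object providing it

    def provide(self, iface, ob):
        self.provided[iface] = ob
        return ob

    def find(self, iface):
        # Walk outward from this component until something provides 'iface'.
        comp = self
        while comp is not None:
            if iface in comp.provided:
                return comp.provided[iface]
            comp = comp.parent
        raise LookupError("nothing in context provides %s" % iface.__name__)


app = Component()
conn = app.provide(IIMAPConnection, object())   # the owner's naming is its own business
folderDM = Component(parent=Component(parent=app))
```

The contained component never mentions a path or an attribute name, so it can be dropped into any context that provides the interface somewhere above it.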
* Nice style on the 'stateForMessage()' method; by that I mean giving the
DM a common way to extract an object's state, so that the method can be
shared by query DM's that collaborate with it. I'll have to remember that
trick to use myself! Maybe even write it up as a standard technique.
* You might want to create a "TxnUnsafeDM" base class (or maybe I should
add one to PEAK!) to use as a base for your EntityDM's. The idea would be
that it would raise an error on transaction abort if 'saved' were
non-empty, as we discussed on the list earlier this week. Right now, if a
transaction rollback is attempted following a flush() or commit attempt,
there will be no indication that the transaction partially committed.
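
A very rough sketch of what I have in mind, in plain Python and without any of the real transaction machinery. The method names, the 'dirty' list and the 'PartialCommitError' exception are placeholders for the sketch, not anything PEAK defines today.

```python
class PartialCommitError(Exception):
    """Hypothetical error: some writes happened and can't be undone."""


class TxnUnsafeDM:
    """Hypothetical base for DMs whose backing store has no rollback."""

    def __init__(self):
        self.dirty = []     # changed, but not yet written
        self.saved = []     # already written to the external store

    def register(self, ob):
        self.dirty.append(ob)

    def flush(self):
        # Writing to a non-transactional store can't be undone later.
        self.saved.extend(self.dirty)
        self.dirty = []

    def abortTransaction(self):
        saved, self.saved = self.saved, []
        self.dirty = []
        if saved:
            raise PartialCommitError(
                "%d object(s) were already written and can't be rolled back"
                % len(saved))


dm = TxnUnsafeDM()
dm.register("folder A")
dm.abortTransaction()       # nothing written yet, so this is silent
```

An abort before any flush stays quiet, while an abort after a flush complains loudly instead of pretending the rollback worked.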
* Speaking of flush(), I don't recall seeing any of your IMAP DM's calling
flush() on the EntityDM they get their records for during _load(). This
means that if a user (for example) adds a folder during a transaction, and
then looks at the parent folder's child list, they will not see the new
folder unless the parent's folder list had been viewed before the add
occurred. In general, it's best to have QueryDM's flush their
corresponding EntityDM's before executing a query against external state.
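
As a toy illustration of the problem and the fix (plain Python, with a dict pretending to be the IMAP server; 'FolderDM', 'ChildListQuery' and 'add()' are made up for the sketch):

```python
class FolderDM:
    """Hypothetical entity DM that queues writes until flush()."""

    def __init__(self, server):
        self.server = server        # dict: parent folder -> child names
        self.pending = []

    def add(self, parent, name):
        self.pending.append((parent, name))

    def flush(self):
        for parent, name in self.pending:
            self.server.setdefault(parent, []).append(name)
        self.pending = []


class ChildListQuery:
    """Hypothetical query DM listing a folder's children."""

    def __init__(self, server, entityDM):
        self.server = server
        self.entityDM = entityDM

    def __getitem__(self, parent):
        # Push pending writes out first, so the query sees them.
        self.entityDM.flush()
        return list(self.server.get(parent, []))


server = {"INBOX": ["old"]}
folders = FolderDM(server)
children = ChildListQuery(server, folders)
folders.add("INBOX", "new")
```

Without the flush() call in __getitem__, the query would return only the folders the server already knew about.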
* I notice that overall you've emulated the style of PEAK itself, with
regards to naming and structuring conventions. Please make sure that this
is appropriate for your application; I'm not saying it's not, mind you, but
I've done little application development with PEAK as yet and don't know
how *I* feel about using the same style. I expect that when I do
application code, for example, I will be more likely to use MixedCaseNames
for modules, because brevity will be less important than clarity for an app
versus a framework. I also expect to use a flatter package hierarchy than in
PEAK, because most apps aren't as big as PEAK! And I expect there to be
many other such style differences, either more or less subtle than the
preceding. So, I encourage everyone to be diverse in their styles, so that
people don't get the impression that if you use PEAK, your code's got to
look like PEAK. :)
>message-listing of a folder is a QueryLink to a MessageListQuery
>that retrieves the ids of the messages in this folder.
>
>i tried 2 implementations:
>
> 1. list only the ids and initialize the PersistentQuery with
> [ dm[id] for id in message_ids ]
>
> 2. list the ids and fetch all headers and then initialize the
> Persistent Query with
> [dm.preloadState(<oid>, <somestate>) for xx in message_ids ]
>
> the first solution gives back a list of message-objects fast
> and allows using slices of the list (without knowing the content)
> but it is fairly slow because it needs to contact the imap-server
> for every single mail to fetch the header-info (dm._load(xxx))
>
> the second solution needs some time to get loaded but has
> preloaded objects with message-headers ready for listing.
> the second solution is good for mailboxes with up to 1000 messages,
> the first solution is better for mailboxes with more messages.
>
> possible solution: create a policy that switches the implementation
> based on the message-count.
That's perfectly acceptable, as is the use of a configuration property to
set the threshold count, so that users can easily change it very high or
very low if it doesn't match their expectations...
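
Such a policy could be as small as the following sketch. It is plain Python with a fake DM so it can run on its own; 'listMessages', 'fetchHeaders', 'FakeDM' and the 'threshold' argument are names made up here, and the default of 1000 simply echoes the figure you mentioned. In a real app the threshold would come from a configuration property.

```python
class FakeDM:
    """Hypothetical stand-in for the message DM; returns simple tuples."""

    def preloadState(self, oid, state):
        return (oid, state)         # object created with its state in hand

    def __getitem__(self, oid):
        return (oid, None)          # ghost; state is loaded later on demand


def listMessages(dm, messageIds, fetchHeaders, threshold=1000):
    """Pick a loading strategy based on the size of the mailbox."""
    if len(messageIds) <= threshold:
        # Small mailbox: one bulk header fetch, then preload every object.
        headers = fetchHeaders(messageIds)
        return [dm.preloadState(oid, headers[oid]) for oid in messageIds]
    # Big mailbox: hand back ghosts and let each one load itself when used.
    return [dm[oid] for oid in messageIds]


def fetchHeaders(messageIds):
    # Pretend bulk fetch; a real one would talk to the IMAP server.
    return dict((oid, "header-%d" % oid) for oid in messageIds)


small = listMessages(FakeDM(), [1, 2, 3], fetchHeaders, threshold=5)
large = listMessages(FakeDM(), [1, 2, 3], fetchHeaders, threshold=2)
```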
>----------------------------------------------------------------
>SIEVE (a mail-filtering-language for cyrus-imap):
>- NamingProvider for SIEVEConnections
>- SIEVE Elements (SieveScript) -> Model
>- Datamanagers (EntityDM's for access, QueryDM for scriptlisting
> LazyLoader for Script-Download)
>
>----------------------------------------------------------------
>LDAP-Account (work has just started and is not yet ready):
>- Basic Model (User)
>- UserDM for load/save/new of User-Objects.
>
>open questions:
>(a lot) but the most important:
>
>what is the oid for the ldap-stored objects -> the DN ??
>the DN is the only unique identifier for objects in ldap.
The only time I'd use anything *other* than the DN as an LDAP oid, is if I
had a restricted set of types of objects I was retrieving, and I didn't
need to reference anything polymorphically (i.e., without knowing its type,
and therefore what keys to retrieve it by).
Note that although DN is the only thing guaranteed by LDAP itself to be
unique, it's perfectly acceptable to declare restrictions on the schema
required by a particular application. For example, I have an app that uses
an attribute in LDAP that is required to be unique. (The LDAP server has
an extension that enforces it, but it could also be enforced by the
applications using the server.)
>if we use the DN, we need a way to build the DN when creating
>objects. we thought of using a similar way as in our IMAP solution:
>e.g. an User-Object should be stored in a OrganizationalUnit,
>then we would for example do this:
><begin> ...
>ou = ouDM['ou=people,dc=net-labs,dc=dev']
>user = userDM.newItem()
>user.cn = 'Some User'
>user.sn = 'User'
>user.givenName = 'Some'
>ou.addChild(user)
><commit> ...
>
>the newly created user-object would be stored using the following _p_oid:
>def _new(self, ob):
>
> oid = 'cn=%s, %s' % (ob.cn, self.ouDM.oidFor(ob.parent))
>
>where ou.child is a model.Collection
>and user.parent is a model.Attribute with proper referencedEnd declarations.
>
>and there comes another question what should "def _thunk(self, ob)" do ??
>
>if i would say in the above example:
>
> oid = 'cn=%s, %s' % (ob.cn, self.oidFor(ob.parent))
> ^^^^^^^^^^
>then i need to implement the userDM._thunk(self, ob) method.
>
>any hints for this one ??
Your _thunk() should check whether the object's DM is of the right type
(i.e. supports a compatible interface) and that it is using the same LDAP
connection, and if so, return the object's _p_oid. If not, it should
complain of a cross-database reference error. (Unless you want to
implement referrals, in which case _thunk() should search the LDAP db for a
referral to the target object, and return the referral's DN, or if not
present, create a referral entry in the local LDAP db, and point it to the
other object.) Does that make sense? _thunk() is a hook for doing
cross-database references. In this case, you don't really want or need a
cross database reference, so all you want to do is verify that this isn't
really a cross-database reference, and that it's therefore acceptable to
use the same oid.
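
In outline, the non-referral version might look something like this. It is a plain-Python sketch, not tested against PEAK: the 'LDAPDM' and 'Record' classes and the 'CrossDBReferenceError' name are invented, and I'm assuming the object keeps a reference to its DM in '_p_jar' alongside its '_p_oid'.

```python
class CrossDBReferenceError(Exception):
    """Hypothetical error for references that cross databases."""


class Record:
    """Hypothetical persistent object: remembers its DM and its oid."""

    def __init__(self, dm, oid):
        self._p_jar = dm
        self._p_oid = oid


class LDAPDM:
    """Hypothetical DM keyed by DN, tied to one LDAP connection."""

    def __init__(self, connection):
        self.connection = connection

    def oidFor(self, ob):
        if ob._p_jar is self:
            return ob._p_oid
        return self._thunk(ob)      # object belongs to some other DM

    def _thunk(self, ob):
        other = ob._p_jar
        # Same kind of DM on the same connection: the DN is valid here too.
        if isinstance(other, LDAPDM) and other.connection is self.connection:
            return ob._p_oid
        raise CrossDBReferenceError(
            "%r lives in a different database" % (ob._p_oid,))


conn = object()                     # stands in for a shared LDAP connection
ouDM, userDM = LDAPDM(conn), LDAPDM(conn)
ou = Record(ouDM, "ou=people,dc=net-labs,dc=dev")
parentDN = userDM.oidFor(ou)
```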
Note that you could also get away from this issue by using a single LDAP_DM
which looked at the 'objectclass' field of records it retrieved, in order
to determine what class to assign them. This is the route I plan to go
with my own apps, because I need the ability to reference objects by DN
regardless of type, in order to implement certain LDAP features (e.g. the
'seeAlso' field, which doesn't tell you what kind of object you're supposed
to "see also") and also to support cross-db references from relational
databases pointing to LDAP objects.
Anyway, to take that approach, just override the '_ghost()' method, and if
the 'state' argument isn't supplied, load the 'objectclass'
attribute. Then return an instance of the appropriate type, without
loading its state unless it was supplied by the caller. Voila.
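
Roughly like so, in a plain-Python sketch where a dict plays the LDAP server. The 'classFor' mapping, the 'User' and 'OrgUnit' classes and the 'state' attribute are made up for illustration, and real entries have a multi-valued 'objectclass', which the sketch ignores.

```python
class User:
    pass


class OrgUnit:
    pass


class LDAP_DM:
    """Hypothetical single DM serving every type of LDAP entry."""

    classFor = {"person": User, "organizationalUnit": OrgUnit}

    def __init__(self, directory):
        self.directory = directory      # dict: DN -> attribute dict

    def _ghost(self, oid, state=None):
        if state is None:
            # Fetch only what is needed to choose the class.
            objectclass = self.directory[oid]["objectclass"]
        else:
            objectclass = state["objectclass"]
        ob = self.classFor[objectclass]()
        ob._p_oid = oid
        ob.state = state                # None means "not loaded yet"
        return ob


directory = {
    "ou=people,dc=net-labs,dc=dev":
        {"objectclass": "organizationalUnit", "ou": "people"},
    "cn=Some User,ou=people,dc=net-labs,dc=dev":
        {"objectclass": "person", "cn": "Some User"},
}
dm = LDAP_DM(directory)
ou = dm._ghost("ou=people,dc=net-labs,dc=dev")
```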
If you want to have DM's that specialize in a particular application-domain
type, just use a FacadeDM that adapts its keys to the target. For example,
you could have a FacadeDM for users that looks them up by login name, by
doing a query and returning the LDAP_DM.preloadState() of the found
data. The object's oid is still its DN, of course, and so can be used to
key into the LDAP_DM in future.
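
The shape of such a facade, again as a plain-Python sketch and not the actual FacadeDM API: 'UserFacadeDM', the 'search' callable and the dict-based "objects" are stand-ins, and the example DN is invented.

```python
class LDAP_DM:
    """Hypothetical DN-keyed DM with a cache and a preloadState() hook."""

    def __init__(self):
        self.cache = {}

    def preloadState(self, oid, state):
        ob = self.cache.get(oid)
        if ob is None:
            ob = self.cache[oid] = dict(state, dn=oid)
        return ob


class UserFacadeDM:
    """Hypothetical facade: looks users up by login name."""

    def __init__(self, ldapDM, search):
        self.ldapDM = ldapDM
        self.search = search        # callable: login -> (dn, attrs) or None

    def __getitem__(self, login):
        found = self.search(login)
        if found is None:
            raise KeyError(login)
        dn, attrs = found
        # The object still lives in the LDAP_DM, keyed by its DN.
        return self.ldapDM.preloadState(dn, attrs)


entries = {"uid=jdoe,ou=people,dc=example,dc=com": {"uid": "jdoe"}}


def search(login):
    # Pretend LDAP search on the 'uid' attribute.
    for dn, attrs in entries.items():
        if attrs["uid"] == login:
            return dn, attrs
    return None


users = UserFacadeDM(LDAP_DM(), search)
jdoe = users["jdoe"]
```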
Anyway, in this scenario there's no need for _thunk() because all the
'oidFor()' calls go to the same DM that's asking for the oid.
>i'm pretty impressed that a non trivial thing like accessing
>IMAP in a nice python-object-tree is fairly simple using
>peak.storage and it seems to have a really good performance
>due to its load-on-demand design.
>
>i like it very much :)))
Thank you for having the courage to "eat my dogfood", so to speak,
especially on the rapidly-changing peak.model stuff. (At least I am almost
done with the refactoring!)
I hope to start work on a detailed 'peak.model' tutorial soon, using
'graphviz' as a basis for a long-running example, starting with a simple
model of 'Node', 'Edge' and 'Graph' classes, building up to having
'Styles', using model.Enumerations for things like the colors, shapes,
arrowheads, etc. The ultimate goal will be to have an example app that
reads a UML model and generates an inheritance diagram or dependency
diagram or something like that. Intermediate results will be things like a
similar diagram generated from Python classes, and maybe even from the
application's own domain model!
I think I'll be able to find lots of motivating examples to show framework
extension techniques like adding domain-specific metadata to classes,
defining specialized types (like my OddNumber example way up earlier in
this e-mail), and so on. And, I'd like to be able to use the resulting
library for some things myself. :)
I can already see a lot of how it will end up in the complex version, but
if I build it up from very small parts, I'll get to share my design thought
process in the tutorial, showing why you might do things different ways
with different PEAK capabilities. In the process, I expect the tutorial
will also showcase selective use of binding, naming, config, and even
'util' features like 'IndentedStream', but the main focus will be on
'peak.model'. The discussions we've had lately on the mailing list, and
the work I've been doing with the metamodels package and the model package
refactoring, have made it clear that there's some very good stuff here but
if I don't do a *lot* of good quality documentation, people will barely
scratch the surface of its capabilities.