SqlDm is an object-relational mapper for SQL databases. It's based on peak.storage and peak.model and is an evolution of [http://cvs.net-labs.de/cgi-bin/viewcvs.cgi/*checkout*/libs/nll/src/nll/database/sqldm/datamanager.py?rev=1.10&content-type=text/plain some code] [http://www.eby-sarna.com/pipermail/peak/2003-October/000778.html Ulrich Eck posted to the mailing list].
 
= Development Status =
You can use it. It works great.
 
SQLEntityDM and SQLQueryDM work: inserting, updating, and deleting rows are all exercised by my production code. In addition, the API should be pretty stable, at least until the model-metadata metamorphosis hinted at in the code's TODOs, if that ever happens. I have a dozen SQLEntityDM subclasses (with more on the way) to maintain now, so future changes will be considered with commensurate gravity. I hereby dub this '''version 0.1'''. As always, this code has no warranty and no guarantees.
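
For orientation, here's a rough usage sketch. It assumes a ContactDM wired up like the one in the module docstring below, a component context (called ctx here) that can reach it, and PEAK's standard transaction helpers; treat it as illustrative rather than copy-paste-ready.
{{{
from peak.api import storage

# `ctx` and the 'contactDM' component name are assumptions for this sketch,
# not part of the package.
contactDM = ctx.lookupComponent('contactDM')

storage.beginTransaction(ctx)
contact = contactDM[42]           # __getitem__ lets us say 42 instead of (42,)
contact.city = 'Boston'           # _save() turns this into an UPDATE at commit

newContact = contactDM.newItem()  # a fresh Contact; _new() INSERTs it at commit
newContact.name = 'Example, Inc.'
storage.commitTransaction(ctx)

storage.beginTransaction(ctx)
contactDM.delItem(contact)        # issues the DELETE right away and uncaches it
storage.commitTransaction(ctx)
}}}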
 
I still intend to have a Trac+svn set up for this someplace before long; I'll post a link when I do.

 
Thanks for your thoughts!
 
This code now comes in a package, though for the sake of easy diffs I'm pasting the latest datamanager.py below. If you want to use this code, [attachment:sqldm.zip get the package].
 
= History =
2005-10-11 (r943). Goal: API stability.
 * EntityDM no longer blows up if you don't have any readOnlyFields.
 * Removed _thunk(), which was unnecessary.
 * Fixed some spelling, spacing, and style, and improved comments.
 
= The Code =
{{{
# NetLabsLibs -- a Python Framework for Business Objects
#
# Copyright (C) 2003 net-labs Systemhaus GmbH
# info@net-labs.de
#
# This library is free software; you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation; either
# version 2.1 of the License, or (at your option) any later version.
#
# This library is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public
# License along with this library; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
#
#
# Outdated CVS tags:
# $Header: /data/cvs/libs/nll/src/nll/database/sqldm/datamanager.py,v 1.10 2003/06/03 21:08:34 jack Exp $
# $Revision: 1.10 $
#
# TODO:
# Replace deprecated calls:
# * dm.__getitem__ ==> dm.get
# * newItem ==> add
# * delItem ==> remove
# Dispense with `fields` attr and probably the rest of them too; use binding.metadata facilities instead.
#
# See #peak IRC logs from 2005-08-10.
"""
Simple, PyDO-like object-relational map based on peak.storage
 
Example of use:
 
class ContactDM(SQLEntityDM):
 
    # Bound Datamanagers
    contactTypesDM = binding.Obtain('contactTypesDM')
    contactGroupsDM = binding.Obtain('contactGroupsDM')
 
    # Local QueryDM Instances
    listContactPersonsFor = binding.Make(listCPForContactQueryDM)
 
    ## Configuration:
    
    # ElementClass to be used for this EntityDM
    defaultClass = Contact
    
    # Table name in DB
    table = 'contact'
    
    # Fields to read from the DB, mapped to the feature name to expose them as, in
    # (('COLUMN', 'FEATURE'),) format. If you want it read but not mapped (e.g., if
    # it's a foreign key), put None into the second element of the tuple.
    fields = ('id', # equivalent to ('id', 'id', )
              ('contact_type_id', None, ), # TODO: Should become Field('contact_type_id', mapToFeature=None)
              'name', # TODO: Should become Field('name')
              'street',
              ('zip_code', 'zipCode', ), # TODO: Should become Field('zip_code', mapToFeature='zipCode')
              'city',
              'state',
              'country',
              ('contact_group_id', None, ),
             )
    
    # Specify a unique index:
    # The primaryKey fields must also be in either sequenceSpec or
    # autoIncrementedField, because otherwise, I have no way to determine the
    # primary key of a newly-inserted row.
    primaryKey = ('id', )
    
    # Specify which fields we shouldn't try writing to the DB (but should rather
    # let the DB figure out itself). We shouldn't try writing them even when
    # inserting a new row.
    # (TODO: Does PEAK already provide a way to figure out from the model what's
    # read-only? If so, we can perhaps stop specifying it here.)
    # In this example, id is an autoincrement field, incremented by the DB.
    # Another possible use of this would be for DB-maintained timestamp fields.
    # TODO: This should become a keyword arg to the Field constructor, as above.
    readOnlyFields = ('id',)
    
    # Specify fields automatically filled from sequences (FIELD, SEQUENCE):
    # TODO: rename this for consistency
    # TODO: This should become a keyword arg to the Field constructor, as above.
    sequenceSpec = (('id', 'seq_contact_id'), )
    
    # If (1) your DB doesn't support sequences AND (2) you are using a DB-generated
    # autoincrementing field as your primary key (or part of it), set
    # autoIncrementedField to the name of that field:
    # In this example, my DB supports sequences, so I put the 'id' field in
    # sequenceSpec. If I were using MySQL or MS SQL Server (which don't support
    # sequences), I would have set autoIncrementedField = 'id' and left
    # sequenceSpec blank.
    # TODO: This should become a keyword arg to the Field constructor, as above.
    # There should be only one Field allowed with isAutoIncremented=True.
    autoIncrementedField = None
    
    def _otherStateForRow(self, row):
        state = {}
        state['contactType'] = self.contactTypesDM[(row['contact_type_id'],)]
        state['contactGroup'] = self.contactGroupsDM[(row['contact_group_id'],)]
        state['contactPersons'] = storage.QueryLink(self.listContactPersonsFor[(row['id'],)])
        return state
 
    # _otherDbValuesForElement() collects the necessary information
    # to be able to save the Element's state into the DB
    contactTypeDM = binding.Obtain(storage.DMFor(ContactType))
    contactGroupDM = binding.Obtain(storage.DMFor(ContactGroup))
    def _otherDbValuesForElement(self, ob):
        values = {}
        values['contact_type_id'] = self.contactTypeDM.oidFor(ob.contactType)[0]
        values['contact_group_id'] = self.contactGroupDM.oidFor(ob.contactGroup)[0]
        return values
"""
 
from peak.api import *
from peak.binding.once import getInheritedRegistries
from peak.storage.interfaces import InvalidKeyError
from protocols import adapt
 
from sqldm.interfaces import *
 
 
__all__ = ['SQLEntityDM', 'SQLQueryDM', 'ParamlessQueryDM']
 
 
 
# This should get mostly or completely turned into binding calls in SQLEntityDM.
class SQLEntityDMMeta(storage.EntityDM.__class__): # XXX Originally inherited from ActiveClass as well.
    """Metaclass that builds the fields, fieldNames, and indexes attributes for a DM and stores them on the class for later use"""
    
    # XXX
    # TODO Put _ in front of the vars we don't mean to make public. I already access featureFieldMap from the outside.
    def __init__(self, className, bases, classDict):
        super(SQLEntityDMMeta, self).__init__(className, bases, classDict)
        
        
        # inspect classDict for "fields" and "primaryKey" attribute
        # and setup class attributes
        
        fields = []
        for f in getInheritedRegistries(self, '_fields'):
            fields.extend(f)
        
        classFields = list(classDict.get('fields', [])) # classDict.fields being a tuple, it'll be copied by value
        # Let 'someFieldName' be a shortcut for ('someFieldName', 'someFieldName'):
        for (i, curField) in enumerate(classFields):
            if not isinstance(curField, tuple):
                classFields[i] = (curField,) * 2
        # Someday we may do the same for primaryKey. Maybe we should do this with adaptation or something instead.
        
        fields.extend(classFields)
        self._fields = tuple(fields)
        self.fieldNames = tuple([f[0] for f in fields]) # TODO: Name this self.fields or something for consistency with self.table.
        
        fieldMap = []
        referencingColumns = []
        for f in fields:
            if f[1] is None:
                referencingColumns.append(f[0])
            else:
                fieldMap.append(f)
        
        # Make fieldFeatureMap something like this:
        # {'number_of_teeth': 'numberOfTeeth', 'id': 'id', 'name': 'name'}
        # It apparently maps DB field names back to their model feature names.
        self.fieldFeatureMap = dict(fieldMap)
        
        # And map in the other direction for featureFieldMap:
        self.featureFieldMap = dict([(f[1], f[0]) for f in fieldMap])
        
        self.referencingColumns = tuple(referencingColumns)
        
        
        indexes = []
        for i in getInheritedRegistries(self, 'indexes'):
            indexes.extend(i)
        indexes.extend(classDict.get('primaryKey', []))
        self.indexes = tuple(indexes)
        
        
        sequences = []
        for s in getInheritedRegistries(self, 'sequences'):
            sequences.extend(s)
        sequences.extend(classDict.get('sequenceSpec', []))
        self.sequences = tuple(sequences)
        
        self.sequenceMap = dict(sequences)
        self.sequencedColumns = tuple([c for c,s in sequences])
 
        mappedFields = self.fieldFeatureMap.keys()
        for index in self.indexes:
            if (not index in mappedFields) and (not index in referencingColumns):
                raise ValueError, 'Wrong Index Specification: %s' % index
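
        # For the ContactDM example in the module docstring, this works out to
        # roughly:
        #   fieldNames         = ('id', 'contact_type_id', 'name', 'street',
        #                         'zip_code', 'city', 'state', 'country',
        #                         'contact_group_id')
        #   fieldFeatureMap    = {'id': 'id', 'name': 'name', 'zip_code': 'zipCode', ...}
        #   referencingColumns = ('contact_type_id', 'contact_group_id')
        #   indexes            = ('id',)
        #   sequenceMap        = {'id': 'seq_contact_id'}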
 
 
 
class SQLEntityDM(storage.EntityDM):
    """SQL Datamanager configurable through attributes
    
    Inspired by PyDO
    
    Open Issues:
     * using Sequences requires support from Connection
     * using parameterized Queries requires support from Connection
    """
 
    __metaclass__ = SQLEntityDMMeta
 
    protocols.advise(instancesProvide=[ISQLEntityDM])
 
 
    # ISQLEntityDM attributes
    connection = binding.Obtain(storage.ISQLConnection)
    table = None
    fields = ()
    primaryKey = ()
    sequenceSpec = {} # Should this be [] or () instead? --erikrose
    readOnlyFields = ()
    autoIncrementedField = None
    # All the field names that are in fields but not in readOnlyFields.
    # O(n^2), but happens only once per class. Fix it if you like.
    # TODO: Make it so you don't have to access this uglyly, like
    # self.__class__.writableFields. Maybe just self.writableFields =
    # self.__class__.writableFields.
    writableFields = binding.classAttr(binding.Make(lambda self: tuple([x for x in self.fieldNames if x not in self.readOnlyFields])))
    
    
    # Pieces of SQL. We can compute these once and keep them cached, since we're using bind variables and just subbing the data into the query string.
    
    @binding.Make
    def _sqlWhereClause(self):
        """Return a SQL "where" clause that, within the scope of the table `self.table`, finds the row I'm custodian of. Return value starts with a space (or something else that will let it be appended harmlessly to other SQL fragments)."""
        keyTuple = zip(self.indexes, self.getBindVariables(self.indexes))
        return ' where %s' % ' and '.join(['%s=%s' % (k, v) for k,v in keyTuple])
    
    def __sqlForSelecting(self, selectingWhat):
        """Return the SQL for selecting the fields named in the list `selectingWhat` from the row found by `self._sqlWhereClause` in table `self.table`."""
        return 'select %s from %s%s' % (', '.join(selectingWhat), self.table, self._sqlWhereClause)
    
    @binding.Make
    def _sqlForSelect(self):
        return self.__sqlForSelecting(self.fieldNames)
    
    @binding.Make
    def _sqlForSelectingReadOnlyFields(self):
        return self.__sqlForSelecting(self.readOnlyFields)
    
    @binding.Make
    def _sqlForInsert(self):
        return 'insert into %s (%s) values (%s)' % (self.table, ', '.join(self.__class__.writableFields), ', '.join(self.getBindVariables(self.__class__.writableFields)))
    
    @binding.Make
    def _sqlForUpdate(self):
        fieldTuple = zip(self.__class__.writableFields, self.getBindVariables(self.__class__.writableFields))
        return 'update %s set %s%s' % (self.table, ', '.join(['%s=%s' % (k, v) for k,v in fieldTuple]), self._sqlWhereClause)
    
    @binding.Make
    def _sqlForDelete(self):
        return 'delete from %s%s' % (self.table, self._sqlWhereClause)
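
    # For the ContactDM example (on a connection with the 'format' paramstyle),
    # these cached fragments come out roughly like:
    #   _sqlWhereClause: " where id=%s"
    #   _sqlForSelect:   "select id, contact_type_id, name, ... from contact where id=%s"
    #   _sqlForInsert:   "insert into contact (contact_type_id, name, ...) values (%s, %s, ...)"
    #   _sqlForUpdate:   "update contact set contact_type_id=%s, name=%s, ... where id=%s"
    #   _sqlForDelete:   "delete from contact where id=%s"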
    
    
    # Let us say someSQLEntityDM[99] instead of someSQLEntityDM[(99,)]:
    
    def __getitem__(self, oid, *args, **kwargs):
        if not isinstance(oid, tuple):
            oid = (oid,)
        return super(SQLEntityDM, self).__getitem__(oid, *args, **kwargs)
    
    def get(self, oid, *args, **kwargs):
        if not isinstance(oid, tuple):
            oid = (oid,)
        return super(SQLEntityDM, self).get(oid, *args, **kwargs)
    
    
    # Interface Methods for IWritableDM
 
    def _load(self, oid, ob):
        row = self._loadRow(oid)
        return self.stateForRow(row)
    
    def _save(self, ob):
        """Save `ob` to underlying storage."""
        # Even if _p_* vars go away in PEAK, I don't think it'll be a big deal to add
        # a hook so individual DM subclasses can specify them explicitly or to just
        # change this to reference wherever the "new" PEAK decides to store oids.
        # Actually, it looks like this class doesn't read ob._p_oid until it's written
        # it first (in _new, below), so maybe this will continue to work unmodified.
        # ob._p_jar I don't know about. --erikrose
        oid = ob._p_oid
        self.connection(self._sqlForUpdate, self.formatBindValues(self.dbValuesForElement(ob) + list(oid), self.__class__.writableFields + self.indexes))
    
    #def _defaultState(self, ob):
    # # XXX Fetch default state from SQL-Schema ???
    # return {}
    
    def _new(self, ob):
        """Create `ob` in underlying storage, and return its new oid."""
        
        self._setSequencedFields(ob) # before the INSERT so sequenced fields' values can be inserted
        self.connection(self._sqlForInsert, self.formatBindValues(self.dbValuesForElement(ob), self.__class__.writableFields))
        self._setAutoIncrementedField(ob)
        
        oid = self._oidForElement(ob) # Do this _after_ inserting, because our oid could well be determined by one of the readOnlyFields or by the autoIncrementedField.
        self._setReadOnlyFields(oid, ob)
        return oid
    
    
    # Delete Item
    
    def delItem(self, ob):
        oid = ob._p_oid
        self.connection(self._sqlForDelete, self.formatBindValues(oid, self.indexes))
        if oid in self.cache:
            del self.cache[oid]
    
    
    # ISQLEntityDM methods
    
    def oidForRow(self, row):
        """Default implementation: the oid of a row is a tuple of the fields that comprise its primary key."""
        return tuple([row[i] for i in self.indexes])
    
    def _oidForElement(self, ob):
        """Return the oid of the Element `ob`.
        
        (Not really an ISQLEntityDM method--if you override _new(), you can ignore it--but I wanted to put it near oidForRow().)
        Default implementation: return a tuple of the values of those attributes of `ob` specified by and parallel to self.indexes. For example, if self.indexes were ('color', 'weight'), return (ob.color, ob.weight).
        
        Pre: `ob` has been flushed to the DB (since only then is it guaranteed to have an oid).
        """
        return tuple([getattr(ob, self.fieldFeatureMap[i]) for i in self.indexes])
    
    def stateForRow(self, row):
        """Default implementation: union of _mappedStateForRow() and _otherStateForRow(), the latter taking precedence in the case of duplicate keys."""
        union = self._mappedStateForRow(row)
        union.update(self._otherStateForRow(row))
        return union
    
    # I don't know if it ought to be done here or not (for one thing, doing it here
    # wouldn't fix querying by an Enum), but someplace we have to rig it so, when a
    # model with an Enumeration in it is saved, the Enumeration's .value (e.g.,
    # 'BLUE') is passed to the DB instead of some useless Python string like
    # 'HatColor.blue'. Maybe a generic function? Maybe doing it manually in
    # _dbValuesWithAddenda()?
    def dbValuesForElement(self, ob):
        """Default implementation: union _mappedDbValuesForElement() and _otherDbValuesForElement(), the latter taking precedence in the case of duplicate keys. Then, convert the resulting dict to a list, ordered by the order of fields in self.__class__.writableFields."""
        union = self._mappedDbValuesForElement(ob)
        union.update(self._otherDbValuesForElement(ob))
        return [union[x] for x in self.__class__.writableFields]
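
    # For ContactDM, writableFields is ('contact_type_id', 'name', 'street',
    # 'zip_code', 'city', 'state', 'country', 'contact_group_id') -- fieldNames
    # minus the read-only 'id' -- so dbValuesForElement() returns the bind values
    # in exactly that order, matching the placeholders in _sqlForInsert/_sqlForUpdate.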
 
    
    # Helper methods intended to be overridden
    
    def _otherStateForRow(self, row):
        """Typical use: map foreign key fields to other DM's.
        
        For example, if I were the DM for a Person Element with a "city" attribute, I might return {'city': self.cityDM[row['cityId']]} (where self.cityDM is a DM that returns a City by id).
        
        If you need more power, forget about this function and just override stateForRow().
        """
        return {}
    
    def _otherDbValuesForElement(self, ob):
        """Typical use: return a dict that, for each reference this Element holds to a foreign Element, maps the foreign key field's name to the foreign id.
        
        For example, if this Element references a Seminar Element with a primary key of 3 and represents that in the DB by putting an integer in my table's seminarId column, I might return {'seminarId': 3}.
        
        If you need more power, forget about this function and just override dbValuesForElement().
        """
        return {}
    
    
    # Helper methods, not typically overridden. These are part of my private API, though, so feel free to call them from subclasses.
    
    def _loadRow(self, oid):
        """Given an oid, fetch the one row from the DB that has the info to populate me. Raise InvalidKeyError if there is no such row.
        
        I broke this off of _load() so you can override _load() as necessary without having to reinvent the wheels this function contains.
        If you need more than one row, forget about this function and just override _load().
        """
        try:
            return ~self.connection(self._sqlForSelect, self.formatBindValues(oid, self.indexes))
        except exceptions.TooFewResults:
            raise InvalidKeyError, oid
    
    def _mappedStateForRow(self, row):
        def getFeatureClass(attrName):
            return getattr(self.defaultClass, attrName, None)
    
        state = {}
        for field, feature in self.fieldFeatureMap.items():
            value = row[field]
            
            # Normalize value based on type-object:
            featureClass = getFeatureClass(feature)
            if featureClass is not None:
                value = featureClass.normalize(value)
            
            state[feature] = value
        return state
    
    def _mappedDbValuesForElement(self, ob):
        """For each field that is mapped to a feature using the facilities of this class, return a dict mapping each field's name to its DB-native value."""
        fieldToFeature = self.fieldFeatureMap.get
        featuresSomeOfWhichAreNone = [fieldToFeature(field) for field in self.__class__.writableFields]
        return dict([(field, getattr(ob, feature)) for (field, feature) in zip(self.__class__.writableFields, featuresSomeOfWhichAreNone) if feature is not None]) # None means "don't map me".
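
    # For ContactDM, this yields {'name': ob.name, 'street': ob.street,
    # 'zip_code': ob.zipCode, ...}; 'contact_type_id' and 'contact_group_id' are
    # skipped because they map to None, so _otherDbValuesForElement() is expected
    # to supply them.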
    
    def _setSequencedFields(self, ob):
        """Find any of the sequencedColumns whose values are NOT_GIVEN, and fill them from their respective sequences."""
    
        def _fillHole(ob, feature, withWhat):
            """If ob.feature exists and is undefined, fill it with the result of calling withWhat()."""
            if feature is not None: # It would be None if it correlated to a field that gets read from the DB but not mapped to a feature, e.g. a foreign key.
                value = getattr(ob, feature, NOT_GIVEN)
                if value is NOT_GIVEN:
                    setattr(ob, feature, withWhat())
                #else:
                # raise ValueError, 'Missing Value for Feature: %s' % feature # TODO maybe still warn about this somewhere; can't do it here because it's okay if readOnlyFields are still NOT_GIVEN at this point
        
        # Use items(); iterating the dict directly would yield only the column names.
        for curField, curSequence in self.sequenceMap.items():
            feature = self.fieldFeatureMap[curField] # Get feature name associated with this column
            _fillHole(ob, feature, lambda: self.connection.getSequenceValue(curSequence))
    
    def _setAutoIncrementedField(self, ob):
        if self.autoIncrementedField:
            autoIncFeature = self.fieldFeatureMap[self.autoIncrementedField]
            if autoIncFeature: # This is None if it's a foreign key or some otherwise read-but-not-mapped-to-a-feature field. TODO: Program more semantically here when we start using keyword args to the Field constructors.
                conn = self.connection
                setattr(ob, autoIncFeature, (~conn(conn.getLastAutoincrementSql()))[0]) # TODO roll this SQL call into the SQL statement _setReadOnlyFields() executes
    
    def _setReadOnlyFields(self, oid, ob):
        """Fill all readOnlyFields from the DB."""
        # TODO?: Don't do this query unless necessary.
        if self.readOnlyFields:
            row = ~self.connection(self._sqlForSelectingReadOnlyFields, self.formatBindValues(oid, self.indexes))
            
            for curField in self.readOnlyFields:
                feature = self.fieldFeatureMap[curField]
                if feature is not None:
                    setattr(ob, feature, row[curField])
    
    def getBindVariables(self, params):
        """Return bind variables suited to the connection's DB-API paramstyle."""
        count = len(params)
        style = self.connection.API.paramstyle

        if style == 'format':
            return ['%s'] * count
        if style == 'qmark':
            return ['?'] * count
        if style == 'numeric':
            # Numeric placeholders are 1-based per the DB-API spec (:1, :2, ...).
            return [':%d' % d for d in range(1, count + 1)]
        if style == 'named':
            return [':p%d' % d for d in range(count)]
        if style == 'pyformat':
            return ['%('+p+')s' for p in params]
        raise ValueError, 'Invalid ParameterStyle defined by dbapi'
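
    # Examples: getBindVariables(('id', 'name')) returns ['%s', '%s'] for 'format',
    # ['?', '?'] for 'qmark', and ['%(id)s', '%(name)s'] for 'pyformat'.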
    
    def formatBindValues(self, values, fieldNames):
        """Return the given values in a format suitable for tossing at the DB adapter as a bunch of values to bind.
        
        values -- a tuple of the values to bind
        fieldNames -- a parallel tuple of the field names of each value
        
        Basically, just convert a tuple to a dict if the DB adapter uses the pyformat paramstyle.
        """
        if self.connection.API.paramstyle == 'pyformat':
            return dict(zip(fieldNames, values))
        else:
            return values
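
    # Example: formatBindValues((5, 'Bob'), ('id', 'name')) returns
    # {'id': 5, 'name': 'Bob'} under 'pyformat' and just (5, 'Bob') otherwise.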
 
 
 
class SQLQueryDMMeta(storage.EntityDM.__class__): # XXX Originally inherited from ActiveClass as well.
    """Metaclass that collects the inherited querySpec entries into a query attribute on the class"""
    
    # XXX
    def __init__(self, className, bases, classDict):
        super(SQLQueryDMMeta, self).__init__(className, bases, classDict)
 
 
        # inspect classDict/bases for "querySpec" attribute
 
        query = []
        for i in getInheritedRegistries(self, 'query'):
            query.extend(i)
        query.extend(classDict.get('querySpec', []))
        self.query = tuple(query)
 
 
 
class SQLQueryDM(storage.QueryDM):
    """Parameterized SQL Query Datamanager configurable through attributes. Preloads queried data into the EntityDM."""
    
    __metaclass__ = SQLQueryDMMeta
 
    protocols.advise(instancesProvide=[ISQLQueryDM])
 
    sqlDM = binding.Require('A configured SQLEntityDM Instance')
 
    # ISQLQueryDM Attributes
    querySpec = []
    
    @binding.Make
    def _sqlForSelect(self):
        dm = self.sqlDM # assigned before whereClause() runs, so the closure sees it
        def whereClause():
            querySpec = self.querySpec
            if querySpec:
                params = dm.getBindVariables([q[0] for q in querySpec])
                where_tuple = zip(querySpec, params)
                return ' where %s' % ' and '.join([q[0] + q[1] + p for q,p in where_tuple])
            else:
                return ''
        return 'select %s from %s%s' % (', '.join(dm.fieldNames), dm.table, whereClause())
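
    # Hypothetical example (names not part of the package): with sqlDM pointing at
    # a contact-person EntityDM and querySpec = (('contact_id', '='),), this would
    # build something like
    #   "select ... from contact_person where contact_id=%s"
    # (given a 'format' paramstyle), so the QueryDM can be keyed by a contact's oid.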
    
    def _load(self, oid, state=None):
        """Return a list"""
        dm = self.sqlDM
        conn = dm.connection
        
        if oid is NOT_GIVEN: # I was called parameterlessly.
            rows = conn(self._sqlForSelect)
        else:
            rows = conn(self._sqlForSelect, dm.formatBindValues(oid, dm.indexes))
        # QueryDM takes care of caching the oid-to-resultset mappings for us.
        
        return [dm.preloadState(dm.oidForRow(row), dm.stateForRow(row)) for row in rows] # preloadState() caches the individual rows in the DM
 
 
 
class ParamlessQueryDM(object):
    """A mix-in to support nice iteration syntax on QueryDM's that don't take any query parameters (i.e. no oid).
    
    For example, this lets you do this...
        for each in myQueryDM:
            print each
    ...instead of this kludge:
        for each in myQueryDM[0] # where 0 is a dummy param and utterly ignored
            print each
    """
    def __iter__(self):
        return iter(self[NOT_GIVEN])
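
    # Hypothetical subclass sketch (not part of the package): mix this in ahead of
    # SQLQueryDM and leave querySpec empty to iterate a whole table, e.g.:
    #
    #   class AllContactsDM(ParamlessQueryDM, SQLQueryDM):
    #       sqlDM = binding.Obtain('contactDM')
    #
    #   for contact in allContactsDM:
    #       print contact.name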
}}}
= Download =
[attachment:sqldm.zip Get the package].
