[PEAK] DB type conversion (for inputs to execute() and callproc())
Phillip J. Eby
pje at telecommunity.com
Tue Dec 16 18:33:44 EST 2003
Currently, peak.storage.SQL supports automatic type conversion on all data
*received from* the database. It can do this because, for a given database
type there is arguably one "best" Python type to render that value as, for
the given application.
However, it does not currently support type conversion for values that are
being *sent to* the database. One must manually render the values as the
correct datatype (using e.g. 'db.API.DateFromTicks()'), and then either
convert them to strings and embed them in SQL, or supply them as parameters
to the cursor.execute() method.
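As a rough sketch of that status quo (the driver module here is a stand-in; real drivers supply 'DateFromTicks()' per the DBAPI spec, and the table in the commented query is made up):

```python
import datetime

# Stand-in for a DBAPI driver module; real drivers (e.g. DCOracle2)
# provide a module-level DateFromTicks() per the DBAPI spec.
class FakeDriverAPI:
    @staticmethod
    def DateFromTicks(ticks):
        return datetime.date.fromtimestamp(ticks)

# Manual conversion: the application must know the right constructor
# and call it itself before handing the value to cursor.execute().
ticks = 1071616424  # seconds since the epoch
db_date = FakeDriverAPI.DateFromTicks(ticks)
# cursor.execute("UPDATE orders SET shipped=? WHERE id=?", (db_date, 42))
```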
But for stored procedures, this will be inconvenient at best. One of the
goals for stored procedures is that they should be able to accept
application types (such as the Python 2.3 'datetime' types included with
PEAK) and convert them to the necessary database type(s).
Therefore, it would seem a conversion mechanism is necessary. Here are the
scenarios I think it might need to cover:
* convert this Python object to "something that works with the DB" in a
general sense (i.e., the specific DB target type is unknown)
* convert to a DBAPI-level type, such as NUMBER, STRING, BINARY, etc.
* convert to a specific DB native type such as NUMBER(10,2) or some such.
The first item is critical. The second may be needed to eliminate
conversion ambiguities for stored procedure arguments. The third item
seems like a YAGNI, especially since correct implementation could be
complex. (And I didn't even list supporting per-connection configuration
for conversions, since the assumption here is that a given Python type
should have a single unambiguous representation for a given DB backend.)
So, what I propose is to create a 'types' attribute on DB connections, such
that e.g. 'db.types.NUMBER' would name the protocol one adapts to in order
to get a "number" value from some Python type. In addition to the DBAPI
types supported by that driver, there would also be an '_ANY_' protocol.
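A minimal pure-Python sketch of the proposed shape (the class names and URI scheme are invented for illustration; the real version would use PyProtocols 'protocolForURI()' objects, not these stand-ins):

```python
# Illustrative stand-in for protocols.protocolForURI() objects.
class _DBTypeProtocol:
    def __init__(self, uri):
        self.uri = uri
    def __repr__(self):
        return "<dbType %s>" % self.uri

class _TypesNamespace:
    """What 'db.types' might look like for a given driver."""
    def __init__(self, driver, dbapi_types):
        # one protocol object per DBAPI-level type the driver supports,
        # plus a catch-all '_ANY_' protocol
        for name in dbapi_types + ["_ANY_"]:
            setattr(self, name,
                    _DBTypeProtocol("db-types://%s/%s" % (driver, name)))

types = _TypesNamespace("DCOracle2", ["NUMBER", "STRING", "BINARY", "DATETIME"])
```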
These protocols would be implemented as 'protocols.protocolForURI()'
objects within a driver-specific namespace. For databases with multiple
drivers (like Oracle and PostgreSQL), I'm not sure whether their type
protocols will overlap. If they do, we may create a URI namespace for the
"generic" DB, and then have implied protocols for the specific drivers.
Anyway, by using protocolForURI(), it means that application code can
declare adapters for specific backends without those backends needing to be
installed. That is, if I want to declare an adapter from my application's
date class to "DCOracle2 datetime", I can do so by using the URI for
"DCOracle2 datetime", whether DCOracle2 is installed or not. (Of course,
you could use the 'whenImported()' function to set up a hook to do this,
too, but I think that's more awkward.)
In order to simplify the declaration process, I'll probably have
peak.storage.SQL export a utility function like
'storage.dbType(driver,type="_ANY_")' that returns the right
protocolForURI() object. Thus, one would declare one's adapter as
providing e.g. 'storage.dbType("DCOracle2","DATETIME")' to declare an
adapter to DCOracle2's DATETIME DBAPI type. And, the 'types' attribute of
a db connection would simply set its attributes to the dbType() objects for
the connection's driver.
Finally, functions that wanted to convert a Python object to a suitable
object for the database type would use 'adapt(anObject,
db.types.DBAPITYPENAME)' to get an object suitable for passing as an
execute() or callproc() parameter value.
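An end-to-end sketch of that call site, again with invented stand-ins (a dict-based 'adapt()' rather than PyProtocols' real one, and a string conversion chosen arbitrarily as the "suitable" target value):

```python
import datetime

# (python type, protocol name) -> converter; stands in for the real
# adapter registry behind protocols.adapt().
_ADAPTERS = {}

def adapt(obj, protocol_name):
    converter = _ADAPTERS.get((type(obj), protocol_name))
    if converter is None:
        raise TypeError("can't adapt %r to %s" % (obj, protocol_name))
    return converter(obj)

# Declare once: Python datetime.datetime -> driver DATETIME value.
_ADAPTERS[(datetime.datetime, "DATETIME")] = lambda dt: dt.isoformat(" ")

# At the call site, instead of converting by hand:
value = adapt(datetime.datetime(2003, 12, 16, 18, 33, 44), "DATETIME")
# 'value' is now suitable for passing as an execute()/callproc() parameter
```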
Comments, anyone?