The PEAK Developers' Center   IntroToPeak/LessonFour UserPreferences
 
HelpContents Search Diffs Info Edit Subscribe XML Print View Up
Version as of 2003-12-09 22:02:10

Clear message


Up: IntroToPeak Previous: IntroToPeak/LessonThree Next: IntroToPeak/LessonFive

SQL, N2, and Naming

(Note: This lesson has not yet been edited by PJE)

OK, the big moment has come (hold your chuckles!). Management is impressed with our demo program, and they want us to take it into production, for use in greeting customers. There's a wrinkle, though: they aren't interested in having a custom greeting for every customer. Instead, they want us to access the corporate PostgreSQL database, check the group to which each customer has been assigned, and use a greeting associated with the customer's group to greet the customer.

So in this lesson we're going to learn how to use PEAK facilities to work with an SQL database, how to combine data from more than one database but keep them all in sync, and we'll also get to see just how flexible a PEAK program can be in the face of change. Along the way we'll introduce PEAK's "n2" utility, and give you a small glimpse of how powerful PEAK's naming system is. Oh, yeah, and we'll exercise the configuration system a bit more, too.

Then we'll finish it all off by throwing away our version 1 demo program and writing version 2 to match the new reality.

Contents

  1. SQL, N2, and Naming
    1. N2
    2. peak.ini, and Naming
    3. Greater Configurability
    4. Reading from the SQL Database
    5. Doing it Right

N2

"N2" stands for "Namespace Navigator", and it is a general purpose tool for browsing and manipulating any namespace for which someone has written a navigator adapter. It is available as a subcommand of the peak command:

%peak n2 
[n2] help 
Available commands: 
 
abort  commit l      ls     mksub  py     rd     rmsub  unset          
bind   dir    ll     md     mv     python rm     set                   
cd     help   ln     mkdir  pwd    quit   rmdir  show                  
Ty originally wanted to call it "into", so peak n2 is also a pun on "peek into".

Why is n2 of interest to us now? One of the n2 adapters avaiable is one that will allow us to talk to an SQL database. Not that that's the way our program will talk to the corporate database, mind. Rather, we want this tool to make it easy for us to set up a test SQL database we can test our modidified program against before we try it on the corporate database.

To do our testing, we'll use an SQLite database. (If you don't have SQLite on your system, you can get it from [WWW]http://www.hwaci.com/sw/sqlite/. You'll also need the PySQLite python interface library, available from [WWW]http://sourceforge.net/projects/pysqlite/. Or use your operating system's package facility if it has one (for example, on FreeBSD you could do "cd /usr/ports/database/py-PySQLite; make install").

Once these programs are installed, you can use n2 to work with a test SQLite database:

% peak n2 sqlite:test.db 
1> 
You'll note that the prompt is different this time.
1> help 
Available commands: 
 
abort        \abort       \buf-load    \echo        \import      \reset 
commit       \buf-append  \buf-save    \exit        \python      \rollback 
go           \buf-copy    \buf-show    \export      \quit        \set 
help         \buf-edit    \commit      \go          \reconnect   \sleep 
rollback     \buf-get     \describe    \help        \redraw      \source 
This n2 command interpreter is allowing you to enter SQL commands as the normal thing. So most n2 commands are preceeded by a "\". We can enter SQL commands, terminated by semi-colons, and they will be applied against the "test.db" SQLite databse. (Don't worry right now about how sqlite:test.db got turned into a connection to an SQLite database, we'll come back to that in the next section).

What we need is a test table that maps customer names to customer groups, since that's what we'll be accessing in the corporate database. The schema for that database lists the table as custgroups, and the columns of interest as NAME and GRP. Both columns are strings, VARCHAR(40) and (20), respectively. That's all we really need to know. In production our application will be using the strings from the corporate database, so we can use any old strings we like for our tests.

What we need to do is create a test table with the two critical columns, and some test data stored in it:

1> create table custgroups (NAME VARCHAR(40), GRP VARCHAR(20)); 
 
 
(0 rows) 
1> insert into custgroups (NAME, GRP) VALUES ('Jeff', 'vip'); 
 
 
(0 rows) 
1> insert into custgroups (NAME, GRP) VALUES ('Joe', 'peon'); 
 
 
(0 rows) 
1> insert into custgroups (NAME, GRP) VALUES ('Klause', 'vip'); 
 
 
(0 rows) 
1> insert into custgroups (NAME, GRP) VALUES ('Jackie', 'vvip'); 
 
 
(0 rows) 
1> insert into custgroups (NAME, GRP) VALUES ('Fred', 'oridinary'); 
 
 
(0 rows) 
select * from custgruops; 
NAME                 GRP                  
-------------------- -------------------- 
Jeff                 vip                  
Joe                  peon                 
Klause               vip                  
Jackie               vvip                 
Fred                 ordinary             
(5 rows) 
1> commit 
1> \quit 
Well, that was easy. Now we just have to tie it into our program.

Note that the "commit" step, which is an n2 command and not an sql command, is critical. N2 starts a transaction when it connects to the database. The commit command causes n2 to tell the database to commit the current transaction. If you don't do that, none of the changes made to create the table and insert the data will actually take effect.

peak.ini, and Naming

Remember back at the beginning when we took a look at the usage information for the peak command? The first argument to the peak command is described as "NAME_OR_URL", where a name is something defined in the peak.running.shortcuts section of an ini file, and URL is "of a type supported by peak.naming".

Well, by now you should understand how to define something in 'peak.running.shortcuts' and what that accomplishes, since we used it in the last chapter. It will thus probably come as no surprise to you to find that the 'peak' command has a ini file, and in that ini file there's a peak.running.shortcuts section. It makes interesting reading when you've got few moments. You'll find our friend runIni in there, as well as the n2 we just learned about.

Right now, though, we're interested in that second bit, the one about the URL. PEAK provides a general way to refer to "resources" via "names". Since "URL" stands for "Uniform Resource Locator", it's a natural syntax for PEAK to use. A URL is divided into multiple parts; we're only worried about the first two for now. The first part is what comes before the first ":", and the second part is what comes after. The first part is called the "scheme", and the second is a path that, when interpreted in the context of the scheme by the naming system, will yeild an object.

The peak command further requires that the referenced object support one of the `peak.running' interfaces or be callable. That's important: the naming system will look up anything it understands, but something that uses the naming system to look things up may have additional constraints on what referenced objects will actually work when specified in such a URL.

PEAK of course makes schemes and production rules completely configurable. The peak.naming.schemes section of the peak.ini file looks like this:

[peak.naming.schemes] 
 
# This section defines naming context factories or URL.Base subclasses to 
# be used for various URL schemes.  The entry name is the URL scheme, and 
# the value is either the object or an import string for loading it. 
 
import  = "peak.naming.factories.peak_imports:importContext" 
smtp    = "peak.naming.factories.smtp:smtpURL" 
uuid    = "peak.naming.factories.uuid:uuidURL" 
nis     = "peak.naming.factories.nisns:nisURLContext" 
config  = "peak.naming.factories.config_ctx:PropertyContext" 
 
ldap    = "peak.storage.LDAP:ldapURL" 
sybase  = "peak.storage.SQL:GenericSQL_URL" 
pgsql   = "peak.storage.SQL:GenericSQL_URL" 
psycopg = "peak.storage.SQL:GenericSQL_URL" 
mockdb  = "peak.storage.SQL:GenericSQL_URL" 
gadfly  = "peak.storage.SQL:GadflyURL" 
sqlite  = "peak.storage.SQL:SqliteURL" 
cxoracle = "peak.storage.SQL:OracleURL" 
dcoracle2 = "peak.storage.SQL:OracleURL" 
 
logfile = "peak.running.logs:logfileURL" 
logging.logger = "peak.running.logs:peakLoggerContext" 
 
lockfile     = "peak.running.lockfiles:lockfileURL" 
nulllockfile = "peak.running.lockfiles:lockfileURL" 
shlockfile   = "peak.running.lockfiles:lockfileURL" 
flockfile    = "peak.running.lockfiles:lockfileURL" 
winflockfile = "peak.running.lockfiles:lockfileURL" 
win32.dde    = "peak.storage.DDE:ddeURL" 
 
http    = "peak.naming.factories.openable:OpenableURL" 
ftp     = "peak.naming.factories.openable:OpenableURL" 
https   = "peak.naming.factories.openable:OpenableURL" 
file    = "peak.naming.factories.openable:FileURL" 
pkgfile = "peak.naming.factories.openable:PkgFileURL" 
icb     = "peak.net.icb:ICB_URL" 
 
 
 
tcp       = "peak.net.sockets.tcpudpURL" 
udp       = "peak.net.sockets.tcpudpURL" 
unix      = "peak.net.sockets.unixURL" 
unix.dg   = "peak.net.sockets.unixURL" 
fd.socket = "peak.net.sockets.fdURL" 
 
zconfig.schema = "peak.config.load_zconfig:ZConfigSchemaContext" 
shellcmd       = "peak.naming.factories.shellcmd:ShellCommandCtx" 
That's a lot of schemes. Take a look at that last one, `shellcmd'. The peak command usage says we can use any scheme recognized by peak.naming to provide the thing the peak command is going to run. Presumably that means that we could tell it to run, say, the "ls -l" command. Let's try it:
% peak shellcmd:"ls -l" 
total 15 
-rwxr-xr-x  1 rdmurray  wheel   272 Dec  7 22:04 hello 
-rw-r--r--  1 rdmurray  wheel   159 Dec  7 22:04 hello.list 
-rw-r--r--  1 rdmurray  wheel   201 Dec  7 22:04 hello_model.py 
-rw-r--r--  1 rdmurray  wheel  1013 Dec  7 22:04 hello_model.pyc 
-rw-r--r--  1 rdmurray  wheel  1097 Dec  7 22:04 hello_storage.py 
-rw-r--r--  1 rdmurray  wheel  3308 Dec  7 22:04 hello_storage.pyc 
-rw-r--r--  1 rdmurray  wheel  1519 Dec  7 22:04 helloworld.py 
-rw-r--r--  1 rdmurray  wheel  2878 Dec  7 22:04 helloworld.pyc 
Well, how about that. Not, perhaps, a stunningly useful way to run the "ls" command, but it certainly demonstrates the flexibility of the PEAK naming system.

The n2 subcommand we used in the last section also takes any argument supported by peak.naming. The referenced object must support a certain interface, but we'll ignore that for now.

When we used n2 to open a connection to our SQLite "test.db" database, we were using the "sqlite" scheme listed above. You'll note that there are several other database schemes supported by PEAK. That's important, because as you'll recall the corporate database our program really needs to connect to is Postgres. Fortunately, psycopg is a scheme for accessing a Postgres database, so we're all set.

In fact, if we actually had that corporate database to connect to, we could do something like:

% peak n2 psycopg://ouruserid:ourpassword@some.corporate.server.com/customerdb 
and run some queries against that table and find out what kind of data they are really expecting us to interface with.

Now, this is all well and good, but so far it hasn't gotten us any nearer making our program get group data from an SQL database. But PEAK wouldn't have a naming system if it didn't use it everywhere, of course, so you probably won't find it too surprising to that the naming.Obtain function also accepts anything handled by peak.naming, as long as it supports the IComponentKey interface or can be adapted to it.

Hmm. "...or can be adapted to it." Now explaining that would require quite a long digression. A very important digression if you really want to understand the essence of PEAK, but one that is far too long to get in to here. You can read [WWW]PEP 246 and/or the [WWW]PyProtocols [WWW]documentation if you want to delve into it now.

But all you really need to know for our purposes here is that within the PEAK system we have a way of adding behavior to objects in a very flexible way. So if we have an object that represents a connection to a database, and Obtain wants it to support IConfigKey, a little piece of code will make it do so in a way transparent to us as users of that particular service. What this means in practice is that in general when something uses the naming system to look things up, and it seems reasonable to use an object represented by a given URL, but you don't know if that object supports whatever interface the component doing the lookup requires, try it. There may be an adpater in the system that will take care of it. And if it doesn't work, but it still seems reasonable that it should, you can write an adapter so it will work. But that, as they say, is a topic for another day. (Or if I can keep at this long enough, another chapter.)

What this all means right now is that we can put something in our HelloWorld class like this:

db = Obtain("sqlite:test.db") 
and like magic our program will have a connection to our SQLite database.

Of course, that's just the beginning....

Greater Configurability

Before we start accessing that database, let's think a little about the fact that this is a test database we're working with. We know we're going to have to switch which database we access eventually. And that's something that could change in the future; after all, they might move the database to some new machine someday.

So really we should put the pointer to the database into the configuration file instead of burying it in the code. The naming system provides a little utility for facilitating this called LinkRef:

[helloworld] 
 
messagefile = config.fileNearModule('helloworld', 'hello.list') 
db = naming.LinkRef('sqlite:test.db') 
So now we've got a "link" to a named resource stored in our helloworld.customerdb configuration variable.

Hmm. But that isn't really quite what we want, either. After all, our project group is working on other projects using PEAK, and some of those will need to access to this same corporate database. It would be a shame to have to go to the config file for each app and adjust it if the database gets moved.

PEAK of course provides a way to deal with that. We can create a special section in our ini file named Load Settings From, and in that define a file variable pointing to another ini file. (Footnote: actually, you might want to use ZConfig here rather than an ini file, since ZConfig is probably better for "end user" configuration tasks, but that's another topic I'm not ready to delve in to.)

So we'll create a new file, global.ini:

[corporate] 
 
customerdb = naming.LinkRef('sqlite:test.db') 
And at the top of our hello file:
[Load Settings From] 
file = config.fileNearModule('helloworld', 'global.ini') 
In our deployed application we'll actually want to use either a hardcoded path name, or reference to a module related to the shared parts of our corporate PEAK applications.

This leads to another interesting point, though. If this global.ini file were already deployed, and we needed some of the settings in it, but we wanted to use our test database rather than the real customerdb, what could we do? Naturally enough, all it takes is to add a [corporate] section to our hello config file, and define in there the config variables we want to override.

Reading from the SQL Database

We're finally ready to use the database.

Before we do this the right way, though, lets conduct a little exercise. PEAK is designed to allow you to develop applications that are flexible and easy to modify. Suppose we were trying to do this database change on an emergency basis, and we wanted to touch as little of our existing system as possible? Let's see what we can do.

Our domain model pairs names with greetings. As far as the user of the to subcommand goes, that's all that matters. So lets see if we can maintain that pairing while using the data from our new database.

First, we'll need a reference to the database as a class variable in our DM.

    1 customerdb = binding.Obtain(PropertyName('corporate.customerdb'))

Next, we'll need to change our _load method to get the right message, using auxiliary info from the customer data base:

    1     def _load(self, oid, ob):
    2         row = ~self.customerdb("select GRP from custgroups where NAME='%s'" %
    3             oid)
    4         m = self.__findMatch(row.GRP, self.file.text)
    5         msg = m.group(1) % oid
    6         return {'forname': oid, 'text': msg}
Here you see that we can pass our SQL database object an SQL query, and for a select get back a cursor into the database. Since we're sure we're only going to get one record back, we use the unary operator "oneOf" ("~") to pick it out. The "row" object we get back has our select columns as attributes. So we get the data from the GRP column for this user, and use that to look up our message. You'll note that I'm back to doing "%" substitution on the resulting message, so that we are in fact personalizing our message for each customer. If corporate doesn't like that, it's easy enough to back out.

Nothing else in our application has to change. Although our usage message for the for subcommand is now wrong (it sets the message for the group, not the thing being greeted), it works just fine for storing messages into our database under the group names:

% ./hello for vip:"I'm so pleased to see you, %s" 
% ./hello to Jeff 
I'm so pleased to see you, Jeff 
Another nice feature worth pointing out, even though it isn't useful in our simple example, is that the SQL database object participates in the transaction. (In fact, there has to be an active transaction for the sql database function to work.) So if our subcommands were in fact updating the SQL database, both the changes to it and the changes to our EditableFile would be kept consistent across PEAK transaction boundaries. If our PEAK transaction aborts, all database changes, to whichever database, are rolled back (assuming the database supports transaction abort, of course).

And to go production, all we need to do is change that customerdb declaration in the configuration file. The PEAK database object hides all the details of the different databases from us. (Well, as long as we can stick to standard SQL it does, anyway.)

If you think about what you just learned about how PEAK can be used to tie together disparate, possibly legacy databases in the corporate setting, and maintain enterprise-level control over the integrity of the data, you can see why PEAK is an "Enterprise Application Toolkit".

Doing it Right

All right, that incorrect usage message is bothering me. Let's fix this application the right way.

Our domain model has actually changed: instead of the domain consisting of "Message" objects that are associated with particular things to greet, we now have "Customers" that belong to "Groups", and it is the groups that have messages associated with them.

So, if we are going to bring our app in tune with the new reality, we'll have to start by revising our domain model:

    1 from peak.api import *
    2 
    3 
    4 class Group(model.Element):
    5 
    6     class name(model.Attribute):
    7         referencedType = model.String
    8 
    9     class text(model.Attribute):
   10         referencedType = model.String
   11 
   12 
   13 class Customer(model.Element):
   14 
   15     class name(model.Attribute):
   16         referenceType = model.String
   17 
   18     class group(model.Attribute):
   19         referenceType = Group

The most interesting thing to notice about this new model is that the reference type for the group attribute of our Customer is actually the Group element we define. When we use this model, we're going to actually store references to Group objects in the group attribute of the Customer objects. But we'll do it the PEAK way.

As long as we're rewriting anyway, we might as well get rid of that clunky file and move to a real database for our messages. Perhaps someday they'll move into a table on the corporate database, but for now we'll stick with our SQLite database, since has proven its worth so far.

We'll replace the messagefile configuration item in our hello file with one for the new database:

messagedb = naming.LinkRef('sqlite:messages.db') 

Our hello_storage.py file will see the greatest amount of change. Let's start with the simpler class, the Data Manager for the Customers. This data is read only, so we can use QueryDM here:

    1 from peak.api import *
    2 from hello_model import Customer, Group
    3 
    4 class CustomerDM(storage.QueryDM):
    5 
    6     defaultClass = Customer
    7     customerdb = binding.Obtain(PropertyName('corporate.customerdb'))
    8     GroupDM = binding.Obtain(storage.DMFor(Group))
    9 
   10     def _load(self, oid, ob):
   11         row = ~self.customerdb("select GRP from custgroups where NAME='%s'" %
   12             oid)
   13         group = self.GroupDM[row.GRP]
   14         return {'name': oid, 'group': group}
As you can see, there's some interesting stuff happening here.

First, we're using the binding system to pick up a reference to the Data Manager for the group. Even though we don't define that until later in the file, we don't have a reference problem, because the dereference doesn't happen until runtime.

Second, we're returning an instance retreived from the GroupDM as the value of one of our attributes. This kind of arrangement is why Data Managers return ghosts: when we access a Customer object, even though it contains a reference to a Group object, no system activity (ie: reading from databases) happens because of this until we actually try to reference an attribute of the Group. Thus in PEAK inter-object links are not expensive, even when the other objects are backed by potentially slow external databases and/or occupy large amounts of memory when active.

OK, here's the GroupDM, which is not any less complicated than our old MessageDM from before we added the customers, but which shows how to use SQL to do the same stuff:

    1 class GroupDM(storage.EntityDM):
    2 
    3     defaultClass = Group
    4     messagedb = binding.Obtain(PropertyName('helloworld.messagedb'))
    5 
    6     def _load(self, oid, ob):
    7         row = ~self.messagedb("select text from messages where name='%s'" %
    8             oid)
    9         return {'name': oid, 'text': row.text}
   10 
   11     def _new(self, ob):
   12         if ob.name in self:
   13             raise KeyError, "%s is already in the database" % ob.name
   14         self._save(ob)
   15 
   16     def _save(self, ob):
   17         self.messagedb(("insert or replace into messages (name, text) values "
   18             "('%s', '%s')") % (ob.name, ob.text))
   19 
   20     def __contains__(self, oid):
   21         #using where would be more efficient here, but I don't
   22         #know how to test the cursor to see if I got any results
   23         for row in self.messagedb('select name from messages'):
   24             if row.name==oid: return 1
   25         return 0

Now, of course, our application needs to use the new domain objects:

    1 from peak.api import *
    2 from hello_model import Customer, Group
    3 
    4 class HelloWorld(commands.Bootstrap):
    5 
    6     usage = """
    7 Usage: hello command arguments
    8 
    9 Available commands:
   10 
   11     for -- sets a greeting
   12     to  -- displays a greeting
   13 """
   14 
   15     CustomerDM = binding.Make('hello_storage:CustomerDM',
   16         offerAs=[storage.DMFor(Customer)])
   17     GroupDM = binding.Make('hello_storage:GroupDM',
   18         offerAs=[storage.DMFor(Group)])
   19     #Does this actually do anything useful?:
   20     CustomerDB = binding.Obtain(PropertyName('corporate.customerdb'),
   21         offerAs=[PropertyName('corporate.customerdb')])
   22     MessageDB = binding.Obtain(PropertyName('helloworld.messagedb'),
   23         offerAs=[PropertyName('helloworld.messagedb')])
   24 
   25 
   26 class toCmd(commands.AbstractCommand):
   27 
   28     usage = """
   29 Usage: hello to <name>
   30 
   31 Displays the greeting for "name".
   32 """
   33 
   34     Customers = binding.Obtain(storage.DMFor(Customer))
   35 
   36     def _run(self):
   37         if len(self.argv)<2: raise commands.InvocationError("Missing name")
   38         storage.beginTransaction(self)
   39         print >>self.stdout, self.Customers[self.argv[1]].group.text
   40         storage.commitTransaction(self)
   41 
   42 
   43 class forCmd(commands.AbstractCommand):
   44 
   45     usage = """
   46 Usage: hello for <group>: <greeting>
   47 
   48 Stores "greeting" as the greeting message for group "group".
   49 """
   50 
   51     Groups = binding.Obtain(storage.DMFor(Group))
   52 
   53     def _run(self):
   54         if len(self.argv)<2: raise commands.InvocationError("Missing arguments")
   55         parts = ' '.join(self.argv[1:]).split(':')
   56         if len(parts)!=2: raise commands.InvocationError("Bad argument format")
   57         groupname = parts[0].strip(); message = parts[1].strip()
   58         storage.beginTransaction(self)
   59         if groupname in self.Groups:
   60             group = self.Groups[groupname]
   61         else:
   62             group = self.Groups.newItem()
   63             group.name = groupname
   64         group.text = message
   65         storage.commitTransaction(self)

The changes here are relatively small, almost trivial. The biggest change is that we are getting the message from the text attribute of the group attribute of a Customer object. The changes in for are mostly to the usage string and some variable name changes, just to be consistent.

If we test this program, though, we'll quickly find a design error:

% ./hello for vip:"I am so happy to see you, %s" 
% ./hello to Jeff 
I am so happy to see you, %s 
Woops, we forgot about our tailored messages.

How to solve this? Well, clearly a tailored message is more a property of the Customer proper. So we can add something to our domain model: a greeting function. And we can take advantage of this to deal with the case where our bosses are serious about not wanting tailored messages. We'll allow for messages that don't have a substitution point:

%!python 
class Customer(model.Element): 
 
    class name(model.Attribute): 
        referenceType = model.String 
 
    class group(model.Attribute): 
        referenceType = Group 
 
    def greeting(self): 
        if '%' in self.group.text: 
            return self.group.text % self.name 
        else: return self.group.text 

Now our print line in helloworld.py becomes:

print >>self.stdout, self.Customers[self.argv[1]].greeting() 
Which is more Demeter-proof anyway.

And now everyone is happy:

% ./hello to Jeff 
I am so happy to see you, Jeff 
% ./hello for vvip:"Greetings, Your Excelency!" 
% ./hello to Jackie 
Greetings, Your Excelency! 

Up: IntroToPeak Previous: IntroToPeak/LessonThree Next: IntroToPeak/LessonFive


PythonPowered
EditText of this page (last modified 2003-12-09 22:02:10)
FindPage by browsing, title search , text search or an index
Or try one of these actions: AttachFile, DeletePage, LikePages, LocalSiteMap, SpellCheck