Up: IntroToPeak Previous: IntroToPeak/LessonOne Next: IntroToPeak/LessonThree

Now it's time to go a tiny step beyond the triviality of the "Hello, world!" example. In this chapter I'll expand the example to handle saying hello to an arbitrary variety of things. Since we want to be flexible, we'll get the greeting message for each thing from a database table. Well, really just a flat file, but that's so we don't get distracted with SQL connections and all that just yet.

In a "real" application, we'd be using SQL or some other robust storage as our "back end". But here we want to focus on the issues of creating an "object-relational mapping", without getting bogged down in the relational part. Unfortunately, this makes the tutorial a bit lopsided, because we'll be building a sophisticated object-relational mapping over data that would've been trivial to use in its original form!

So, try to ignore that part, and focus on the ideas, which scale up to vastly larger applications than what we're showing here.

Contents

Lesson Two: Domain Models and Data Retrieval

Command Arguments
Application Structure: Moving to a Package
Domain Models
Data Managers

Loading data from the flat file
Building the Data Manager subclass

Putting it all Together
Points to Remember

To greet more than one thing, we'll need to be able to tell our command what to greet. Let's rename our command to hello, and give what we want to say hello to as the command argument, e.g.:


If all we want to do is say hello in the same boring way every time, we could revise our helloworld.py file as follows:

    from peak.api import *

    class HelloWorld(commands.AbstractCommand):

        # Greeting template, looked up from the configuration
        message = binding.Obtain(PropertyName('helloworld.message'))

        def _run(self):
            # argv[1] is the first command argument: the thing to greet
            print >>self.stdout, self.message % self.argv[1]

Note the use of self.argv to get access to the command arguments.
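As a rough illustration of the argument convention, here is a plain-Python sketch (not PEAK code). It assumes, as with sys.argv, that argv[0] is the command name and argv[1] is the first real argument. The greet helper and the "Hello, %s!" template are hypothetical stand-ins for the _run method and the configured helloworld.message value.

```python
# Plain-Python sketch of how the command uses its argument list.
# `greet` and the default template are illustrative assumptions, not PEAK API.

def greet(argv, message="Hello, %s!"):
    """Format `message` with the first command argument (argv[1])."""
    if len(argv) < 2:
        # argv[0] is the command name itself, so one element means no argument.
        raise SystemExit("usage: %s <thing-to-greet>" % argv[0])
    return message % argv[1]


if __name__ == "__main__":
    print(greet(["hello", "world"]))  # prints: Hello, world!
```

The tutorial's own _run method does no such length check; the guard here only shows where a missing argument would surface.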

The corresponding hello file would look like this:

Now we'll get something like this:


We're about to complicate our code. In this application, everything is simple enough that we could keep it all in a single helloworld.py file. However, in a PEAK application of any non-trivial size, you probably won't want to do that. Instead, it generally works best to group parts of the application that are logically related into separate files. We'll use that structure for our example even though it will seem a bit silly because of the small amount of code involved. But doing it this way will have the advantage that our configuration examples will look more like what you'd use in a full-blown application.

So, we'll create a directory called helloworld that will hold our application modules. This directory will then need to be on the Python path, and have an __init__.py file so that it is a Python package. Since we now have a package called helloworld, it is redundant to have a helloworld.py module. Let's rename that module to something more descriptive of its contents. We'll call it commands.py.

To accommodate this change to a package, we need to change our hello ini file as follows:


So, from this point forward we'll assume you have a properly set up package directory, and you've put that directory in your PYTHONPATH via one of the methods described in the previous chapter.

When working with application data, PEAK uses the concept of a "domain model" for that data. A "domain model" describes the kinds of "problem domain" (i.e. "real-world") objects you're working with, and their relationships to other kinds of objects. In PEAK-speak, the objects are called "Elements", and the relationships are called "Features". Features are implemented as object attributes, using custom descriptors. See the GraphvizTutorial for more info on Domain Models.

To facilitate grabbing data from our database, we'll define a simple Element class, in a model.py file in our helloworld package:

    from peak.api import *

    class Message(model.Element):

        class forname(model.Attribute):
            referencedType = model.String

        class text(model.Attribute):
            referencedType = model.String

Here Message is our Element, the thing we are going to load from our database file. forname and text are attributes our Message objects will have once loaded: forname names the thing being greeted, and text holds the greeting for it.

You probably noticed immediately that we are defining classes inside classes here. The nested classes actually create attribute descriptors, similar to the Python built-in property type. However, instead of having to define functions and then wrap them in a property object, we can simply subclass a predefined "Feature" type such as model.Attribute, and provide parameters such as referencedType, or define methods to control the feature's behavior.

(Note, by the way, that referencedType does not necessarily refer to the class of the objects or values that will be stored in the attribute. It can also reference an object like model.String, that simply provides metadata describing what values are acceptable for the attribute. For more extensive examples of using Model types, see the bulletins example in the examples directory of the PEAK CVS tree.)
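To make the descriptor analogy concrete, here is a minimal plain-Python sketch of an attribute descriptor that checks values on assignment. It is only an analogy: TypedAttribute is a hypothetical name, it uses ordinary Python types where PEAK uses metadata objects like model.String, and it says nothing about how model.Attribute is actually implemented.

```python
# Plain-Python analogy for a "feature": a descriptor that validates values.
# TypedAttribute is a hypothetical name; PEAK's model.Attribute does far more.

class TypedAttribute:
    """Descriptor that stores a value per instance after a type check."""

    def __init__(self, referenced_type):
        self.referenced_type = referenced_type

    def __set_name__(self, owner, name):
        # Remember the attribute name so values land in the instance dict.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class: return the descriptor itself
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, instance, value):
        if not isinstance(value, self.referenced_type):
            raise TypeError(
                "%s must be %s" % (self.name, self.referenced_type.__name__)
            )
        instance.__dict__[self.name] = value


class Message:
    forname = TypedAttribute(str)
    text = TypedAttribute(str)


m = Message()
m.forname = "world"
m.text = "Hello, world!"
```

Assigning a non-string, such as m.text = 42, raises TypeError in this sketch, which is the kind of per-attribute behavior a feature class can carry.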

The domain model by itself is simply a schema, perhaps with some behavior. (For example, we might add a hello() method to our Message class, so that an instance of Message could actually deliver its message directly.)

But, a domain model by itself doesn't know anything about storage. (This is so that we can reuse the domain model with different kinds of storages.) To store and retrieve instances of our domain model classes, we need a Data Manager. Data Managers are responsible for loading and storing the data described by the domain model classes.

For the present example we're only interested in loading data from a "database table". So we'll subclass QueryDM, a peak.storage class that provides a read-only interface to a datastore:

    from peak.api import *

    class MessageDM(storage.QueryDM):
        pass

This class is another abstract class that has to be specialized for our intended use. Specifically, we have to add the code that does the actual reading and writing of data from the model attributes, to and from the external datastore.

To keep our example simple, we'll use a flat file as our external data store. In keeping with PEAK design principles, we won't hardcode the filename into our Data Manager, but will instead make it configurable:

    filename = binding.Obtain(PropertyName('helloworld.messagefile'))

Obviously, we'll now need a different line in our hello configuration file:

    messagefile = config.fileNearModule('helloworld', 'hello.list')

Here we've used another PEAK function: config.fileNearModule() will construct an appropriately qualified filename for the second argument based on the assumption that it is in the same directory as the module named by the first argument. So, messagefile will be the path to the hello.list file, located in our helloworld package directory (a package is also a module from Python's point of view). In a real application you probably wouldn't keep a database file in your package directory, but it's convenient for us to do so in this example to keep all the files together.
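The idea of locating a file next to a module can be sketched in plain Python as follows. This is an assumption-laden stand-in rather than PEAK's implementation: file_near_module is a hypothetical helper, and a standard-library package (json) plays the part of helloworld so the sketch runs anywhere.

```python
# Rough plain-Python equivalent of the idea behind config.fileNearModule().
# file_near_module is a hypothetical helper, not PEAK's implementation.
import importlib
import os


def file_near_module(module_name, filename):
    """Return the path of `filename` in the directory holding `module_name`."""
    module = importlib.import_module(module_name)
    # For a package, __file__ points at its __init__.py, so dirname() gives
    # the package directory itself.
    return os.path.join(os.path.dirname(module.__file__), filename)


# Example with a standard-library package standing in for `helloworld`.
path = file_near_module("json", "hello.list")
```

Note that the helper only builds a path; like the configuration line, it does not check that the file exists.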

Since we're only using a QueryDM in this example, we only have to worry about reading data from the datastore, not writing it. (That's a later example.) To specify how to do this, we override the _load method of QueryDM. Our _load method needs to return a dictionary of names and values, which will get used through a __setstate__ call to load the data into our Message instances.

Now we need to whip up a data format for our messagefile. Let's have the thing we are saying hello to be first, and the actual message second, separated by a "|" character.
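A small sketch of parsing one line in that format may help. The parse_line helper and the sample lines are made up for illustration; the point is that splitting on the first "|" only lets the message itself contain further "|" characters.

```python
# Sketch of parsing one "thing | message" line; the sample lines are made up.

def parse_line(line):
    """Split a line into (forname, text) on the first '|' only."""
    # maxsplit=1 keeps any later '|' characters as part of the message text.
    forname, text = [field.strip() for field in line.split("|", 1)]
    return forname, text


print(parse_line("world | Hello, world!"))  # ('world', 'Hello, world!')
print(parse_line("pipe | a|b"))             # ('pipe', 'a|b')
```

A line with no "|" at all would fail the two-name unpacking with a ValueError, so a real loader might want to skip blank or malformed lines.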

So we'll create a hello.list file like this:

(Forgive my feeble attempts at Deutsch.)

Because this is going to be a read-only file, we're going to cheat and load the file only once, the first time it's used. We'll use another peak.binding tool to accomplish this:

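    def data(self):
        # Read the whole message file into a dict keyed by forname
        data = {}
        file = open(self.filename)
        for line in file:
            # Split on the first '|' only; strip surrounding whitespace
            fields = [field.strip() for field in line.split('|',1)]
            forname, text = fields
            data[forname] = {'forname': forname, 'text': text}
        file.close()
        return data

    # Compute once per instance, on first access, then cache
    data = binding.Make(data)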

binding.Make is similar to binding.Obtain, in that it's used inside a class body to create a property-like descriptor for the class' instances. It's different, in that it takes a function as its argument, rather than a configuration key. The function should take at least one parameter (self), and return the value to be used for an attribute. In this way, it's very similar to the property built-in, but with a key difference: a property's fget function is called every time it is used, but the result of a binding.Make function is cached and reused for subsequent accesses of the attribute.

So, here's what will happen. The first time an instance of our QueryDM subclass accesses its data attribute, the function above will be called, and the result stored in the instance's data attribute. It will then be immediately available for use, and won't be computed again for that instance unless the attribute is deleted.
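The compute-once behavior described here can be imitated in plain Python with a small non-data descriptor. This is a sketch of the caching idea only: Once and Loader are hypothetical names, and the real binding.Make does considerably more (configuration, offerAs, and so on). Python's own functools.cached_property works on the same principle.

```python
# Plain-Python sketch of a compute-once attribute, in the spirit of
# binding.Make. `Once` is a hypothetical name, not part of PEAK.

class Once:
    """Non-data descriptor: call `func` on first access, then cache."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Storing under the same name in the instance dict shadows this
        # descriptor, so later accesses never reach __get__ again.
        instance.__dict__[self.name] = value
        return value


class Loader:
    calls = 0

    def data(self):
        Loader.calls += 1          # count how often the function really runs
        return {"world": "Hello, world!"}

    data = Once(data)              # same rebinding pattern as binding.Make(data)


loader = Loader()
first = loader.data
second = loader.data               # served from the instance dict
calls_after_two_reads = Loader.calls
del loader.data                    # deleting the cached value...
third = loader.data                # ...forces one recomputation
```

After the two reads the function has run once; deleting the attribute and reading it again makes it run a second time, mirroring the "unless the attribute is deleted" rule.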

Of course, in the case of our current hello program, we'll only ever make one query on the database. If we were going to make a longer-running program, or allow the database to be modified, using this sort of caching might be a bad idea. However, this design decision affects only our data manager's implementation, and not the rest of the application. Our main, command-line application will not be affected, and neither will our Message class, if we decide to change how or where the messages are stored.

Here's the complete contents of the last new file we need for our expanded hello application, the storage.py file. This also adds the _load method to the QueryDM:

    from peak.api import *
    from helloworld.model import Message

    class MessageDM(storage.QueryDM):

        # Class used to instantiate objects retrieved from this DM
        defaultClass = Message
        filename = binding.Obtain(PropertyName('helloworld.messagefile'))

        def data(self):
            data = {}
            file = open(self.filename)
            for line in file:
                fields = [field.strip() for field in line.split('|',1)]
                forname, text = fields
                data[forname] = {'forname': forname, 'text': text}
            file.close()
            return data

        data = binding.Make(data)

        def _load(self, oid, ob):
            # Return the state dictionary for the object with this oid
            return self.data[oid]

defaultClass specifies the class that will be used to instantiate objects retrieved from this Data Manager. In our case, that's Message from our model module. binding.Obtain you've met before, so its purpose here should be obvious.

A data manager is like a container for application objects. It's keyed by the notion of an oid: an "object ID" for objects of this kind. So, when we use a MessageDM instance, we'll retrieve objects from it like this:

    aMessage = myMessageDM[someObjectId]

When we do this, the MessageDM will return what's called a "ghost". It will be an instance of Message that contains no data, but knows its object ID, and knows that it's not yet loaded. As soon as we try to use the Message (by accessing any attributes or methods), it will "phone home" to the data manager it was retrieved from, asking for its data to be loaded.

At this point, the MessageDM._load() method is going to get called. It'll be given the object ID that was used to access the object originally (the oid parameter), and the applicable "ghost" object (ob). The data that _load() returns will be used to fill in the ghost's instance dictionary so that it will become a "real" object, and the attribute access that triggered the _load call can finally be satisfied.

(Notice, by the way, that if we were using a relational database, the _load method is probably where we'd put an SQL query to retrieve the data for the given object ID.)

All that remains, then, is to use our new data manager from our main application program:

    from peak.api import *
    from helloworld.model import Message

    class HelloWorld(commands.AbstractCommand):

        # Create a MessageDM on first use and offer it to child components
        Messages = binding.Make(
            'helloworld.storage.MessageDM',
            offerAs=[storage.DMFor(Message)]
        )

        def _run(self):
            storage.beginTransaction(self)
            # Look up the Message whose oid is the first command argument
            print >>self.stdout, self.Messages[self.argv[1]].text
            storage.commitTransaction(self)

As you can see, our main program has stayed fairly simple, despite the additional complexity of using a database. (And in case you think "using a database" is an inflated way of referring to a flat file, observe that we can replace the simple flat file with access to something like an SQL database, simply by changing the _load method in our data manager.)

In our revised HelloWorld class, we see another use of binding.Make, this time taking an import string that specifies a class that should be instantiated. Previously, we used binding.Make with a function, but it also accepts classes, or strings that say where to import classes or functions from. (Indeed, it takes anything that implements or adapts to the binding.IRecipe interface, but that's more than you need to know right now.)

When used with a class (or an import string that names a class), binding.Make will call it once, the first time the named attribute is used. In this case, that means that it will automatically create a new MessageDM as the Messages attribute of the HelloWorld instance it's contained in. So, in effect, we are declaring that each HelloWorld instance should have its own MessageDM instance, stored in its Messages attribute.
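To show what "an import string that names a class" amounts to, here is a plain-Python sketch of resolving such a string on demand. The resolve helper is hypothetical and much simpler than PEAK's own import-string handling; a standard-library class stands in for 'helloworld.storage.MessageDM' so the sketch is runnable.

```python
# Sketch of resolving a dotted import string lazily. `resolve` is a
# hypothetical helper; PEAK's own import-string handling is richer.
import importlib


def resolve(import_string):
    """Import 'package.module.Name' and return the named object."""
    module_name, _, attr = import_string.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# A standard-library class stands in for 'helloworld.storage.MessageDM'.
factory = resolve("collections.OrderedDict")
instance = factory()                 # instantiated only when actually needed
```

Deferring the import this way means the module that defines the class need not be loaded until the attribute is first used.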

Here we also use an additional keyword argument (offerAs) to binding.Make. offerAs is a list of "configuration keys" under which the created component will be "offered" to child components (via the configuration system). In this case, we're saying that any child components of a HelloWorld instance should use its Messages attribute, if they are looking for a data manager that provides storage services for Message instances.

The PEAK configuration system offers many kinds of "configuration keys" under which components or configuration properties can be found. We've previously worked with PropertyName, which is one kind of configuration key. And here we work with another, storage.DMFor(), that creates a configuration key denoting a data manager for a particular element type. PEAK does not limit you to using its predefined kinds of configuration keys, however. You can also create your own key types for specialized purposes, by implementing the config.IConfigKey interface.

Once you "offer" an attribute as the source of a configuration key, it can then be referenced by other uses of binding.Obtain by child components. For example, if we had another class that needed to use a data manager for Message instances, we could add something like this to that class:

    Messages = binding.Obtain(storage.DMFor(Message))

This would use the configuration system to find an appropriate data manager. And, if an instance of this other class were contained within an instance of HelloWorld, it would use the HelloWorld object's Messages attribute to fill its own Messages attribute. (Note that the similarity in names has nothing to do with how it works; we could have called one of the attributes "foobar" and it would make no difference.)

The big advantage of this is that it allows us to create "loosely coupled" reusable components. A component that needs some service can simply Obtain it via a suitable key, and a higher-level application component can "offer" an appropriate service implementation. What's more, even binding.Obtain can use the offerAs argument, thereby specifying that an obtained component will be available to the containing component's children, perhaps under another configuration key as well as the key that was used to obtain the service.
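The offer-and-obtain relationship can be modeled with a tiny component tree in plain Python. Everything here is a hypothetical simplification (Component, offer, and obtain are not PEAK names, and a tuple stands in for a real configuration key); it assumes, following the description above, that a lookup walks up through parent components.

```python
# Plain-Python sketch of "offer" and "obtain" through a component tree.
# Component, offer and obtain are hypothetical names, not PEAK's API.

class Component:
    def __init__(self, parent=None):
        self.parent = parent
        self.offers = {}                     # configuration key -> value

    def offer(self, key, value):
        """Make `value` findable under `key` by this component's children."""
        self.offers[key] = value

    def obtain(self, key):
        """Walk up the parent chain until some component offers `key`."""
        component = self.parent
        while component is not None:
            if key in component.offers:
                return component.offers[key]
            component = component.parent
        raise LookupError("no component offers %r" % (key,))


app = Component()
app.offer(("DMFor", "Message"), "the-message-data-manager")

child = Component(parent=app)
grandchild = Component(parent=child)
found = grandchild.obtain(("DMFor", "Message"))
```

The grandchild never names the application component; it only names the key, which is what keeps the two loosely coupled.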

Anyway, once we've got the data manager to be used for Message instances, we can look up any Message instance in it by using the instance's object id (in this case, its forname) as a key. In our case, that key is the string passed in as an argument to our command (self.argv[1], supplied to us by the peak script). As previously discussed, this gives us a Message instance, which we can then display by printing its text attribute.

Our last addition to HelloWorld is the use of beginTransaction and commitTransaction to enclose our data manipulation in a transaction. For our simple read-only application here, a transaction is of relatively little importance, but transaction management will matter a great deal when you use read/write databases.

Anyway, our program now works like this:


Let's recap some of the key topics we've covered in this lesson:

Bindings

Bindings are attribute descriptors, like property, but execute only once per instance unless the attribute is deleted.
binding.Obtain obtains a configuration value or component by searching parent components for "offered" values.
binding.Make invokes a function or class constructor (possibly after importing them) to create an attribute value.
Bindings can have an offerAs keyword, that "offers" the attribute under one or more "configuration keys", that then can be looked up by child components using Obtain.

Domain Model

Elements are persistent "problem domain" objects.
Features are attributes representing a relationship between elements and other elements or values. They're defined using classes nested within the element class, and they specify (among other things) the type of related element or value to be used.

Data Managers

A data manager is a "virtual container" for elements, that retrieves and stores their data for them.
Data managers initially return "ghost" elements, then populate them with data when actually used. (So that retrieving one element doesn't cause all related elements to be retrieved at once.)
By defining the _load method in a data manager subclass, you can implement whatever data retrieval is needed to populate an element.
Data managers should be used within the scope of a transaction.

Commands

An AbstractCommand receives its arguments in self.argv, a list similar to sys.argv.

Whew! That's a lot of things, but we're still only scratching the surface. Each of the major topic areas listed above could be expanded into entire tutorials of their own. But, for now, you may want to simply experiment a bit with what you've seen so far, before delving deeper into the PEAK API documentation and source code.
