[TransWarp] XMI Persistence Requirements
Phillip J. Eby
pje at telecommunity.com
Sun Jun 30 16:55:35 EDT 2002
One unusual application of the jars model is persistence in XML form. One
could create a jar which took an XML document as a parameter, and then
returned persistent objects from it. I'd like to use this to resurrect
TransWarp's XMI support in a more-usable form. Since XMI can be used to
represent data according to any model which can be expressed in terms of
the MOF, it's more than adequate to represent application data for
persistence. XMI is also a good format for metadata interchange,
especially with today's UML tools.
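
As a rough illustration of the jar idea above, here is a minimal sketch in Python. It is not PEAK's or TransWarp's API: the names `XMIJar` and `Element`, the sample document, and the choice of the standard-library `xml.etree.ElementTree` parser are all assumptions made for the example. The jar takes the XML text as a parameter, indexes elements by their `xmi.id` attribute, and only builds an object the first time it is asked for one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; real XMI files carry far more structure.
SAMPLE = """<XMI xmi.version="1.1">
  <XMI.content>
    <Class xmi.id="c1" name="Customer"/>
    <Class xmi.id="c2" name="Order"/>
  </XMI.content>
</XMI>"""


class Element:
    """Hypothetical domain object built from one XML element."""

    def __init__(self, oid, kind, name):
        self.oid = oid
        self.kind = kind
        self.name = name


class XMIJar:
    """Hypothetical jar: takes an XML document, hands back objects by ID."""

    def __init__(self, text):
        root = ET.fromstring(text)
        # Index every element that carries an xmi.id; no objects are built yet.
        self._nodes = {
            node.get("xmi.id"): node
            for node in root.iter()
            if node.get("xmi.id") is not None
        }
        self._cache = {}

    def __getitem__(self, oid):
        # Build each object at most once so repeated lookups share identity.
        if oid not in self._cache:
            node = self._nodes[oid]
            self._cache[oid] = Element(oid, node.tag, node.get("name"))
        return self._cache[oid]


jar = XMIJar(SAMPLE)
customer = jar["c1"]
```

This sketch only reaches elements that have an `xmi.id`, so it sidesteps the persistent-ID question raised under the open issues.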
Why use XMI as a persistence format? I see it as a vehicle for the
following, in decreasing priority order:
1. Expressing metadata: data models, object models, workflow models, etc.
2. Permitting easy tests of domain-level operations against data that can
be read and edited by humans, as well as easily compared against "correct"
result texts.
3. An XML pickling format for archiving or moving objects or transactions
between databases.
So here's what I'd like to see PEAK support in the way of XMI persistence
capabilities:
CRITICAL: support XMI 1.1, the current standard, which is *much* easier for
humans to read and edit than XMI 1.0.
CRITICAL: retrieve model elements from an XMI file using persistence and
ghosts, so that an entire model need not be immediately marshalled into
objects from the XML.
CRITICAL: the mapping between the objects to be loaded and the XMI format
should be specifiable in terms of a MOF model loaded from another XMI
document. That is, we shouldn't have to custom-write mappings for UML and
CWM documents, since they have XMI-format MOF models. And any new UML
structural models created for applications can be translated to MOF models
and hence saved as an XMI-format model. In other words, we shouldn't have
to hand-write XMI persistence mappings ever, except for maybe one to
bootstrap the MOF metamodel in the first place.
CRITICAL: support saving modified objects as a new XMI document, with
whitespace and formatting optional.
CRITICAL: support the XMI document itself being a persistent object whose
state (text) is saved in another database, such as ZODB, an RDBMS, or even
in the filesystem. (This actually shouldn't require any special actions,
as long as the XMI document object itself is written to be
persistent. Committing the objects loaded from the XMI document will
modify the XMI document, causing it to get committed as well, which will
write it to the underlying DB, which will then get committed...)
IMPORTANT: support saving as a modified in-place XMI document, with any
vendor XMI extensions/annotations remaining in place.
HELPFUL: support saving a modified in-place XMI document with all comments,
existing whitespace, etc. intact.
HELPFUL: support XMI 1.0 (this might move up to CRITICAL if the UML tools
Ty and I use don't end up supporting XMI 1.1 very well).
NICE-TO-HAVE: support external references to model elements outside the XMI
document, via appropriate plug-ins to look them up with.
NICE-TO-HAVE: support writing a commit transaction as an "XMI.diff"-format
file, and support reading and applying it to the *objects* (not the XMI
document) it refers to. This would give us a kind of domain-level
"transaction log" or "record and play back commands" capability. But it'll
probably be a long while before we can even figure out *how* to do this.
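
To make the "persistence and ghosts" requirement above a little more concrete, here is a hedged sketch of a ghost in Python. The `Ghost` class is hypothetical, and copying XML attributes straight into instance state is a stand-in for whatever a MOF-driven mapping would actually do. The point is only that the object holds its XML node and defers loading state until an attribute is first read.

```python
import xml.etree.ElementTree as ET


class Ghost:
    """Hypothetical ghost: keeps only its XML node until an attribute is read."""

    def __init__(self, node):
        # Write through __dict__ so these names never trigger __getattr__.
        self.__dict__["_node"] = node
        self.__dict__["_loaded"] = False

    def _load(self):
        # Stand-in for a real mapping: copy XML attributes into instance state,
        # replacing dots so that "xmi.id" becomes the attribute "xmi_id".
        for key, value in self._node.attrib.items():
            self.__dict__.setdefault(key.replace(".", "_"), value)
        self.__dict__["_loaded"] = True

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        if name.startswith("_") or self.__dict__["_loaded"]:
            raise AttributeError(name)
        self._load()
        return getattr(self, name)


node = ET.fromstring('<Class xmi.id="c1" name="Customer"/>')
ghost = Ghost(node)
loaded_before = ghost._loaded   # state not yet read from the XML
name = ghost.name               # first access triggers the load
loaded_after = ghost._loaded
```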
Open issues in the design:
* Some kind of "metamodel registry" is needed, so the XMI jar can find the
right metamodel mapping, based on the header info in the XML file.
* What data structure should be used for the document itself? One of the
Python DOMs (big and slow)? The RXP "lightweight" DOM (small and
superfast, but lacking in backpointers, and I believe it drops out comments
and whitespace)? A custom construct of our own, based on SAX/SOX?
* What should be used as persistent ID's? XMI elements aren't required to
have an ID, so both the XMI.id and XMI.uuid attributes can only be used as
alternate keys. XPath is probably too heavy, Python id() would only work
if we kept some kind of mapping back to the nodes, and using the nodes
themselves as ID's won't work for the RXP lightweight DOM if we need to get
to a parent object. It's probably important to note that it's not just XML
elements that can represent an object; attributes in XMI can effectively
represent a PersistentList that has to be considered an object with a
persistent ID for our purposes. That is, attribute nodes need to be
referenceable too.
* Reference counting/GC. Objects are written in an XMI file as nested
within their containing objects if the relationship is compositional, or on
a first-reference basis if they have no composition container. This means
that if the relationship that caused them to be written in full in the
source document is removed in the transaction, the object's XML elements
must be moved to another location, if one exists, or deleted if no other
reference to the object exists. If we are saving objects by creating a new
document, this isn't a big deal because we'll just write what we need. If
we're trying to simply *modify* a DOM rather than writing a new document,
we'll need to know where all the references to an item are and do some
serious editing.
* The precise nature of the metamodel mapping. I have some uncertainties
right now about whether an XMI DTD can be unambiguously decoded using data
which is strictly in the MOF model from which the DTD was generated. It
seems to me there are places where the implementer of the DTD has a choice
in how to express items. :( This is not a problem for our own DTD's and
custom metamodels, but is an issue in reading and writing intelligible UML
models for tools that might only be able to handle reading something that's
written the way they write it, and not another way that's perfectly valid
according to spec. :(
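
The "metamodel registry" issue above could be sketched roughly as follows. This is an assumption-laden illustration, not a design: `REGISTRY`, `register`, and `find_mapping` are hypothetical names, the mapping values are placeholder strings, and the lookup assumes the header contains an `XMI.metamodel` element with `xmi.name` and `xmi.version` attributes, as XMI 1.1 documents typically do.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: (metamodel name, version) -> mapping object.
REGISTRY = {}


def register(name, version, mapping):
    """Record the mapping to use for one metamodel name/version pair."""
    REGISTRY[(name, version)] = mapping


def find_mapping(xmi_text):
    """Pick a mapping based on the XMI.metamodel element in the header."""
    root = ET.fromstring(xmi_text)
    # Take the first XMI.metamodel element found anywhere in the document.
    meta = next(root.iter("XMI.metamodel"), None)
    if meta is None:
        raise LookupError("no XMI.metamodel element found")
    key = (meta.get("xmi.name"), meta.get("xmi.version"))
    if key not in REGISTRY:
        raise LookupError("no mapping registered for %r" % (key,))
    return REGISTRY[key]


# Placeholder mapping; a real one would be derived from a MOF model.
register("UML", "1.3", "uml-1.3-mapping")

DOC = """<XMI xmi.version="1.1">
  <XMI.header>
    <XMI.metamodel xmi.name="UML" xmi.version="1.3"/>
  </XMI.header>
  <XMI.content/>
</XMI>"""

mapping = find_mapping(DOC)
```

Keying on both name and version keeps different metamodel versions from silently sharing a mapping. It says nothing about the harder ambiguity question in the last item above.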
Summary: It looks like there are many issues that need research before work
on the design itself can proceed. I need to look more thoroughly at both
the DOM implementations that are out there, and the XMI specs.