[TransWarp] Handling names, addresses, and rename operations in peak.naming
Phillip J. Eby
pje at telecommunity.com
Wed Dec 4 00:22:59 EST 2002
Just some late-night thoughts sorting out some remaining conceptual issues
in peak.naming... If you have trouble following me, don't worry, I'm
probably having trouble too; that's why I'm writing this. :)
A "naming authority" is a system that knows how to resolve a name. In an
HTTP URL, the "http://xyz/" part of the URL is the naming authority. The
rest of the URL before the query string or fragment ID is the "name"
portion of the URL.
SMTP URLs are addresses. They denote only a naming authority, but no
actual name within that authority.
LDAP URLs are names; they contain both a naming authority and a path.
A path doesn't have to be multi-part, even conceptually, to be considered a
name - a flat namespace is still a namespace.
The grey area lies with things like our database URLs, which often refer to
a database server as a naming authority, and then reference a
database. Should these be considered addresses, because they name a
specific database, or names, because they are resolved by the naming authority?
Intuitively, I want to say that a DB URL is an address, because there is no
indirection taking place. Yet a "file:" URL seems to be a name, and there
isn't necessarily any indirection there. Of course, there *could* be
indirection in a file system.
Perhaps the real question as to whether something is a name or address is
whether the naming authority by itself could be usefully considered a
naming context. Could you retrieve items from it? Put new items in
it? But then that seems to make DB servers naming contexts, because it's
actually pretty reasonable to think that you could add or remove databases
from the database server, or even list the current databases, if you had
appropriate access.
It seems that all names can be represented with a simple two-level
structure. The outer structure is a CompositeName, whose first element is
an optional naming authority, specified by a URL up through any naming
authority portions. Subsequent elements are paths within each respective
naming system. Thus it's possible to describe a name such as:
ldap://somewhere.com/cn=thingy,ou=documents,o=somewhere.com/imagesDir/anImage.gif
Which would be represented as:
    naming.CompositeName(
        [ ldapURL('ldap', '//somewhere.com/'),
          naming.CompoundName(
              [ 'o=somewhere.com', 'ou=documents', 'cn=thingy' ]
          ),
          naming.CompoundName(
              [ 'imagesDir', 'anImage.gif' ]
          ),
        ]
    )
(Of course, this translation should be internal to the naming system;
you'll never see this as a user, only possibly as an implementer of a
namespace or maybe even only as a PEAK internals developer.)
Anyway, the idea of the name shown is that it's a composite name that goes
through two naming systems: first LDAP, and then something else. That
something else might be HTTP, FTP, or maybe even the local
filesystem. Perhaps it's something else we haven't thought of
yet. There's a reference in the LDAP directory that tells us where to
go. No problem.
It seems to me that every context should know its full CompositeName. For
a lot of our trivial contexts, this will be something like
naming.CompositeName(['somescheme:']), and that's it. That's
okay. Anyway, if every context knows its full name, then we can do some
important things for "rename()" operations. A rename needs to find a
context whose full name is a prefix of both the origin and destination
names. The most reliable way to do this is to perform a "resolve()" on
both names, and then look for the common prefix of the names of the
contexts returned. If it happens to be the same as one of the contexts, so
much the better, otherwise we have to resolve the newly computed prefix
again. One constraint: the outer composite names must match in every
element before the last, otherwise the source and target names are in
different naming systems. But we can't tell this until *after* we resolve
them, because we don't know ahead of time where symlinks, referrals,
cross-system thunks, and other goofy things might come into play. What if
our naming system supports traversing into zipfiles, for example? Sheesh.
Okay, so it might be a little inefficient. If you want fast renames,
you'll just have to look up a common context first, and not do something like:
    naming.rename('a:veryverylongURL','a:veryveryveryverylongURL')
This of course assumes that we are going to add context-less functions for
all the standard naming operations. It certainly could be handy for things
like:
    naming.bind("file:///somefile", documentToSave)
and better, letting you implement a "Save as:" function in your editor that
supports arbitrary ways to do so:
    naming.bind("ftp://somehost/somefile", documentToSave)    # FTP upload
    naming.bind("http://somehost/somefile", documentToSave)   # WebDAV PUT
Or the piece de resistance...
    naming.bind("file:///aZipFile.zip/someInternalPath/somefile", documentToSave)
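To make the "context-less functions" idea a bit more concrete, here's a rough sketch of how module-level bind() and lookup() could dispatch on the URL scheme. Everything in it (registerScheme, MemoryContext, the _schemeContexts registry) is made up for illustration and is not existing peak.naming API; a real version would hand the rest of the URL to a proper context rather than a dictionary.

```python
# Hypothetical sketch: module-level naming functions that dispatch on scheme.
# None of these names are peak.naming API; they only illustrate the idea.

_schemeContexts = {}    # scheme name -> context object handling that scheme


class MemoryContext:
    """Toy context that keeps its bindings in a dictionary."""

    def __init__(self):
        self.bindings = {}

    def bind(self, name, obj):
        self.bindings[name] = obj

    def lookup(self, name):
        return self.bindings[name]


def registerScheme(scheme, context):
    """Associate a URL scheme with the context that should handle it."""
    _schemeContexts[scheme.lower()] = context


def _split(url):
    """Return (context, rest-of-url) for a 'scheme:rest' string."""
    scheme, sep, rest = url.partition(':')
    if not sep or scheme.lower() not in _schemeContexts:
        raise LookupError("no context registered for %r" % (url,))
    return _schemeContexts[scheme.lower()], rest


def bind(url, obj):
    """Context-less bind: find the scheme's context and delegate to it."""
    ctx, rest = _split(url)
    ctx.bind(rest, obj)


def lookup(url):
    """Context-less lookup, delegating the same way as bind()."""
    ctx, rest = _split(url)
    return ctx.lookup(rest)
```

With something like registerScheme('mem', MemoryContext()) in place, bind('mem:some/doc', obj) followed by lookup('mem:some/doc') hands back the same object, and the caller never touches a context directly.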
Okay. So what do we need to add to IBasicContext? A method to get the
"full name". I guess we could call it the same thing they use in JNDI:
getNameInNamespace(). So far so good. We need a default implementation in
AbstractContext and GenericURLContext. Don't know how to do that
yet. None of our current context subclasses deal with hierarchy; none of
them really represent a "place" as such; if you create different instances,
they're really all the same "place". So the default implementation I guess
could just say it's at "myscheme:" as the first part of the composite name,
and the rest would be empty. I guess any new context we create that's
hierarchical will need to pass into its children what their names
are, and/or the children will need to look up to the parents. But, the
child would have to know what to add to the parent, so it's better for the
parent to name the child. This is generally how it works with humans, so
the algorithm is known to work. :)
All the object factory methods (e.g. address.retrieve(), IObjectFactory,
etc.) receive both a name and the parent context, so they could actually
assemble this information. We even have a field already reserved for a
component name to be passed to a child constructor, so we could use
that. It would need to be guaranteed to be a CompoundName, though, if we
don't want a getComponentPath() operation to be confused. Yecch. Better
to have a keyword argument - perhaps 'nameInContext'. Yeah. Now if every
context carries a 'namingAuthority' as well, we can in principle compose a
local name simply by walking backwards up our parents with the same naming
authority, concatenating compoundNames as we go. We want to use compound
names at each level rather than a single element at each level, so that
it's not necessary for a context to create a bunch of intermediate contexts
when somebody looks up say, 'foo/bar/baz' in a file:// context. We'd
rather it simply create the 'foo/bar' context within itself.
Hm. What's funny about this is, it suddenly seems as though there's no
longer a reason to have multiple parts to an "outer" composite name, apart
from a naming authority and a single compoundName path. That's because if
you cross over from one naming system to another, there's no point in
retaining the path that carried you over. A nice simplification.
So what features are needed to do this? We need bindings for
namingAuthority and nameInContext. namingAuthority should be based off of
the default URL scheme for the context class, or the applicable URL scheme
for the GenericURLContext instance. For "nested" contexts, such as a
zipfile context that lives in a file system, the context should use its
container's namingAuthority. nameInContext should default to an empty
compound name. When a context factory such as an IObjectFactory returns a
subcontext, or if the context itself creates or returns a subcontext, it
should pass in a 'nameInContext' parameter to the new context to tell it
where it "lives" in relation to its parent. getNameInNamespace() is then a
simple upward walk, gathering compoundName objects as long as the
namingAuthority remains the same.
The namingAuthority for hierarchical contexts in general should default to
that of the parent. It's only when a URL or other address mechanism is
used to jumpstart a new naming system, that namingAuthority should be
reset. This suggests that URL classes used to retrieve a naming context
should set namingAuthority when they retrieve the context. This could be
implemented in the _resolveURL() method of such a URL context.
Okay, new twist - what happens with "symlinks" or LinkRef()s within a
naming system? Simply concatenating names traversed could get quite
silly. Likewise if somebody goes to "foo/../bar/../baz/../spam". We'd
rather see the resulting context call itself "spam" instead of all the
other garbage.
Alright, here's how to fix that. We make namingAuthority "None" by
default, and it's only set by URL retrieval or URL context factories -- or
when we want to give an object an "absolute" name. Now we implement
getNameInNamespace() as an upward walk until a non-None namingAuthority is
found. Good; that makes a lot of cases simpler. Now, children never have
to mess with their own names in any way; either their parents or
"godparents" (URL.retrieve()/object factory methods) give them their names.
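Here's a minimal sketch of that upward walk, just to check that the idea hangs together. SketchContext is a hypothetical stand-in, not AbstractContext; the only names taken from the discussion above are namingAuthority, nameInContext, and getNameInNamespace(), and compound names are modeled as plain tuples of strings.

```python
class SketchContext:
    """Hypothetical stand-in for a naming context (not PEAK's real class)."""

    def __init__(self, parent=None, nameInContext=(), namingAuthority=None):
        self.parent = parent
        # Compound name of this context relative to its parent, as a tuple
        self.nameInContext = tuple(nameInContext)
        # None unless set by URL retrieval or to give an "absolute" name
        self.namingAuthority = namingAuthority

    def getNameInNamespace(self):
        """Walk upward until a context with a non-None authority is found."""
        parts = []
        ctx = self
        while ctx is not None:
            parts.append(ctx.nameInContext)
            if ctx.namingAuthority is not None:
                break               # reached the start of this naming system
            ctx = ctx.parent
        authority = ctx.namingAuthority if ctx is not None else None
        # parts were gathered child-first, so reverse before flattening
        path = tuple(p for part in reversed(parts) for p in part)
        return authority, path
```

A context reached through 'foo/bar' and then 'baz' under a 'file:' root reports ('file:', ('foo', 'bar', 'baz')), while a context that was handed its own namingAuthority and a nameInContext of ('spam',) reports just ('file:', ('spam',)), no matter how it was reached.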
Hm. So what's left to figure out? We need something to get a common
prefix of two names. But it's only actually needed for compound names,
since we now know that our hypothetical "absolute composite name" only ever
has a naming authority and a compound name. If the authorities don't match
between two names, you can't rename one to the other. So we don't even
need getNameInNamespace() to return a true "name"; really it should just
return an '(authority,compoundName)' tuple.
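With names reduced to '(authority,compoundName)' tuples, the common-prefix step for rename() is tiny. A sketch, assuming compound names are plain tuples and that mismatched authorities should simply raise an error (commonPrefix is a made-up name, not an existing function):

```python
def commonPrefix(nameA, nameB):
    """Return the (authority, path) prefix shared by two absolute names.

    Each name is an (authority, compoundName) tuple, with compoundName
    modeled here as a tuple of strings.
    """
    authA, pathA = nameA
    authB, pathB = nameB
    if authA != authB:
        # Different naming systems: a rename between them can't work
        raise ValueError("names are in different naming systems")
    prefix = []
    for a, b in zip(pathA, pathB):
        if a != b:
            break
        prefix.append(a)
    return authA, tuple(prefix)
```

A rename() would then resolve both names, call getNameInNamespace() on the resulting contexts, and look up the context named by the returned prefix if neither context already has that name.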
URL objects should support extracting a naming authority. Naming
authorities should be hashable and comparable, and equivalent authorities
should hash and compare equally. (This implies that they must be in
canonical form; e.g. default port numbers filled in, etc.) Ideally, there
should also be a way to go from a naming authority and a compound name,
back to a URL. The precise mechanics of this require further thought.
Tomorrow. :)
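For the record, one way the canonical-form requirement could look: reduce a URL to a (scheme, host, port) tuple with the default port filled in, so that equivalent authorities hash and compare equally. This is only a sketch; the namingAuthority() function and the DEFAULT_PORTS table are assumptions for illustration, and it relies on Python's standard urllib.parse rather than any PEAK URL class.

```python
from urllib.parse import urlsplit

# Illustrative table of default ports; a real one would be per URL class.
DEFAULT_PORTS = {'http': 80, 'https': 443, 'ftp': 21, 'ldap': 389}


def namingAuthority(url):
    """Extract a canonical, hashable naming authority from a URL string."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or '').lower()
    port = parts.port
    if port is None:
        # Fill in the scheme's default so 'xyz' and 'xyz:80' compare equal
        port = DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)
```

Since the result is a plain tuple, it can be used as a dictionary key and compared with ==; namingAuthority('http://XYZ/a') and namingAuthority('http://xyz:80/b') come out identical. Going back from an authority and a compound name to a URL is the part that still needs thought.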