PkgResources
The pkg_resources module distributed with setuptools provides an API for Python libraries to access their resource files, and for extensible applications and frameworks to automatically discover plugins. It also provides runtime support for using C extensions that are inside zipfile-format eggs, support for merging packages that have separately-distributed modules or subpackages, and APIs for managing Python's current "working set" of active packages.
Eggs are a distribution format for Python modules, similar in concept to Java's "jars" or Ruby's "gems". They differ from previous Python distribution formats in that they are importable (i.e. they can be added to sys.path), and they are discoverable, meaning that they carry metadata that unambiguously identifies their contents and dependencies, and thus can be automatically found and added to sys.path in response to simple requests of the form, "get me everything I need to use docutils' PDF support".
The pkg_resources module provides runtime facilities for finding, introspecting, activating and using eggs and other "pluggable" distribution formats. Because these are new concepts in Python (and not that well-established in other languages either), it helps to have a few special terms for talking about eggs and how they can be used:
(For more information about these terms and concepts, see also this architectural overview of pkg_resources and Python Eggs in general.)
A namespace package is a package that only contains other packages and modules, with no direct contents of its own. Such packages can be split across multiple, separately-packaged distributions. Normally, you do not need to use the namespace package APIs directly; instead you should supply the namespace_packages argument to setup() in your project's setup.py. See the setuptools documentation on namespace packages for more information.
However, if for some reason you need to manipulate namespace packages or directly alter sys.path at runtime, you may find these APIs useful:
Although by default pkg_resources only supports namespace packages for filesystem and zip importers, you can extend its support to other "importers" compatible with PEP 302 using the register_namespace_handler() function. See the section below on Supporting Custom Importers for details.
The WorkingSet class provides access to a collection of "active" distributions. In general, there is only one meaningful WorkingSet instance: the one that represents the distributions that are currently active on sys.path. This global instance is available under the name working_set in the pkg_resources module. However, specialized tools may wish to manipulate working sets that don't correspond to sys.path, and therefore may wish to create other WorkingSet instances.
It's important to note that the global working_set object is initialized from sys.path when pkg_resources is first imported, but is only updated if you do all future sys.path manipulation via pkg_resources APIs. If you manually modify sys.path, you must invoke the appropriate methods on the working_set instance to keep it in sync. Unfortunately, Python does not provide any way to detect arbitrary changes to a list object like sys.path, so pkg_resources cannot automatically update the working_set based on changes to sys.path.
Create a WorkingSet from an iterable of path entries. If entries is not supplied, it defaults to the value of sys.path at the time the constructor is called.
Note that you will not normally construct WorkingSet instances yourself, but instead you will implicitly or explicitly use the global working_set instance. For the most part, the pkg_resources API is designed so that the working_set is used by default, such that you don't have to explicitly refer to it most of the time.
The following methods of WorkingSet objects are also available as module-level functions in pkg_resources that apply to the default working_set instance. Thus, you can use e.g. pkg_resources.require() as an abbreviation for pkg_resources.working_set.require():
Ensure that distributions matching requirements are activated
requirements must be a string or a (possibly-nested) sequence thereof, specifying the distributions and versions required. The return value is a sequence of the distributions that needed to be activated to fulfill the requirements; all relevant distributions are included, even if they were already activated in this working set.
For the syntax of requirement specifiers, see the section below on Requirements Parsing.
In general, it should not be necessary for you to call this method directly. It's intended more for use in quick-and-dirty scripting and interactive interpreter hacking than for production use. If you're creating an actual library or application, it's strongly recommended that you create a "setup.py" script using setuptools, and declare all your requirements there. That way, tools like EasyInstall can automatically detect what requirements your package has, and deal with them accordingly.
Note that calling require('SomePackage') will not install SomePackage if it isn't already present. If you need to do this, you should use the resolve() method instead, which allows you to pass an installer callback that will be invoked when a needed distribution can't be found on the local machine. You can then have this callback display a dialog, automatically download the needed distribution, or whatever else is appropriate for your application. See the documentation below on the resolve() method for more information, and also on the obtain() method of Environment objects.
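The difference between require() and resolve() can be sketched as follows. This is a minimal illustration, not a recommended installer: the project name is a hypothetical one assumed not to be installed, the installer callback only records what it was asked for, and an empty WorkingSet and Environment are used so that nothing on sys.path is touched.

```python
import pkg_resources

# Hypothetical requirement, assumed not to be installed anywhere.
MISSING = "hypothetical-missing-project>=1.0"

# require() only activates what is already present; it never installs.
try:
    pkg_resources.require(MISSING)
    require_failed = False
except pkg_resources.DistributionNotFound:
    require_failed = True

# resolve() accepts an installer callback, which is invoked for each
# requirement that no available distribution satisfies.
requested = []


def installer(requirement):
    """Record the request; a real callback might download the project."""
    requested.append(requirement)
    return None  # None means "could not obtain a distribution"


empty_set = pkg_resources.WorkingSet([])
empty_env = pkg_resources.Environment([])
try:
    empty_set.resolve(
        pkg_resources.parse_requirements(MISSING),
        env=empty_env,
        installer=installer,
    )
    resolve_failed = False
except pkg_resources.DistributionNotFound:
    resolve_failed = True
```

Because the callback returns None here, resolve() still fails, but only after the application has had a chance to react to the missing requirement.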
Locate distribution specified by requires and run its script_name script. requires must be a string containing a requirement specifier. (See Requirements Parsing below for the syntax.)
The script, if found, will be executed in the caller's globals. That's because this method is intended to be called from wrapper scripts that act as a proxy for the "real" scripts in a distribution. A wrapper script usually doesn't need to do anything but invoke this function with the correct arguments.
If you need more control over the script execution environment, you probably want to use the run_script() method of a Distribution object's Metadata API instead.
Yield entry point objects from group matching name
If name is None, yields all entry points in group from all distributions in the working set, otherwise only ones matching both group and name are yielded. Entry points are yielded from the active distributions in the order that the distributions appear in the working set. (For the global working_set, this should be the same as the order that they are listed in sys.path.) Note that within the entry points advertised by an individual distribution, there is no particular ordering.
Please see the section below on Entry Points for more information.
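As a brief sketch of the two calling styles, the snippet below lists entry points with and without a name filter. The group "console_scripts" and the name "pip" are only examples; what is actually found depends entirely on which distributions are installed.

```python
import pkg_resources

# All entry points in one group, in working-set order.
found = [
    (entry_point.name, entry_point.module_name)
    for entry_point in pkg_resources.iter_entry_points("console_scripts")
]

# Supplying a name narrows the results to entry points with that name.
named = list(pkg_resources.iter_entry_points("console_scripts", "pip"))
```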
These methods are used to query or manipulate the contents of a specific working set, so they must be explicitly invoked on a particular WorkingSet instance:
Add a path item to the entries, finding any distributions on it. You should use this when you add additional items to sys.path and you want the global working_set to reflect the change. This method is also called by the WorkingSet() constructor during initialization.
This method uses find_distributions(entry, True) to find distributions corresponding to the path entry, and then add() them. Note, however, that entry is always appended to the entries attribute, even if it is already present. (This is because sys.path can contain the same value more than once, and the entries attribute should be able to reflect this.)
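A minimal sketch of keeping the global working_set in sync after a manual sys.path change might look like this. The directory is a hypothetical plugin location; a fresh temporary directory stands in for it, so no distributions are actually found.

```python
import sys
import tempfile

import pkg_resources

# Hypothetical directory that is added to sys.path by hand.
plugin_dir = tempfile.mkdtemp()
sys.path.append(plugin_dir)

# The global working_set was built when pkg_resources was first imported,
# so it does not know about the new entry until it is told explicitly.
pkg_resources.working_set.add_entry(plugin_dir)

in_sync = plugin_dir in pkg_resources.working_set.entries
```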
List all distributions needed to (recursively) meet requirements
requirements must be a sequence of Requirement objects. env, if supplied, should be an Environment instance. If not supplied, an Environment is created from the working set's entries. installer, if supplied, will be invoked with each requirement that cannot be met by an already-installed distribution; it should return a Distribution or None. (See the obtain() method of Environment Objects, below, for more information on the installer argument.)
Add dist to working set, associated with entry
If entry is unspecified, it defaults to dist.location. On exit from this routine, entry is added to the end of the working set's .entries (if it wasn't already present).
dist is only added to the working set if it's for a project that doesn't already have a distribution active in the set. If it's successfully added, any callbacks registered with the subscribe() method will be called. (See Receiving Change Notifications, below.)
Note: add() is automatically called for you by the require() method, so you don't normally need to use this method directly.
Extensible applications and frameworks may need to receive notification when a new distribution (such as a plug-in component) has been added to a working set. This is what the subscribe() method and add_activation_listener() function are for.
Invoke callback(distribution) once for each active distribution that is in the set now, or gets added later. Because the callback is invoked for already-active distributions, you do not need to loop over the working set yourself to deal with the existing items; just register the callback and be prepared for the fact that it will be called immediately by this method.
Note that callbacks must not allow exceptions to propagate, or they will interfere with the operation of other callbacks and possibly result in an inconsistent working set state. Callbacks should use a try/except block to ignore, log, or otherwise process any errors, especially since the code that caused the callback to be invoked is unlikely to be able to handle the errors any better than the callback itself.
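One way to follow this advice is to wrap every listener so that errors are logged rather than propagated. The decorator, logger name, and handler below are hypothetical helpers written for illustration; they are not part of pkg_resources.

```python
import functools
import logging

log = logging.getLogger("plugin_listener")  # hypothetical logger name


def safe_listener(callback):
    """Wrap callback so that no exception escapes to the working set."""
    @functools.wraps(callback)
    def wrapper(distribution):
        try:
            callback(distribution)
        except Exception:
            # Log and swallow: the code that triggered the callback is
            # unlikely to handle the error any better.
            log.exception("listener failed for %r", distribution)
    return wrapper


@safe_listener
def on_activate(distribution):
    # Hypothetical handler that fails, to demonstrate the wrapper.
    raise RuntimeError("plugin setup failed")
```

A listener wrapped this way could then be registered with pkg_resources.working_set.subscribe(on_activate); it would be called immediately for each distribution that is already active.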
pkg_resources.add_activation_listener() is an alternate spelling of pkg_resources.working_set.subscribe().
Extensible applications will sometimes have a "plugin directory" or a set of plugin directories, from which they want to load entry points or other metadata. The find_plugins() method allows you to do this, by scanning an environment for the newest version of each project that can be safely loaded without conflicts or missing requirements.
Scan plugin_env and identify which distributions could be added to this working set without version conflicts or missing requirements.
Example usage:
    from pkg_resources import Environment, working_set

    distributions, errors = working_set.find_plugins(
        Environment(plugin_dirlist)
    )
    for dist in distributions:           # add plugins+libs to sys.path
        working_set.add(dist)
    print("Couldn't load", errors)       # display errors
The plugin_env should be an Environment instance that contains only distributions that are in the project's "plugin directory" or directories. The full_env, if supplied, should be an Environment instance that contains all currently-available distributions.
If full_env is not supplied, one is created automatically from the WorkingSet this method is called on, which will typically mean that every directory on sys.path will be scanned for distributions.
This method returns a 2-tuple: (distributions, error_info), where distributions is a list of the distributions found in plugin_env that were loadable, along with any other distributions that are needed to resolve their dependencies. error_info is a dictionary mapping unloadable plugin distributions to an exception instance describing the error that occurred. Usually this will be a DistributionNotFound or VersionConflict instance.
Most applications will use this method mainly on the master working_set instance in pkg_resources, and then immediately add the returned distributions to the working set so that they are available on sys.path. This will make it possible to find any entry points, and allow any other metadata tracking and hooks to be activated.
The resolution algorithm used by find_plugins() is as follows. First, the project names of the distributions present in plugin_env are sorted. Then, each project's eggs are tried in descending version order (i.e., newest version first).
An attempt is made to resolve each egg's dependencies. If the attempt is successful, the egg and its dependencies are added to the output list and to a temporary copy of the working set. The resolution process continues with the next project name, and no older eggs for that project are tried.
If the resolution attempt fails, however, the error is added to the error dictionary. If the fallback flag is true, the next older version of the plugin is tried, until a working version is found. If false, the resolution process continues with the next plugin project name.
Some applications may have stricter fallback requirements than others. For example, an application that has a database schema or persistent objects may not be able to safely downgrade a version of a package. Others may want to ensure that a new plugin configuration is either 100% good or else revert to a known-good configuration. (That is, they may wish to revert to a known configuration if the error_info return value is non-empty.)
Note that this algorithm gives precedence to satisfying the dependencies of alphabetically prior project names in case of version conflicts. If two projects named "AaronsPlugin" and "ZekesPlugin" both need different versions of "TomsLibrary", then "AaronsPlugin" will win and "ZekesPlugin" will be disabled due to version conflict.
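The selection order described above can be modelled with a small toy function. This is only a sketch of the ordering and fallback rules, not the library's implementation: projects and versions are plain tuples, the resolve callback and the REQUIRES table are hypothetical stand-ins for real dependency resolution, and errors are keyed by (name, version) rather than by Distribution objects.

```python
def find_plugins_sketch(plugin_env, resolve, fallback=True):
    """Toy model of the plugin selection order.

    plugin_env maps a project name to a list of comparable versions.
    resolve(name, version, active) returns the (name, version) pairs to
    activate, or raises an exception if resolution fails.
    """
    distributions = []
    errors = {}
    active = {}  # temporary copy of the "working set": name -> version

    for name in sorted(plugin_env):  # project names in sorted order
        for version in sorted(plugin_env[name], reverse=True):  # newest first
            try:
                needed = resolve(name, version, active)
            except Exception as exc:
                errors[(name, version)] = exc
                if fallback:
                    continue  # try the next older version
                break  # no fallback: move on to the next project
            for dep_name, dep_version in needed:
                if dep_name not in active:
                    active[dep_name] = dep_version
                    distributions.append((dep_name, dep_version))
            break  # success: older versions of this project are skipped
    return distributions, errors


# Hypothetical dependency table: each plugin pins one library version.
REQUIRES = {
    ("AaronsPlugin", (1, 0)): ("TomsLibrary", (1, 0)),
    ("ZekesPlugin", (1, 0)): ("TomsLibrary", (2, 0)),
}


def resolve(name, version, active):
    lib, lib_version = REQUIRES[(name, version)]
    if active.get(lib, lib_version) != lib_version:
        raise ValueError("version conflict on %s" % lib)
    return [(lib, lib_version), (name, version)]


loaded, failed = find_plugins_sketch(
    {"AaronsPlugin": [(1, 0)], "ZekesPlugin": [(1, 0)]}, resolve
)
```

In this model "AaronsPlugin" is processed first and fixes the version of "TomsLibrary", so "ZekesPlugin" ends up in the error dictionary.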
An "environment" is a collection of Distribution objects, usually ones that are present and potentially importable on the current platform. Environment objects are used by pkg_resources to index available distributions during dependency resolution.
Create an environment snapshot by scanning search_path for distributions compatible with platform and python. search_path should be a sequence of strings such as might be used on sys.path. If a search_path isn't supplied, sys.path is used.
platform is an optional string specifying the name of the platform that platform-specific distributions must be compatible with. If unspecified, it defaults to the current platform. python is an optional string naming the desired version of Python (e.g. '2.4'); it defaults to the currently-running version.
You may explicitly set platform (and/or python) to None if you wish to include all distributions, not just those compatible with the running platform or Python version.
Note that search_path is scanned immediately for distributions, and the resulting Environment is a snapshot of the found distributions. It is not automatically updated if the system's state changes due to e.g. installation or removal of distributions.
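As a small sketch of the constructor arguments, the snippet below builds two snapshots of a hypothetical plugin directory. A fresh temporary directory stands in for it, so both environments are empty.

```python
import tempfile

import pkg_resources

# Hypothetical plugin directory; empty here, so nothing is found.
plugin_dir = tempfile.mkdtemp()

# Snapshot limited to distributions usable on the running platform/Python.
plugin_env = pkg_resources.Environment([plugin_dir])

# Passing None for platform and python disables compatibility filtering.
unfiltered_env = pkg_resources.Environment(
    [plugin_dir], platform=None, python=None
)

# Iterating an Environment yields the project keys it contains.
project_keys = list(plugin_env)
```

Distributions installed into the directory after this point would not appear in either snapshot; a new Environment (or a call to scan()) would be needed to pick them up.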
Find distribution best matching req and usable on working_set
This calls the find(req) method of the working_set to see if a suitable distribution is already active. (This may raise VersionConflict if an unsuitable version of the project is already active in the specified working_set.) If a suitable distribution isn't active, this method returns the newest distribution in the environment that meets the Requirement in req. If no suitable distribution is found, and installer is supplied, then the result of calling the environment's obtain(req, installer) method will be returned.
Scan search_path for distributions usable on platform
Any distributions found are added to the environment. search_path should be a sequence of strings such as might be used on sys.path. If not supplied, sys.path is used. Only distributions conforming to the platform/python version defined at initialization are added. This method is a shortcut for using the find_distributions() function to find the distributions from each item in search_path, and then calling add() to add each one to the environment.
Requirement objects express what versions of a project are suitable for some purpose. These objects (or their string form) are used by various pkg_resources APIs in order to find distributions that a script or distribution needs.
Create a Requirement object from a string or iterable of lines. A ValueError is raised if the string or lines do not contain a valid requirement specifier, or if they contain more than one specifier. (To parse multiple specifiers from a string or iterable of strings, use parse_requirements() instead.)
The syntax of a requirement specifier can be defined in EBNF as follows:
    requirement  ::= project_name versionspec? extras?
    versionspec  ::= comparison version (',' comparison version)*
    comparison   ::= '<' | '<=' | '!=' | '==' | '>=' | '>'
    extras       ::= '[' extralist? ']'
    extralist    ::= identifier (',' identifier)*
    project_name ::= identifier
    identifier   ::= [-A-Za-z0-9_]+
    version      ::= [-A-Za-z0-9_.]+
Tokens can be separated by whitespace, and a requirement can be continued over multiple lines using a backslash (\\). Line-end comments (using #) are also allowed.
Some examples of valid requirement specifiers:
    FooProject >= 1.2
    Fizzy [foo, bar]
    PickyThing<1.6,>1.9,!=1.9.6,<2.0a0,==2.4c1
    SomethingWhoseVersionIDontCareAbout
The project name is the only required portion of a requirement string, and if it's the only thing supplied, the requirement will accept any version of that project.
The "extras" in a requirement are used to request optional features of a project, that may require additional project distributions in order to function. For example, if the hypothetical "Report-O-Rama" project offered optional PDF support, it might require an additional library in order to provide that support. Thus, a project needing Report-O-Rama's PDF features could use a requirement of Report-O-Rama[PDF] to request installation or activation of both Report-O-Rama and any libraries it needs in order to provide PDF support. For example, you could use:
easy_install.py Report-O-Rama[PDF]
This installs the necessary packages using the EasyInstall program. Alternatively, call pkg_resources.require('Report-O-Rama[PDF]') to add the necessary distributions to sys.path at runtime.
Return true if dist_or_version fits the criteria for this requirement. If dist_or_version is a Distribution object, its project name must match the requirement's project name, and its version must meet the requirement's version criteria. If dist_or_version is a string, it is parsed using the parse_version() utility function. Otherwise, it is assumed to be an already-parsed version.
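A short sketch of parsing a requirement and testing versions against it is shown below, using specifiers from the examples above. The attribute names project_name, specs, and extras are those exposed by Requirement objects; the extras are sorted here because their order is not something this sketch relies on.

```python
from pkg_resources import Requirement

req = Requirement.parse("FooProject >= 1.2")
name = req.project_name      # the project name as written
specs = req.specs            # list of (operator, version) pairs
accepts_new = "1.5" in req   # version strings are parsed before comparing
accepts_old = "1.0" in req

# Extras are exposed as a sequence of names.
extras = sorted(Requirement.parse("Fizzy [foo, bar]").extras)

# parse() accepts exactly one specifier; more than one is a ValueError.
try:
    Requirement.parse("FooProject >= 1.2\nFizzy")
    rejected = False
except ValueError:
    rejected = True
```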
The Requirement object's version specifiers (.specs) are internally sorted into ascending version order, and used to establish what ranges of versions are acceptable. Adjacent redundant conditions are effectively consolidated (e.g. ">1, >2" produces the same results as ">1", and "<2,<3" produces the same results as "<3"). "!=" versions are excised from the ranges they fall within. The version being tested for acceptability is then checked for membership in the resulting ranges. (Note that providing conflicting conditions for the same version (e.g. "<2,>=2" or "==2,!=2") is meaningless and may therefore produce bizarre results when compared with actual version number(s).)
Entry points are a simple way for distributions to "advertise" Python objects (such as functions or classes) for use by other distributions. Extensible applications and frameworks can search for entry points with a particular name or group, either from a specific distribution or from all active distributions on sys.path, and then inspect or load the advertised objects at will.
Entry points belong to "groups" which are named with a dotted name similar to a Python package or module name. For example, the setuptools package uses an entry point group named distutils.commands in order to find commands defined by distutils extensions. setuptools treats the names of entry points defined in that group as the acceptable commands for a setup script.
In a similar way, other packages can define their own entry point groups, either using dynamic names within the group (like distutils.commands), or possibly using predefined names within the group. For example, a blogging framework that offers various pre- or post-publishing hooks might define an entry point group and look for entry points named "pre_process" and "post_process" within that group.
To advertise an entry point, a project needs to use setuptools and provide an entry_points argument to setup() in its setup script, so that the entry points will be included in the distribution's metadata. For more details, see the setuptools documentation.
Each project distribution can advertise at most one entry point of a given name within the same entry point group. For example, a distutils extension could advertise two different distutils.commands entry points, as long as they had different names. However, there is nothing that prevents different projects from advertising entry points of the same name in the same group. In some cases, this is a desirable thing, since the application or framework that uses the entry points may be calling them as hooks, or in some other way combining them. It is up to the application or framework to decide what to do if multiple distributions advertise an entry point; some possibilities include using both entry points, displaying an error message, using the first one found in sys.path order, etc.
In the following functions, the dist argument can be a Distribution instance, a Requirement instance, or a string specifying a requirement (i.e. project name, version, etc.). If the argument is a string or Requirement, the specified distribution is located (and added to sys.path if not already present). An error will be raised if a matching distribution is not available.
The group argument should be a string containing a dotted identifier, identifying an entry point group. If you are defining an entry point group, you should include some portion of your package's name in the group name so as to avoid collision with other packages' entry point groups.
Yield entry point objects from group matching name.
If name is None, yields all entry points in group from all distributions in the working set on sys.path, otherwise only ones matching both group and name are yielded. Entry points are yielded from the active distributions in the order that the distributions appear on sys.path. (Within entry points for a particular distribution, however, there is no particular ordering.)
(This API is actually a method of the global working_set object; see the section above on Basic WorkingSet Methods for more information.)
Create an EntryPoint instance. name is the entry point name. The module_name is the (dotted) name of the module containing the advertised object. attrs is an optional tuple of names to look up from the module to obtain the advertised object. For example, an attrs of ("foo","bar") and a module_name of "baz" would mean that the advertised object could be obtained by the following code:
    import baz
    advertised_object = baz.foo.bar
The extras are an optional tuple of "extra feature" names that the distribution needs in order to provide this entry point. When the entry point is loaded, these extra features are looked up in the dist argument to find out what other distributions may need to be activated on sys.path; see the load() method for more details. The extras argument is only meaningful if dist is specified. dist must be a Distribution instance.
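The module_name and attrs lookup amounts to an import followed by a chain of attribute accesses. The helper below is a hypothetical illustration of that idea, not the EntryPoint implementation; the standard library module os and the attrs ("path", "join") stand in for a real advertised object.

```python
import functools
import importlib
import os.path


def resolve_advertised(module_name, attrs=()):
    """Import module_name, then follow attrs one attribute at a time."""
    module = importlib.import_module(module_name)
    return functools.reduce(getattr, attrs, module)


# Equivalent to: import os; advertised_object = os.path.join
advertised_object = resolve_advertised("os", ("path", "join"))
```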
Parse a single entry point from string src
Entry point syntax follows the form:
name = some.module:some.attr [extra1,extra2]
The entry name and module name are required, but the :attrs and [extras] parts are optional, as is the whitespace shown between some of the items. The dist argument is passed through to the EntryPoint() constructor, along with the other values parsed from src.
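To make the syntax concrete, here is a small regular-expression sketch that splits a specifier into its parts. It is an illustrative parser written for this example, not the one pkg_resources uses, and it returns a plain dictionary rather than an EntryPoint.

```python
import re

# name = module[:attrs] [extras], with optional whitespace between parts.
ENTRY_POINT = re.compile(
    r"\s*(?P<name>.+?)\s*=\s*(?P<module>[\w.]+)\s*"
    r"(?::\s*(?P<attrs>[\w.]+))?\s*"
    r"(?:\[(?P<extras>[^\]]*)\])?\s*$"
)


def parse_entry_point(src):
    """Split an entry point specifier into name, module, attrs, extras."""
    match = ENTRY_POINT.match(src)
    if match is None:
        raise ValueError("not an entry point specifier: %r" % (src,))
    attrs = match.group("attrs")
    extras = match.group("extras")
    return {
        "name": match.group("name"),
        "module_name": match.group("module"),
        "attrs": tuple(attrs.split(".")) if attrs else (),
        "extras": tuple(
            extra.strip() for extra in (extras or "").split(",") if extra.strip()
        ),
    }


full = parse_entry_point("name = some.module:some.attr [extra1,extra2]")
minimal = parse_entry_point("main=pkg.cli")
```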
For simple introspection, EntryPoint objects have attributes that correspond exactly to the constructor argument names: name, module_name, attrs, extras, and dist are all available. In addition, the following methods are provided:
Distribution objects represent collections of Python code that may or may not be importable, and may or may not have metadata and resources associated with them. Their metadata may include information such as what other projects the distribution depends on, what entry points the distribution advertises, and so on.
Most commonly, you'll obtain Distribution objects from a WorkingSet or an Environment. (See the sections above on WorkingSet Objects and Environment Objects, which are containers for active distributions and available distributions, respectively.) You can also obtain Distribution objects from one of these high-level APIs:
However, if you're creating specialized tools for working with distributions, or creating a new distribution format, you may also need to create Distribution objects directly, using one of the three constructors below.
These constructors all take an optional metadata argument, which is used to access any resources or metadata associated with the distribution. metadata must be an object that implements the IResourceProvider interface, or None. If it is None, an EmptyProvider is used instead. Distribution objects implement both the IResourceProvider and IMetadataProvider Methods by delegating them to the metadata object.
Ensure distribution is importable on path. If path is None, sys.path is used instead. This ensures that the distribution's location is in the path list, and it also performs any necessary namespace package fixups or declarations. (That is, if the distribution contains namespace packages, this method ensures that they are declared, and that the distribution's contents for those namespace packages are merged with the contents provided by any other active distributions. See the section above on Namespace Package Support for more information.)
pkg_resources adds a notification callback to the global working_set that ensures this method is called whenever a distribution is added to it. Therefore, you should not normally need to explicitly call this method. (Note that this means that namespace packages on sys.path are always imported as soon as pkg_resources is, which is another reason why namespace packages should not contain any code or import statements.)
The following methods are used to access EntryPoint objects advertised by the distribution. See the section above on Entry Points for more detailed information about these operations:
In addition to the above methods, Distribution objects also implement all of the IResourceProvider and IMetadataProvider Methods (which are documented in later sections):
If the distribution was created with a metadata argument, these resource and metadata access methods are all delegated to that metadata provider. Otherwise, they are delegated to an EmptyProvider, so that the distribution will appear to have no resources or metadata. This delegation approach is used so that supporting custom importers or new distribution formats can be done simply by creating an appropriate IResourceProvider implementation; see the section below on Supporting Custom Importers for more details.
The ResourceManager class provides uniform access to package resources, whether those resources exist as files and directories or are compressed in an archive of some kind.
Normally, you do not need to create or explicitly manage ResourceManager instances, as the pkg_resources module creates a global instance for you, and makes most of its methods available as top-level names in the pkg_resources module namespace. So, for example, this code actually calls the resource_string() method of the global ResourceManager:
    import pkg_resources
    my_data = pkg_resources.resource_string(__name__, "foo.dat")
Thus, you can use the APIs below without needing an explicit ResourceManager instance; just import and use them as needed.
In the following methods, the package_or_requirement argument may be either a Python package/module name (e.g. foo.bar) or a Requirement instance. If it is a package or module name, the named module or package must be importable (i.e., be in a distribution or directory on sys.path), and the resource_name argument is interpreted relative to the named package. (Note that if a module name is used, then the resource name is relative to the package immediately containing the named module. Also, you should not use a namespace package name, because a namespace package can be spread across multiple distributions, and is therefore ambiguous as to which distribution should be searched for the resource.)
If it is a Requirement, then the requirement is automatically resolved (searching the current Environment if necessary) and a matching distribution is added to the WorkingSet and sys.path if one was not already present. (Unless the Requirement can't be satisfied, in which case an exception is raised.) The resource_name argument is then interpreted relative to the root of the identified distribution; i.e. its first path segment will be treated as a peer of the top-level modules or packages in the distribution.
Note that resource names must be /-separated paths and cannot be absolute (i.e. no leading /) or contain relative names like "..". Do not use os.path routines to manipulate resource paths, as they are not filesystem paths.
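Since os.path is off limits for resource names, they can be built and checked with plain string operations. The two helpers below are hypothetical, written only to illustrate the rules just stated.

```python
def check_resource_name(resource_name):
    """Return resource_name if it follows the rules above, else raise."""
    if resource_name.startswith("/"):
        raise ValueError("resource names cannot be absolute: %r" % resource_name)
    if ".." in resource_name.split("/"):
        raise ValueError("resource names cannot contain '..': %r" % resource_name)
    return resource_name


def join_resource_name(*segments):
    """Build a resource name with '/' regardless of the host OS."""
    return check_resource_name("/".join(segments))


template_name = join_resource_name("templates", "index.html")
```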
Note that only resource_exists() and resource_isdir() are insensitive as to the resource type. You cannot use resource_listdir() on a file resource, and you can't use resource_string() or resource_stream() on directory resources. Using an inappropriate method for the resource type may result in an exception or undefined behavior, depending on the platform and distribution format involved.
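The module-level resource functions can be tried out against a throwaway package. In the sketch below, "demo_pkg" and "foo.dat" are hypothetical names created in a temporary directory purely for the demonstration; a real project would ship its data files inside its own package.

```python
import os
import sys
import tempfile

import pkg_resources

# Build a throwaway package containing a single data file.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_pkg")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as init_file:
    init_file.write("")
with open(os.path.join(package_dir, "foo.dat"), "wb") as data_file:
    data_file.write(b"hello")

sys.path.insert(0, root)  # the package must be importable

exists = pkg_resources.resource_exists("demo_pkg", "foo.dat")
is_dir = pkg_resources.resource_isdir("demo_pkg", "foo.dat")
data = pkg_resources.resource_string("demo_pkg", "foo.dat")  # bytes on Python 3
listing = pkg_resources.resource_listdir("demo_pkg", "")
```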
Sometimes, it is not sufficient to access a resource in string or stream form, and a true filesystem filename is needed. In such cases, you can use this method (or module-level function) to obtain a filename for a resource. If the resource is in an archive distribution (such as a zipped egg), it will be extracted to a cache directory, and the filename within the cache will be returned. If the named resource is a directory, then all resources within that directory (including subdirectories) are also extracted. If the named resource is a C extension or "eager resource" (see the setuptools documentation for details), then all C extensions and eager resources are extracted at the same time.
Archived resources are extracted to a cache location that can be managed by the following two methods:
Set the base path where resources will be extracted to, if needed.
If you do not call this routine before any extractions take place, the path defaults to the return value of get_default_cache(). (Which is based on the PYTHON_EGG_CACHE environment variable, with various platform-specific fallbacks. See that routine's documentation for more details.)
Resources are extracted to subdirectories of this path based upon information given by the resource provider. You may set this to a temporary directory, but then you must call cleanup_resources() to delete the extracted files when done. There is no guarantee that cleanup_resources() will be able to remove all extracted files. (On Windows, for example, you can't unlink .pyd or .dll files that are still in use.)
Note that you may not change the extraction path for a given resource manager once resources have been extracted, unless you first call cleanup_resources().
If you are implementing an IResourceProvider and/or IMetadataProvider for a new distribution archive format, you may need to use the following IResourceManager methods to co-ordinate extraction of resources to the filesystem. If you're not implementing an archive format, however, you have no need to use these methods. Unlike the other methods listed above, they are not available as top-level functions tied to the global ResourceManager; you must therefore have an explicit ResourceManager instance to use them.
Return absolute location in cache for archive_name and names
The parent directory of the resulting path will be created if it does not already exist. archive_name should be the base filename of the enclosing egg (which may not be the name of the enclosing zipfile!), including its ".egg" extension. names, if provided, should be a sequence of path name parts "under" the egg's extraction location.
This method should only be called by resource providers that need to obtain an extraction location, and only for names they intend to extract, as it tracks the generated names for possible cleanup later.
Perform any platform-specific postprocessing of tempname. Resource providers should call this method ONLY after successfully extracting a compressed resource. They must NOT call it on resources that are already in the filesystem.
tempname is the current (temporary) name of the file, and filename is the name it will be renamed to by the caller after this routine returns.
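The calling sequence described above might look roughly like the following sketch. SketchManager is a hypothetical stand-in written for illustration, not the real ResourceManager: its cache layout and its POSIX-only permission fix are assumptions, and extract_resource is likewise an invented helper name.

```python
import os
import tempfile


class SketchManager:
    """Hypothetical stand-in for a ResourceManager (illustration only)."""

    def __init__(self, extraction_path):
        self.extraction_path = extraction_path
        self.cached_files = []  # every path handed out, kept for later cleanup

    def get_cache_path(self, archive_name, names=()):
        # Assumed layout: <extraction_path>/<archive_name>/<names...>
        target = os.path.join(self.extraction_path, archive_name, *names)
        # The parent directory is created if it does not already exist.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        self.cached_files.append(target)
        return target

    def postprocess(self, tempname, filename):
        # Assumed example of platform-specific work: make the extracted
        # file readable and executable on POSIX systems.
        if os.name == "posix":
            mode = (os.stat(tempname).st_mode | 0o555) & 0o7777
            os.chmod(tempname, mode)


def extract_resource(manager, archive_name, names, data):
    """Write data to a temporary file, post-process it, then rename it."""
    filename = manager.get_cache_path(archive_name, names)
    fd, tempname = tempfile.mkstemp(dir=os.path.dirname(filename))
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    manager.postprocess(tempname, filename)  # only after a successful extraction
    os.replace(tempname, filename)           # the caller performs the rename
    return filename
```

Writing to a temporary name first means a partially written file never appears under its final name, which is presumably why postprocess() receives both names.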
The metadata API is used to access metadata resources bundled in a pluggable distribution. Metadata resources are virtual files or directories containing information about the distribution, such as might be used by an extensible application or framework to connect "plugins". Like other kinds of resources, metadata resource names are /-separated and should not contain .. or begin with a /. You should not use os.path routines to manipulate resource paths.
The metadata API is provided by objects implementing the IMetadataProvider or IResourceProvider interfaces. Distribution objects implement this interface, as do objects returned by the get_provider() function:
If a package name is supplied, return an IResourceProvider for the package. If a Requirement is supplied, resolve it by returning a Distribution from the current working set (searching the current Environment if necessary and adding the newly found Distribution to the working set). If the named package can't be imported, or the Requirement can't be satisfied, an exception is raised.
NOTE: if you use a package name rather than a Requirement, the object you get back may not be a pluggable distribution, depending on the method by which the package was installed. In particular, "development" packages and "single-version externally-managed" packages do not have any way to map from a package name to the corresponding project's metadata. Do not write code that passes a package name to get_provider() and then tries to retrieve project metadata from the returned object. It may appear to work when the named package is in an .egg file or directory, but it will fail in other installation scenarios. If you want project metadata, you need to ask for a project, not a package.
The methods provided by objects (such as Distribution instances) that implement the IMetadataProvider or IResourceProvider interfaces are:
pkg_resources provides a simple exception hierarchy for problems that may occur when processing requests to locate and activate packages:
ResolutionError
    DistributionNotFound
    VersionConflict
    UnknownExtra

ExtractionError
A problem occurred extracting a resource to the Python Egg cache. The following attributes are available on instances of this exception:
By default, pkg_resources supports normal filesystem imports, and zipimport importers. If you wish to use the pkg_resources features with other (PEP 302-compatible) importers or module loaders, you may need to register various handlers and support functions using these APIs:
Register distribution_finder to find distributions in sys.path items. importer_type is the type or class of a PEP 302 "Importer" (sys.path item handler), and distribution_finder is a callable that, when passed a path item, the importer instance, and an only flag, yields Distribution instances found under that path item. (The only flag, if true, means the finder should yield only Distribution objects whose location is equal to the path item provided.)
See the source of the pkg_resources.find_on_path function for an example finder function.
Register namespace_handler to declare namespace packages for the given importer_type. importer_type is the type or class of a PEP 302 "importer" (sys.path item handler), and namespace_handler is a callable with a signature like this:
    def namespace_handler(importer, path_entry, moduleName, module):
        # return a path_entry to use for child packages
Namespace handlers are only called if the relevant importer object has already agreed that it can handle the relevant path item. The handler should only return a subpath if the module __path__ does not already contain an equivalent subpath. Otherwise, it should return None.
For an example namespace handler, see the source of the pkg_resources.file_ns_handler function, which is used for both zipfile importing and regular importing.
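As a rough illustration of the contract described above, a filesystem-style handler could be sketched as follows. The name sketch_ns_handler, the _normalize helper, and the rule of joining the path entry with the last component of the module name are assumptions made for this example; they are not taken from the pkg_resources source.

```python
import os
import types


def _normalize(path):
    # Canonical form, so that equivalent spellings of a path compare equal.
    return os.path.normcase(os.path.realpath(path))


def sketch_ns_handler(importer, path_entry, moduleName, module):
    """Return a subpath for child packages, or None if already present."""
    subpath = os.path.join(path_entry, moduleName.split(".")[-1])
    wanted = _normalize(subpath)
    for item in module.__path__:
        if _normalize(item) == wanted:
            return None  # an equivalent subpath is already on __path__
    return subpath


# Usage with a stand-in module object whose __path__ starts out empty.
mod = types.ModuleType("zope")
mod.__path__ = []
first = sketch_ns_handler(None, "/site-a", "zope", mod)   # a new subpath
mod.__path__.append(first)
second = sketch_ns_handler(None, "/site-a", "zope", mod)  # None this time
```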
IResourceProvider is an abstract class that documents what methods are required of objects returned by a provider_factory registered with register_loader_type(). IResourceProvider is a subclass of IMetadataProvider, so objects that implement this interface must also implement all of the IMetadataProvider methods as well as the methods shown here. The manager argument to the methods below must be an object that supports the full ResourceManager API documented above.
Note, by the way, that your provider classes need not (and should not) subclass IResourceProvider or IMetadataProvider! These classes exist solely for documentation purposes and do not provide any useful implementation code. You may instead wish to subclass one of the built-in resource providers.
pkg_resources includes several provider classes that are automatically used where appropriate. Their inheritance tree looks like this:
NullProvider
    EggProvider
        DefaultProvider
            PathMetadata
        ZipProvider
            EggMetadata
    EmptyProvider
        FileMetadata
In addition to its high-level APIs, pkg_resources also includes several generally-useful utility routines. These routines are used to implement the high-level APIs, but can also be quite useful by themselves.
Parse a project's version string, returning a value that can be used to compare versions by chronological order. Semantically, the format is a rough cross between distutils' StrictVersion and LooseVersion classes; if you give it versions that would work with StrictVersion, then they will compare the same way. Otherwise, comparisons are more like a "smarter" form of LooseVersion. It is possible to create pathological version coding schemes that will fool this parser, but they should be very rare in practice.
The returned value will be a tuple of strings. Numeric portions of the version are padded to 8 digits so they will compare numerically, but without relying on how numbers compare relative to strings. Dots are dropped, but dashes are retained. Trailing zeros between alpha segments or dashes are suppressed, so that e.g. "2.4.0" is considered the same as "2.4". Alphanumeric parts are lower-cased.
The algorithm assumes that strings like "-" and any alpha string that alphabetically follows "final" represent a "patch level". So, "2.4-1" is assumed to be a branch or patch of "2.4", and therefore "2.4.1" is considered newer than "2.4-1", which in turn is newer than "2.4".
Strings like "a", "b", "c", "alpha", "beta", "candidate" and so on (that come before "final" alphabetically) are assumed to be pre-release versions, so that the version "2.4" is considered newer than "2.4a1". Any "-" characters preceding a pre-release indicator are removed. (In versions of setuptools prior to 0.6a9, "-" characters were not removed, leading to the unintuitive result that "0.2-rc1" was considered a newer version than "0.2".)
Finally, to handle miscellaneous cases, the strings "pre", "preview", and "rc" are treated as if they were "c", i.e. as though they were release candidates, and therefore are not as new as a version string that does not contain them. And the string "dev" is treated as if it were an "@" sign; that is, a version coming before even "a" or "alpha".
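The rules above can be sketched in a few lines of Python. This is a reconstruction from the description, not necessarily the shipped implementation: the name sketch_parse_version, the "*" prefix on non-numeric parts, and the internal "final-" marker for dashes are assumptions of this example, and other setuptools releases may return a different kind of value.

```python
import re

# Split a version into runs of digits, runs of letters, dots and dashes.
_component_re = re.compile(r"(\d+|[a-z]+|\.|-)")
_replace = {"pre": "c", "preview": "c", "rc": "c", "dev": "@", "-": "final-"}.get


def _parts(version):
    for part in _component_re.split(version):
        part = _replace(part, part)
        if not part or part == ".":
            continue  # dots are dropped
        if part[0].isdigit():
            yield part.zfill(8)  # pad numbers so they compare numerically
        else:
            yield "*" + part  # "*" sorts before any digit
    yield "*final"  # lets pre-release tags sort before the release itself


def sketch_parse_version(version):
    parts = []
    for part in _parts(version.lower()):
        if part.startswith("*"):
            if part < "*final":
                # drop any "-" that precedes a pre-release tag
                while parts and parts[-1] == "*final-":
                    parts.pop()
            # suppress trailing zeros before an alpha segment or dash
            while parts and parts[-1] == "00000000":
                parts.pop()
        parts.append(part)
    return tuple(parts)


versions = ["2.4.1", "2.4", "2.4a1", "2.4-1", "2.4.dev1"]
ordered = sorted(versions, key=sketch_parse_version)
```

Under these assumptions, ordered runs from the "dev" version through "2.4a1", "2.4" and "2.4-1" up to "2.4.1", matching the ordering described in the text.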
Yield non-empty/non-comment lines from a string/unicode or a possibly-nested sequence thereof. If strs is an instance of basestring, it is split into lines, and each non-blank, non-comment line is yielded after stripping leading and trailing whitespace. (Lines whose first non-blank character is # are considered comment lines.)
If strs is not an instance of basestring, it is iterated over, and each item is passed recursively to yield_lines(), so that an arbitrarily nested sequence of strings, or sequences of sequences of strings, can be flattened out to the lines contained therein. So, for example, passing either a file object or a list of strings to yield_lines() will work. (Note that between each string in a sequence of strings there is assumed to be an implicit line break, so lines cannot bridge two strings in a sequence.)
This routine is used extensively by pkg_resources to parse metadata and file formats of various kinds, and most other pkg_resources parsing functions that yield multiple values will use it to break up their input. However, this routine is idempotent, so calling yield_lines() on the output of another call to yield_lines() is completely harmless.
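The behavior described above can be approximated with a short recursive generator. This is a sketch under stated assumptions rather than the library's code: the name sketch_yield_lines is invented here, and str stands in for basestring so that the example runs on Python 3.

```python
def sketch_yield_lines(strs):
    """Yield stripped, non-blank, non-comment lines from nested input."""
    if isinstance(strs, str):  # assumption: str in place of basestring
        for line in strs.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                yield line
    else:
        # Any other iterable is flattened recursively, item by item.
        for item in strs:
            yield from sketch_yield_lines(item)


nested = ["first\n# a comment\n  second  ", ["third", ["", "fourth"]]]
lines = list(sketch_yield_lines(nested))
# Feeding the output back in changes nothing, which is the idempotence
# property mentioned above.
again = list(sketch_yield_lines(sketch_yield_lines(nested)))
```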
Split a string (or possibly-nested iterable thereof), yielding (section, content) pairs found using an .ini-like syntax. Each section is a whitespace-stripped version of the section name ("[section]") and each content is a list of stripped lines excluding blank lines and comment-only lines. If there are any non-blank, non-comment lines before the first section header, they're yielded in a first section of None.
This routine uses yield_lines() as its front end, so you can pass in anything that yield_lines() accepts, such as an open text file, string, or sequence of strings. ValueError is raised if a malformed section header is found (i.e. a line starting with [ but not ending with ]).
Note that this simplistic parser assumes that any line whose first nonblank character is [ is a section heading, so it can't support .ini format variations that allow [ as the first nonblank character on other lines.
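A minimal sketch of this kind of section splitter is shown below. The names sketch_split_sections and _lines are hypothetical, _lines is a simplified stand-in for yield_lines() that accepts only strings and nested iterables of strings, and the choice to yield nothing at all for empty input is an assumption of this example.

```python
def _lines(strs):
    # Simplified stand-in for yield_lines(): stripped, non-blank,
    # non-comment lines from a string or a nested iterable of strings.
    if isinstance(strs, str):
        for line in strs.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                yield line
    else:
        for item in strs:
            yield from _lines(item)


def sketch_split_sections(text):
    """Yield (section, content) pairs from .ini-like input."""
    section, content = None, []
    for line in _lines(text):
        if line.startswith("["):
            if not line.endswith("]"):
                raise ValueError("Invalid section heading", line)
            if section is not None or content:
                yield section, content
            section, content = line[1:-1].strip(), []
        else:
            content.append(line)
    if section is not None or content:
        yield section, content


sample = "orphan line\n[ first ]\na = 1\n# comment\n\n[second]\nb = 2\n"
pairs = list(sketch_split_sections(sample))
```

Because the function is a generator, the ValueError for a malformed header is raised only when iteration reaches that line, not when the function is first called.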