The PEAK Developers' Center: Diff for "PythonEggs"
 

Differences between version dated 2005-01-28 16:25:09 and 2008-11-15 17:51:33 (spanning 47 versions)


#format rst
Package Resource API
--------------------
 
The following API routines will be available in the ``package_resources`` module as module-level functions, and as methods of the ``ResourceManager`` class. An application may create its own ``ResourceManager`` instance(s), or use the default global instance by calling the module-level versions of these routines.

==============================
The Quick Guide to Python Eggs
==============================
 
**NOTE: If all you want to do is install a project distributed as an .egg file**, head straight to the `Easy Install page <EasyInstall>`_. EasyInstall makes installing Python code as easy as typing ``easy_install SomeProjectName``. You don't need to read this page unless you want to know more about how eggs themselves work.
 
.. contents:: **Table of Contents**
 
 
Overview
--------
 
    "Eggs are to Pythons as Jars are to Java..."
 
Python eggs are a way of bundling additional information with a Python project. This information allows the project's dependencies to be checked and satisfied at runtime, and lets projects provide plugins for other projects. There are several binary formats that embody eggs, but the most common is the ``.egg`` zipfile format, because it's a convenient one for *distributing* projects. All of the formats support including package-specific data, project-wide metadata, C extensions, and Python code.
 
The easiest way to install and use Python eggs is to use the `"Easy Install" <EasyInstall>`_ Python package manager, which will find, download, build, and install eggs for you; all you do is tell it the name (and optionally, version) of the Python project(s) you want to use.
 
Python eggs can be used with Python 2.3 and up, and can be built using the `setuptools <http://peak.telecommunity.com/DevCenter/setuptools>`_ package (see the `Python Subversion sandbox <http://svn.python.org/projects/sandbox/trunk/setuptools/>`_ for source code, or the `EasyInstall page <EasyInstall#installing-easy-install>`_ for current installation instructions).
 
The primary benefits of Python Eggs are:
 
* They enable tools like the `"Easy Install" <EasyInstall>`_ Python package manager
 
* .egg files are a "zero installation" format for a Python package; no build or install step is required, just put them on ``PYTHONPATH`` or ``sys.path`` and use them (may require the runtime installed if C extensions or data files are used)
 
* They can include package metadata, such as the other eggs they depend on
 
* They allow "namespace packages" (packages that just contain other packages) to be split into separate distributions. For example, ``zope.*``, ``twisted.*``, and ``peak.*`` packages can be distributed as separate eggs, unlike normal packages, which must always be placed under the same parent directory. This allows what are now huge monolithic packages to be distributed as separate components.
 
* They allow applications or libraries to specify the needed version of a library, so that you can e.g. ``require("Twisted-Internet>=2.0")`` before doing an ``import twisted.internet``.
 
* They're a great format for distributing extensions or plugins to extensible applications and frameworks (such as `Trac <http://www.edgewall.com/trac/>`_, which uses eggs for plugins as of 0.9b1), because the egg runtime provides simple APIs to locate eggs and find their advertised entry points (similar to Eclipse's "extension point" concept).
 
There are also other benefits that may come from having a standardized format, similar to the benefits of Java's "jar" format.
 
 
Using Eggs
----------
 
If you have a pure-Python .egg file that doesn't use any in-package data files, and you don't mind manually placing it on ``sys.path`` or ``PYTHONPATH``, you can use the egg without installing ``setuptools``. For eggs containing C extensions, however, or those that need access to non-Python data files contained in the egg, you'll need the ``pkg_resources`` module from ``setuptools`` installed. For installation instructions, see the `EasyInstall page`_.
 
In addition to providing runtime support for using eggs containing C extensions or data files, the ``pkg_resources`` module also provides an API for automatically locating eggs and their dependencies and adding them to ``sys.path`` at runtime. (See the `API documentation <http://peak.telecommunity.com/DevCenter/PkgResources>`_ and `setuptools documentation <http://peak.telecommunity.com/DevCenter/setuptools>`_ for details.)
 
With this support, you can install and keep multiple versions of the same package on your system, with the right version automatically being selected at runtime. Plus, if an egg has a dependency that can't be met, the runtime will raise a ``DistributionNotFound`` error that says what package and version is needed.
 
By the way, in case you're wondering how you can tell a "pure" (all-Python) egg from one with C extensions, the difference is that eggs containing C extensions will have their target platform's name at the end of the filename, just before the ``.egg``. "Pure" eggs are (in principle) platform-independent, and have no platform name. If you're using the ``pkg_resources`` runtime to find eggs for you, it will ignore any eggs that it can tell are not usable on your platform or Python version. If you're not using the runtime, you'll have to make sure that you use only compatible eggs.
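As a rough illustration of the naming rule above, the following sketch checks whether an egg filename carries a platform tag. The ``name-version-pyX.Y[-platform].egg`` layout, the sample filenames, and the helper names are assumptions made for this example; the runtime's own filename handling may differ.

```python
import re

# Assumed layout: name-version[-pyX.Y][-platform].egg
_EGG_NAME = re.compile(
    r"^(?P<name>[^-]+)-(?P<version>[^-]+)"
    r"(?:-py(?P<pyver>[^-]+))?"
    r"(?:-(?P<platform>.+))?\.egg$"
)


def egg_platform(filename):
    """Return the platform tag of an egg filename, or None for a pure egg."""
    match = _EGG_NAME.match(filename)
    if match is None:
        raise ValueError("not an egg filename: %r" % (filename,))
    return match.group("platform")


def is_pure_egg(filename):
    """True when the filename has no platform tag before '.egg'."""
    return egg_platform(filename) is None
```

With these hypothetical filenames, ``is_pure_egg("FooBar-1.2-py2.4.egg")`` is true, while ``egg_platform("FooBar-1.2-py2.4-win32.egg")`` returns ``"win32"``.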
 
Once you have the runtime installed, you need to get your desired egg(s) onto ``sys.path``. You can do this manually, by placing them in the ``PYTHONPATH`` environment variable, or you can add them directly to ``sys.path`` in code. This approach doesn't scale well, however, because as you need additional eggs, you'll be managing a longer and longer ``PYTHONPATH`` or ``sys.path`` by hand. Not only that, but you'll have to manually keep track of all the eggs needed by the eggs you're using! Luckily, there is a better way to do it.
 
 
Automatic Discovery
...................
 
The better way to manage your eggs is to place them in a directory that's already on ``sys.path``, such as ``site-packages``, or the directory that your application's main script is in, or a directory that you'll be adding to ``PYTHONPATH`` or ``sys.path``. Then, before attempting to import from any eggs, use a snippet of code like this::
 
    from pkg_resources import require
    require("FooBar>=1.2")
 
This will search all ``sys.path`` directories for an egg named "FooBar" whose release version is 1.2 or higher, and it will automatically add the newest matching version to ``sys.path`` for you, along with any eggs that the FooBar egg needs. (A note about versions: the egg runtime system understands typical version numbering schemes, so it knows that versions like "1.2a1" and "1.2rc5" are actually *older* than the plain version "1.2", but it also knows that versions like "1.2p1" or "1.2-1" are *newer* than "1.2".)
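To make the version ordering described above concrete, here is a simplified sort key that reproduces it for the five example versions. It is only a sketch written for this page, not the runtime's actual comparison code, and the names ``sort_key``, ``_COMPONENT`` and ``_ALIASES`` are invented for illustration.

```python
import re

# Split a version into runs of digits, runs of letters, dots and dashes.
_COMPONENT = re.compile(r"(\d+|[a-z]+|\.|-)")

# Treat common pre-release spellings alike, and make "-" sort after "final".
_ALIASES = {"pre": "c", "preview": "c", "rc": "c", "-": "final-"}


def sort_key(version):
    """Return a tuple of strings that sorts in release order."""
    parts = []
    for part in _COMPONENT.split(version.lower()):
        part = _ALIASES.get(part, part)
        if not part or part == ".":
            continue
        if part[0].isdigit():
            parts.append(part.zfill(8))   # pad so numbers compare numerically
        else:
            parts.append("*" + part)      # tags such as "*a", "*c", "*p"
    parts.append("*final")                # "*a" and "*c" sort before "*final"
    return tuple(parts)


versions = ["1.2", "1.2p1", "1.2a1", "1.2-1", "1.2rc5"]
ordered = sorted(versions, key=sort_key)
```

Here ``ordered`` places ``1.2a1`` and ``1.2rc5`` before ``1.2``, and ``1.2-1`` and ``1.2p1`` after it.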
 
You can specify more than one requirement when calling ``require()``, and you can also specify more complex version requirements, like ``require("FooBar>=1.2", "Thingy>1.0,!=1.5,<2.0a3,==2.1,>=2.3")``. A requirement string consists of a distribution name, an optional list of "options" (more on this in a moment), and a comma-separated list of zero or more version conditions. Version conditions specify ranges of valid versions, using comparison operators. The version conditions you supply are sorted into ascending version order, and then scanned left to right until the package's version falls between a pair of ``>`` or ``>=`` and ``<`` or ``<=`` conditions, or exactly matches a ``==`` or ``!=`` condition.
 
Note, by the way, that it's perfectly valid to have no version conditions; if you can use any version of "FooBar", for example, you can just ``require("FooBar")``. Distribution names are also case-insensitive, so ``require("foobar")`` would also work, but for clarity's sake we recommend using the same spelling as the package's author.
 
Some eggs may also offer "extras" - optional features that, if used, will need other eggs to be located and added to ``sys.path``. You can specify zero or more options that you wish to use, by placing a comma-separated list in square brackets just after the requested distribution name. For example, the "FooBarWeb" web framework might offer optional FastCGI support. When you ``require("FooBarWeb[FastCGI]>=1.0")``, the additional eggs needed to support the FastCGI option will also be added to ``sys.path``. (Or, if one of them isn't found, a ``pkg_resources.DistributionNotFound`` error will be raised, identifying what dependency couldn't be satisfied.)
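The shape of a requirement string (name, optional bracketed options, then version conditions) can be sketched with a small parser. This is an illustrative approximation only: ``parse_requirement`` and the regular expressions are hypothetical helpers, and the real ``pkg_resources`` parser accepts more than this sketch does.

```python
import re

_REQUIREMENT = re.compile(
    r"^(?P<name>[A-Za-z0-9_.\-]+)\s*"      # distribution name
    r"(?:\[(?P<extras>[^\]]*)\])?\s*"      # optional [Option1,Option2]
    r"(?P<conditions>.*)$"                 # comma-separated conditions
)
_CONDITION = re.compile(r"^(<=|>=|==|!=|<|>)\s*(\S+)$")


def parse_requirement(text):
    """Split a requirement string into (name, extras, conditions)."""
    match = _REQUIREMENT.match(text.strip())
    if match is None:
        raise ValueError("cannot parse requirement: %r" % (text,))
    extras = [e.strip() for e in (match.group("extras") or "").split(",")
              if e.strip()]
    conditions = []
    for chunk in match.group("conditions").split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        cond = _CONDITION.match(chunk)
        if cond is None:
            raise ValueError("bad version condition: %r" % (chunk,))
        conditions.append((cond.group(1), cond.group(2)))
    return match.group("name"), extras, conditions
```

For example, ``parse_requirement("FooBarWeb[FastCGI]>=1.0")`` yields the name ``"FooBarWeb"``, the option list ``["FastCGI"]``, and the single condition ``(">=", "1.0")``. The sketch only splits the string; it does not evaluate the conditions against an installed version.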
 
iter_distributions(name=None,path=None)
    Search the list of locations specified by `path`, yielding distributions whose names match `name`, if specified. If `path` is ``None``, the resource manager's default path is searched. If the resource manager has no default path, ``sys.path`` is searched. If `name` is ``None``, all recognized distributions are yielded. Distribution objects yielded by this routine may be added to ``sys.meta_path`` in order to make them accessible for importing, as they are PEP 302-compatible "importer" objects.

To find out what options an egg offers, you should consult its documentation, or unpack and read its ``EGG-INFO/depends.txt`` file, which lists an egg's required and optional dependencies.
 
resource_string(package_name,resource_name)
    Return the named resource as a binary string.

(Note: the ``pkg_resources`` module does *not* automatically look for eggs on PyPI or download them from anywhere; any needed eggs must already be available in a directory on ``sys.path``, or ``require()`` will raise a ``DistributionNotFound`` error. You can of course trap this error in your code and attempt to find the needed eggs on PyPI or elsewhere. If you want to automatically install dependencies for a project you're working on, you should probably build it using `setuptools <http://peak.telecommunity.com/DevCenter/setuptools>`_, which lets you declare dependencies where they can be found by tools like `EasyInstall <http://peak.telecommunity.com/DevCenter/EasyInstall>`_. Setuptools is also needed in order to build eggs.)
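Trapping the error might look something like the sketch below. The helper name ``try_require`` is invented for this example, and catching ``VersionConflict`` as well is an addition of this sketch. The import is guarded because the ``pkg_resources`` runtime may not be installed.

```python
def try_require(requirement):
    """Return True if the requirement could be activated, else False."""
    try:
        import pkg_resources
    except ImportError:
        # The egg runtime itself is missing, so nothing can be required.
        return False
    try:
        pkg_resources.require(requirement)
    except (pkg_resources.DistributionNotFound,
            pkg_resources.VersionConflict):
        # No suitable egg was found on sys.path; the caller can decide
        # whether to look elsewhere or report the problem to the user.
        return False
    return True
```

A caller could then write ``if not try_require("FooBar>=1.2"): ...`` and fall back to whatever recovery strategy suits the application.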
 
resource_stream(package_name,resource_name,mode='b')
    Open the named resource as a file-like object, using the specified mode ('t', 'b', or 'U'). (Note that this does not necessarily return an actual file; if you need a ``fileno()`` or an actual operating system file, you should use ``resource_filename()`` instead.)
 
resource_filename(package_name,resource_name)
    Return a platform file or directory name for the named resource. If the package is in an egg distribution, the resource will be unpacked before the filename is returned. If the named resource is a directory, the entire directory's contents will be extracted before the directory name is returned. Also, if the named resource is an "eager" resource such as a Python extension or shared library, then all "eager" resources will be extracted before the resource's filename is returned. (This is to ensure that shared libraries that link to other included libraries will have their dependencies available before loading.)

Building Eggs
-------------
 
set_extraction_path(path)
    Set the base path where resources will be extracted to. If not set, this defaults to ``os.expanduser("~/.python-eggs")``. Resources are extracted to subdirectories of this path, named for the corresponding .egg file. You may set this to a temporary directory, but then you must call ``cleanup_resources()`` to delete the extracted files when done. (Note: you may not change the extraction path for a given resource manager once resources have been extracted, unless you first call ``cleanup_resources()``.)

To build an egg from a package's ``setup.py``, you'll need to have ``setuptools`` installed. If you haven't already installed it in order to use the ``pkg_resources`` runtime, just check it out of Python's Subversion sandbox and run ``setup.py install`` to install it, or see the `EasyInstall page's installation instructions <EasyInstall#installing-easy-install>`_. Now you're ready to build eggs.
 
cleanup_resources(force=False)
    Delete all extracted resource files and directories, returning a list of the file and directory names that could not be successfully removed. This function does not have any concurrency protection, so it should generally only be called when the extraction path is a temporary directory exclusive to a single process. This method is *not* automatically called; you *must* call it explicitly or register it as an ``atexit`` function if you wish to ensure cleanup of a temporary directory used for extractions.

Edit the target package's ``setup.py`` and add ``from setuptools import setup`` such that it replaces the existing import of the ``setup`` function. Then run ``setup.py bdist_egg``.
 
ResourceManager(path=None)
    Create a resource manager instance, using `path` as its default path.

That's it. A ``.egg`` file will be deposited in the ``dist`` directory, ready for use. If you want to add any special metadata files, you can do so in the ``SomePackage.egg-info`` directory that ``bdist_egg`` creates. ("SomePackage" will of course be replaced by the name of the package you're building.) Any files placed in this directory are copied to an ``EGG-INFO`` directory within the egg file, for use at runtime. Other metadata files are automatically generated for you, so don't edit them, as the next time you run a setup command they may be overwritten with the automatically generated versions.
 
Note: packages that expect to find data files in their package directories, but which do not use either the PEP 302 API or the ``pkg_resources`` API to find them will *not* work when packaged as .egg files. One way you can check for this is if the .egg file contains data files, and the package is using ``__file__`` to find them. You'll need to then patch the package so it uses ``pkg_resources.resource_filename()`` or one of the other ``resource_*`` APIs instead of ``__file__``. See the section on `Accessing Package Resources`_, below, for more information about updating packages to use the resource management API instead of ``__file__`` manipulation.
 
Distribution Objects
 
Declaring Dependencies
......................
 
Some eggs need other eggs to function. However, there isn't always a meaningful place for a library to call ``require()``, and in any case a library's source code is rarely the place to declare its version dependencies. So ``setuptools`` allows you to declare dependencies in your project's setup script, so that they will be bundled inside the egg's metadata directory, and both the runtime and EasyInstall can then automatically find the additional eggs needed, adding them to ``sys.path`` when your project is installed or requested at runtime via ``require()``. (Note: the EasyInstall program will find and download dependencies from the internet automatically, but for security reasons simply using ``require()`` in Python code does not do this. ``require()`` only locates eggs that are in directories on the local machine that are listed in ``sys.path``)
 
For more information on declaring your project's dependencies, see the `setuptools documentation`_.
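A setup script that declares dependencies might look roughly like the sketch below. ``install_requires`` and ``extras_require`` are setuptools keywords; the project name, package name, and dependency names are hypothetical, borrowed from the examples earlier on this page. The ``setup()`` call is guarded so that the module can be imported or inspected without running a build command.

```python
import sys

# Hypothetical project metadata; adjust names and versions for a real project.
SETUP_KWARGS = dict(
    name="FooBarWeb",
    version="1.0",
    packages=["foobarweb"],
    # Eggs that are always needed when this project is used.
    install_requires=["FooBar>=1.2"],
    # Eggs needed only when a caller asks for FooBarWeb[FastCGI].
    extras_require={"FastCGI": ["Thingy>1.0,<2.0"]},
)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Only invoke setuptools when run as "setup.py <command>".
    from setuptools import setup
    setup(**SETUP_KWARGS)
```

In an ordinary ``setup.py`` the keyword arguments would usually be passed to ``setup()`` directly; the dictionary and the guard are only there to keep the sketch importable.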
 
 
Developing with Eggs
--------------------
 
Distribution objects need a name, a version, a Python version, an absolute path, and a metadata API, as well as PEP 302 "importer" methods.

Here are a few quick tips and techniques about developing software using eggs' features. These are just overviews, though, and you should dig into the complete `setuptools documentation`_ and `API documentation`_ manuals if there's not enough information here.
 
 
Running Eggs from Source
........................
 
So far, we've only covered how to use eggs that have actually been installed, by building them with the distutils and then putting them in a directory on ``sys.path``. (Note: `EasyInstall`_ can download source distributions, automatically build eggs from them, and install the eggs for you, with just a single command -- even if the package's author did nothing special to support Python Eggs. Check it out.)
 
But what if you are developing a package and working from source code? You don't want to have to rebuild the egg every time you make a change to the source code, yet you have code in your script or application that calls ``require()`` and expects the egg you're developing to be available. For example, see this `question from Ian Bicking <http://mail.python.org/pipermail/distutils-sig/2005-June/004576.html>`_ about working with packages checked out from Subversion, but not built as eggs.
 
If you're using `setuptools`_, the answer is simple: run ``setup.py develop`` to create a "source egg" - a special link to your project's source directory, combined with wrappers for your source scripts. See the `setuptools documentation`_ under "Development Mode" and also the "develop" command reference for more details.
 
 
Accessing Package Resources
...........................
 
Many modern Python packages depend on "resources" (data files) that are included with the package, typically placed within the package's subdirectory in a normal installation. Usually, such packages manipulate their modules' ``__file__`` or ``__path__`` attributes in order to locate and read these resources. For example, suppose that a module needs to access a "foo.conf" file that's in its package directory. It might do something like::

    import os
    foo_config = open(os.path.join(os.path.dirname(__file__), 'foo.conf')).read()
 
However, when code like this is packed inside a zipfile, it can no longer assume that ``__file__`` or ``__path__`` contain filenames or directory names, and so it will fail.
 
Packages that access resource files, and want to be usable inside a zipfile (such as a ``.egg`` file), then, must use the `PEP 302 <http://www.python.org/peps/pep-0302.html>`_ ``get_data()`` extension (see under "Optional Extensions to the Importer Protocol") before falling back to direct ``__file__`` access.
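The difference can be seen with a small experiment that uses only the standard library. The sketch below builds a throwaway zipfile containing a package and a data file, imports the package from the zipfile, and then compares ``__file__``-based access with ``pkgutil.get_data()``, a standard library helper that calls the loader's ``get_data()`` method. The archive name, the package name ``zipdemo_pkg``, and the file contents are made up for the example.

```python
import importlib
import os
import pkgutil
import shutil
import sys
import tempfile
import zipfile


def read_resource_from_zip():
    """Return (plain_file_exists, data) for a resource inside a zipfile."""
    workdir = tempfile.mkdtemp()
    archive = os.path.join(workdir, "demo_bundle.zip")
    with zipfile.ZipFile(archive, "w") as bundle:
        bundle.writestr("zipdemo_pkg/__init__.py", "")
        bundle.writestr("zipdemo_pkg/foo.conf", "answer = 42\n")

    sys.path.insert(0, archive)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("zipdemo_pkg")
        # __file__ points *inside* the archive, so it is not a real file path.
        naive_path = os.path.join(os.path.dirname(module.__file__), "foo.conf")
        plain_file_exists = os.path.exists(naive_path)
        # The loader knows how to read data out of the archive.
        data = pkgutil.get_data("zipdemo_pkg", "foo.conf")
    finally:
        sys.path.remove(archive)
        shutil.rmtree(workdir, ignore_errors=True)
    return plain_file_exists, data


plain_file_exists, data = read_resource_from_zip()
```

Here ``plain_file_exists`` is false, because the path derived from ``__file__`` names a location inside the archive, while ``data`` holds the bytes of ``foo.conf``.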
 
Using this protocol can be complex, however, so the egg runtime system offers a convenient resource management API as an alternative. Here's our "foo_config" example, rewritten to use the ``pkg_resources`` API::
 
    from pkg_resources import resource_string
    foo_config = resource_string(__name__, 'foo.conf')
 
Instead of manipulating ``__file__``, you simply pass a module name or package name to ``resource_string``, ``resource_stream``, or ``resource_filename``, along with the name of the resource. Normally, you should try to use ``resource_string`` or ``resource_stream``, unless you are interfacing with code you don't control (especially C code) that absolutely must have a filename. The reason is that if you ask for a filename, and your package is packed into a zipfile, then the resource must be extracted to a temporary directory, which is a more costly operation than just returning a string or file-like object.
 
Note, by the way, that if your resources include subdirectories of their own, you must specify resource names using '/' as a path separator. The resource API will convert the slashes to the platform-appropriate path separator, if in fact filenames are being used (as opposed to e.g. zipfile contents). For more examples and information, see the `setuptools documentation`_ and the `API documentation`_ for ``pkg_resources``.
 
 
 
Implementation Status
---------------------
 
The runtime implementation has been stable for some time now, and the EasyInstall package manager is now close to beta quality. The runtime still has a couple of minor issues, which should probably be in the official documentation:
 
 * The extract process treats the file's timestamp in the zipfile as "local" time with "unknown" DST. It's theoretically possible that a DST change could cause the system to think that the file timestamp no longer matches the zip timestamp. Also, the resulting Unix-style timestamp for the extracted file may differ between systems with different timezones. This is an unfortunate side effect of the fact that the zip file format does not include timezone information or a UTC timestamp.
 
 * Cleanup on Windows doesn't work, because the .pyd's remain in use as long as some Python process using them is still running. An application that really wants to clean up on exit can presumably spawn another process to do something about it, but that kind of sucks, and doesn't account for the fact that another process might still be using the file anyway. (Michael Dubner suggested using ``HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\FileRenameOperations`` to fix the Windows problem, but unfortunately didn't give any details or sample code.) There is sample code at http://aspn.activestate.com/ASPN/docs/ActivePython/2.2/PyWin32/win32api__MoveFileEx_meth.html, but in practice we really can't do this without requiring the PyWin32 extensions, which isn't such a great idea. So, the simple solution is to just avoid doing cleanups, and instead stick with a persistent cache directory.
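The timestamp issue in the first item above can be illustrated with the standard library alone. The sketch converts a zip entry's timezone-free ``date_time`` tuple to a Unix timestamp by treating it as local time with unknown DST (``-1`` in the last field passed to ``time.mktime()``). The function name and the sample entry are invented for this example, and this is not necessarily the exact code the runtime uses.

```python
import time
import zipfile


def zip_entry_mtime(info):
    """Interpret a ZipInfo's date_time as local time with unknown DST."""
    # date_time is (year, month, day, hour, minute, second), with no
    # timezone; the trailing fields are weekday, yearday and the DST flag.
    return time.mktime(tuple(info.date_time) + (0, 0, -1))


# A hypothetical archive entry; zip timestamps have two-second resolution.
entry = zipfile.ZipInfo("foo.conf", date_time=(2005, 1, 28, 16, 25, 8))
stamp = zip_entry_mtime(entry)
```

Because the conversion depends on the local timezone, the same archive entry can produce different values of ``stamp`` on machines configured for different timezones, which is the effect described above.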
 
 
 
Questions
---------
 
What's the difference between Python Eggs and Zero Install (http://0install.net)?
 
A: Zero Install is a Unix-only software package installer that does not work on Windows.
