[00:55:41] ** rdmurray has left IRC ("User disconnected") [01:28:13] ** sprout has joined us [02:24:38] ** vlado__ has left IRC ("Leaving") [02:53:44] ** sprout_ has joined us [02:53:44] ** sprout has left IRC (Read error: 104 (Connection reset by peer)) [03:18:42] ** Maniac_ has joined us [03:19:45] ** Maniac has left IRC (Read error: 54 (Connection reset by peer)) [03:38:20] ** sprout_ has left IRC ("Snak 4.13 IRC For Mac - http://www.snak.com") [03:38:33] bear is now known as bear_afk [04:02:28] [connected at Fri Jan 28 04:02:28 2005] [04:02:28] <> *** Looking up your hostname... [04:02:29] <> *** Checking ident [04:02:29] <> *** Found your hostname [04:02:59] <> *** No identd (auth) response [04:03:00] <> *** Your host is sterling.freenode.net[freebsd.widexs.nl/6667], running version dancer-ircd-1.0.35 [04:03:00] [I have joined #peak] [04:03:00] ** sterling.freenode.net set the topic to http://dirtsimple.org/2004/11/generic-functions-have-landed.html [04:07:41] ** apoirier has joined us [06:20:20] ** gbay has joined us [08:50:47] ** gbay has left IRC (Read error: 104 (Connection reset by peer)) [08:58:18] ** rdmurray has joined us [09:05:15] ** bear_afk has left IRC (Read error: 110 (Connection timed out)) [09:55:14] ** [apoirier] has joined us [10:13:37] ** apoirier has left IRC (Read error: 110 (Connection timed out)) [12:10:18] ** bear has joined us [12:55:42] ** pje has joined us [12:55:50] Hola [12:56:41] 'lo [13:00:50] * pje has been thinking through a lot of the plugin stuff [13:01:03] I did find a few holes, but mostly it's pretty solid [13:01:19] E.g., dealing with concurrent apps extracting plugin contents to the same cache [13:01:44] Probably have to extract things to temporary names and then move them. [13:02:06] But on Windows you have to unlink before you move the new file [13:02:16] But you can't unlink an open file [13:02:31] Ugh. [13:02:32] So, Windows' logic is going to be... interesting. :) [13:02:45] I just registered for pycon. 
[13:02:47] :) [13:02:54] First time I'll be going. [13:03:06] Me too. I've been to 2 IPC's, but no Pycons. [13:03:20] IPC = International Python Convention [13:03:55] Are you going to do a PEAK bof or anything? [13:04:16] Don't have anything planned, but certainly open to it. [13:04:52] I guess we could see on the list how many people are interested (and going) [13:05:01] * arg raises hand [13:05:03] :) [13:12:22] ** sprout has joined us [13:13:12] * pje waves to Ted [13:13:43] Ted, you never gave us any whines yesterday. :( [13:13:48] ;) [13:14:33] sorry, I got caught up trying to integrate my stuff into chandler [13:14:38] I'll try to do better today [13:14:43] ;-) [13:14:45] Just teasing. [13:14:55] np [13:15:07] I'm pretty excited about what Bear, Bob, and I managed to rough out though. [13:15:39] that's good to hear. [13:16:13] A rough equivalent of .jar for Python, complete with support for resources and native libs. [13:16:27] It doesn't sound so impressive to put it that way, but compared to what Python has now, it's a lot. [13:16:33] sounds like exactly what the doctor ordered. [13:16:41] i totally agree [13:17:19] Coupled with an API to add jars to sys.path, I guess. And to control extract locations for resources that have to be true files (like native libs) [13:17:22] i'm glad that you're willing to take good ideas from java into python ;-) [13:17:40] Didn't you read my "Java is not Python, either" piece? :) [13:17:46] ;-) [13:18:31] how about getting rid of the GIL, while you're at it? [13:18:48] It's been done; it made Python slower. [13:19:10] hmm [13:19:30] See http://mail.python.org/pipermail/python-dev/2001-August/017099.html [13:20:47] nuts. Greg generally knows whereof he speaks, too. [13:21:34] Aside from the slowdown, he makes it clear that lock contention limited multiprocessor gains significantly. [13:21:50] yep, i see that [13:22:54] * pje doesn't worry about the GIL [13:23:12] I prefer to use async+multiprocess anyway. 
[13:23:51] Or deterministically-scheduled pseudothreads, as in peak.events. [13:24:24] i :> peak.events [13:25:08] What are you using it for, Arg? [13:25:11] i've been burned too many times by subtle threading bugs [13:25:32] pje : an icmp/http throughput tester [13:25:36] for work [13:26:04] Cool. [13:26:18] Are you using it with Twisted for the actual networking, or did you write your own? [13:26:59] i ended up using my own, very stripped down thing [13:27:06] looked a lot like twisted [13:27:13] * pje nods [13:27:34] You know events.Tasks can yield to twisted Deferreds, right? [13:27:36] but yield anEvent is so nice and explicit [13:28:13] yep, that's an awesome feature, i'm surprised with all the ooh-ahhing about Seaside and continuation-based web stuff, that peak.events hasn't gotten more attention [13:28:31] Well, it doesn't integrate with peak.web, for one thing. :) [13:28:40] oh, doh [13:29:08] You *could* integrate it, of course. [13:29:14] haven't yet used peak.web [13:29:29] I don't think anybody has. :) [13:29:53] * sprout feels less bad that he is not up on peak.web [13:30:03] i plan to, for a project that keeps getting kicked down the road [13:30:25] Hm, Bob hasn't said anything yet, he must be catching up on work missed yesterday designing the plugin system. :) [13:30:54] Actually, Radek is using it, or at least he keeps sending patches. :) [13:31:22] he's prolific [13:31:23] Seriously, I think he's evaluating it for addition to his company's existing PEAK-based server product. [13:32:02] They need to add a web UI to it, so he's looking at web toolkits. I'll be curious to see if they decide to go ahead with using it. [13:32:57] I think he's also the person who's done the most with generic functions too. [13:33:50] Anyway, I think probably the best next step on the plugin system is probably to write up a pre-PEP for the "jar-alike" format and resource API we sketched yesterday. 
[13:33:53] i have the C++ guy sitting next to me about ->| |<- that close to switching to python [13:34:05] when i regale him with tails of generic functions [13:34:09] tales even [13:34:33] pje, has anyone tried to build your gf speedups on a Mac? [13:34:46] Dunno, why? Are they broken there? [13:35:01] no, I haven't gotten around to trying yet, is all. [13:35:08] i can try right now [13:35:16] I'm definitely interested in playing with them. [13:35:24] Ah. Well, Bob Ippolito uses PyProtocols, so I know the base speedups work on Mac. [13:35:38] The speedups for generic functions are relatively new and few. [13:35:38] it works :) [13:35:44] with the usual pyrex warning spew [13:35:47] That was fast. [13:36:33] Anyway, PyProtocols is designed to make the C speedups optional; if you 'setup.py --without-speedups install' you get Python-only versions. [13:36:58] i wish people would stop using zope.interface [13:37:13] Why? [13:37:14] more specifically, i wish twisted would use pyprotocols [13:37:39] because then dispatch would be everywhere [13:37:51] Ah. [13:40:25] my ghod its cold out there, cursed new england winters [13:42:08] Btw, in December I wrote this: http://peak.telecommunity.com/DevCenter/PythonPlugins [13:42:16] as an initial problem statement for jar-like Python stuff. [13:42:25] I'm figuring I'll add our stuff from yesterday to that page [13:42:32] Or maybe just start a new page linked from that. [13:43:09] etrepum, I'm figuring I'll put you down as a PEP co-author, if that's alright. [13:43:35] that's alright with me [13:43:44] He lives! :) [13:44:04] you didn't mention the nick, IRC didn't alert me [13:44:23] Yeah, I thought that might be the problem, which is why I didn't say "Bob" there. :) [13:44:33] ** bear_ has joined us [13:45:18] I only found one hole in our ideas from yesterday; you should find it somewhere near the top of your scrollback. :) [13:46:35] It's only a Windows issue, though. [13:46:41] ah the concurrent cache [13:46:47] Yeah. 
[13:46:53] that could be an issue [13:46:59] is there a way to get a lock on the cache dir? [13:47:09] and just make the concurrent process wait? [13:47:14] Sure, we could even be windows-specific for that. [13:47:22] In fact, that's a really good idea. [13:47:36] We can crib the win32 lock stuff from peak.running.lockfiles [13:49:39] I guess we could: 1) check size+mtime, 2) if no match, acquire lock 3) check again, 4) if no match, update file 5) release lock [13:49:57] that sounds good [13:50:39] We don't need it on posix platforms because we can just write to a temporary file and move/overwrite. [13:50:53] On Windows, however, we can still fail to update if the file is open. [13:51:08] do dll's stay open? [13:51:13] I think so. [13:51:23] so then if you update with a process running then you are sunk [13:51:26] Yes, definitely now that I think of it. [13:51:35] But that's a development-only scenario. [13:51:44] does windows have hard links that are "good enough"? [13:51:47] Remember, a plugin file is a specific *version* [13:51:51] cause you could have a dir by process id [13:51:53] yeah [13:52:19] I don't think we should worry about that scenario, since you have to update the .zip while you're using it in order to cause such breakage. [13:53:04] yeah [13:53:11] just document as a "don't do that" [13:53:53] So that's the Windows gotcha; we now have one gotcha for every platform. :) [13:54:11] Which I guess means it's officially cross-platform now. :) [13:54:28] haha yeah [13:55:25] Do you think we could fit the runtime API into a single file? I'm just wondering whether we could throw it in the distro. [13:55:46] I guess people could always do that manually, if they wanted to. [13:56:03] IOW, if a library depends on the runtime but can't guarantee the target platform will have it. 
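The five-step cache update proposed at 13:49:39 (check size+mtime, lock, check again, update, release) together with the "write to a temporary file and move" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, a `threading.Lock` stands in for the cross-process lock that would be borrowed from peak.running.lockfiles, and `os.replace` is a modern standard-library call that did not exist when this conversation took place.

```python
import os
import tempfile
import threading

# Assumption: a stand-in for a real cross-process lock on the cache directory.
_cache_lock = threading.Lock()

def is_current(target, size, mtime):
    """Return True if the cached file already matches the expected size and mtime."""
    try:
        st = os.stat(target)
    except FileNotFoundError:
        return False
    return st.st_size == size and int(st.st_mtime) == int(mtime)

def update_cached_file(target, data, mtime, lock=_cache_lock):
    """Write data to target unless a matching copy is already cached.

    Returns True if the file was (re)written, False if it was already current.
    """
    if is_current(target, len(data), mtime):       # 1) check size+mtime
        return False
    with lock:                                     # 2) acquire lock, 5) release on exit
        if is_current(target, len(data), mtime):   # 3) check again
            return False
        # 4) update: write under a temporary name, then move into place
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(target) or ".")
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.utime(tmp, (mtime, mtime))
        os.replace(tmp, target)
        return True
```

As the chat notes, the final move can still fail on Windows when the existing file is open (for example a loaded DLL), which is the case the participants decide to document as "don't do that".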
[13:56:39] well plenty of libraries have dependencies on other libraries [13:56:53] I'd rather not have N versions of the runtime sitting on a person's sys.modules [13:57:13] I guess we could distribute the runtime itself as a plugin too. :) [13:57:28] You'd just have to manually put it on the path. [13:57:41] Of course, if it's just one file it's easy to bundle with the application. [13:58:06] or just one package that doesn't do wonky module stuff, so that py2exe/py2app/etc. can wrap it up nicely [13:58:32] wonky module stuff? [13:58:48] __import__ or imp.* [13:58:59] Ah. [13:59:05] How big is the header rewrite support? [13:59:58] four files maybe? [14:00:19] three, not including __init__ [14:00:24] * pje nods. A package it is, then. [14:00:36] Btw, I'm not sure about the whole .pyplugin name [14:00:42] one is a sane replacement for struct calls, the other is the translation of a C header to Python using this module, and the other is the part that does work [14:00:47] What about .pyzip or .pyarc? [14:01:00] I'm fine with anything as long as it's not already taken [14:01:14] like PAR is a terrible idea [14:01:18] pyzip or pyarc is fine with me [14:01:26] For .pyarc, the top Google hit is me suggesting it as a name for this. :) [14:01:48] For .pyzip, I'm second after some random garbage in a .PDF. :) [14:02:16] either way the search results are statistically insignificant [14:02:27] Which is good, right? [14:02:33] yeah [14:02:44] one minus is that both .com domains are taken [14:03:40] Anyway, given that this concept maps so well to Java .jars, it might be helpful in reaching a larger audience not to constrain our message to plugins. [14:04:00] yeah I agree [14:04:02] Since it's also useful for libraries and applications. [14:04:50] Of pyarc and pyzip, I'm somewhat more inclined to pyzip as being more descriptive. [14:04:53] But only slightly. 
[14:05:01] another minus about pyzip is that it sounds like zip support for python, or a python implementation of zip [14:05:08] Hm. [14:05:14] Okay, you've tipped me back the other way. [14:05:28] pyzip describes an implementation detail, nothing more [14:05:43] True, they're not called jzips. :) [14:05:55] Or .jaz. :) [14:06:17] So you think we should go with pyarc? [14:06:25] least worst so far [14:07:31] As far as the API goes, I'm wondering whether we should have Archive objects [14:07:37] That is, to get the resources from. [14:07:42] it's not a very marketable name [14:07:50] Eh? [14:07:52] Which? [14:07:55] pyarc [14:07:58] Oh. [14:08:09] Do you have another suggestion? [14:08:28] Hm. pyegg? :) [14:08:33] pysqueeze [14:08:47] foo ? [14:09:02] egg isn't too bad [14:09:10] Python eggs, in other words. :) [14:09:24] A self-contained extractable unit that may have some dependencies. :) [14:09:32] Heh. [14:09:46] "Egg" of course is a google nightmare. [14:09:49] yeah [14:09:59] but so is bean, and java got away with that [14:10:31] Of course, "python egg" is going to pull up other things, too. :) [14:10:38] yeah, lots of them apparently [14:10:42] Then again, I suppose so does "java bean" [14:11:44] Weird, one of the first Python Egg links mentions Darwin, too. :) [14:11:51] haha [14:13:06] what about this archive thing, what would it be for? [14:13:35] Oh, I was thinking that for the stuff where you want to search for a specific version. [14:13:53] That you might have something like 'findArchive(name,versionInfo)' [14:14:01] And get back an Archive that refers to it. [14:14:09] That you can then load resources from, for example. [14:14:10] to find the plugin but not load it? [14:14:20] That too. [14:14:28] Get dependencies out of it and such. [14:14:43] why would you want to do that and not just load it and ask? [14:14:45] So that an application can pre-scan plugins and sort out dependencies. 
[14:15:01] (for resources) [14:15:05] for metadata, it makes sense [14:15:14] Metadata is a resource, though. [14:15:26] You just don't necessarily need to extract it to a "real" file. [14:15:32] yeah but that's sort of an implementation detail [14:15:37] Sure. [14:15:54] But a useful one for the API itself. [14:16:09] yeah of course [14:16:16] It's possible that we could use the Archive as part of the PEP 302 loader setup. [14:16:45] Or actually, just add the Archive objects to sys.metapath [14:16:58] Then there isn't any crap on sys.path for them. [14:17:12] that makes sense [14:17:15] I guess we should call them Eggs instead of archives, though. :) [14:17:39] yeah searching for python archive is going to bring up lots of mailing list and newsgroup archives [14:18:56] It seems there are three existing uses of .egg as a file extension... [14:19:05] "EGG Solution 360RealTour" [14:19:11] "Wer Wird Millionaer Data Screens File" [14:19:17] "Ducks Add-on/Level File (Hungry Software)" [14:19:18] pyegg could be it though [14:19:28] however those don't sound important [14:19:50] Nope. [14:20:14] I kind of like the ring of bdist_egg. :) [14:20:20] hehe [14:20:30] And you can have library eggs, plugin eggs, application eggs. [14:20:37] egg can have a cool document icon [14:20:38] And eggsecutables, heh. :) [14:20:48] lol [14:20:49] think of the app names: poached, hard/soft, fried, over-easy .... [14:20:56] Heh. [14:21:04] eggroll is the packager [14:21:10] LOL [14:21:13] hehehe [14:21:42] Yeah, I kept thinking before that I wanted something that sounded like a Python jar... [14:22:01] But although you might keep your Java in a Jar, what do you keep a python in? [14:22:11] But terrarium was too long. :) [14:22:24] a nest [14:22:27] well you keep vipers in a pit I guess [14:23:18] Anyway, I guess with this scheme the default cache directory should be ~/.python-eggs [14:23:40] works for me [14:23:52] I really like egg because it's a one-syllable word. 
[14:23:58] ~/.egg-carton [14:24:02] "Can you send me an egg for that library?" [14:24:05] LOL [14:24:16] "Make sure you have the egg on your path" [14:24:28] Those phrases read well. [14:24:32] can you hard-boil that egg - i.e. encrypt it [14:24:42] Scrambled, you mean. :) [14:24:46] ! [14:24:48] good one [14:24:49] pyzip, pyplugin, all that stuff is kind of awkward to say. [14:24:57] totally [14:25:34] OTOH for the API itself we should probably avoid getting too cutesy. ~/.python-eggs is kind of close to that line. :) [14:26:07] But it's defensible compared to say ~/.baby-pythons (i.e. hatched eggs) :) [14:26:24] yeah [14:27:16] On the minus side also, it sounds odd to talk about building or installing eggs. :) [14:28:15] (Hm, if you distribute one with a bug, do you have egg on your face?) [14:28:23] * bear_ groans [14:28:28] * pje wonders if the Java designers had as much fun coming up with .jar? [14:28:57] well they probably didn't do it on IRC [14:29:24] Okay, so back to the API I guess. :) [14:29:43] find_egg(name, version_info, path=sys.path) [14:30:00] * pje resists the urge to make easter egg hunt jokes [14:30:10] should version_info be optional too? [14:30:16] should it be allowed to return more than one egg? [14:30:18] Yeah, probably. [14:30:23] To both questions. [14:30:38] Hm. [14:30:43] find_distribution, not find_egg [14:30:51] Egg is implementation, not to mention cutesy. [14:31:20] why not just find() - there are only distributions in egg files right? [14:31:48] This is a module-level global, not a method [14:31:55] The idea is [14:32:01] hunt_eggs(name, version_info=None, path=sys.path) -> [eggWrapper, ...] [14:32:11] from egg_runtime import find_distribution [14:32:13] where eggWrappers have a __cmp__ so that they compare (name, version) [14:32:29] so you can sorted(...) that to get the most recent egg [14:32:38] -1 on the __cmp__; too many funky things happen with that. Compare .version expliticly. [14:32:44] Er, explicitly, even. 
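The objection to `__cmp__` at 14:32:38 amounts to: keep the wrapper objects unordered and have callers compare `.version` explicitly. A small sketch of that idea, assuming a hypothetical `Distribution` record whose version has already been parsed into a tuple (neither the class shape nor the `newest` helper comes from the discussion):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Distribution:
    """Hypothetical wrapper; it deliberately defines no ordering of its own."""
    name: str
    version: tuple   # assumed already parsed, e.g. (1, 2, 0)
    location: str

def newest(candidates, name):
    """Pick the highest-versioned candidate with the given name, or None."""
    matches = [d for d in candidates if d.name == name]
    if not matches:
        return None
    # Compare .version explicitly instead of relying on object comparison.
    return max(matches, key=lambda d: d.version)
```

Because the dataclass is created without `order=True`, `sorted(candidates)` on its own raises `TypeError`, so any ordering has to be spelled out at the call site.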
[14:32:54] then maybe it should return [(name, version, eggWrapper), ...] [14:33:15] * pje is tempted to say eggshell instead of wrapper :) [14:33:33] Also, I don't think 'egg_runtime' is the best module name either. [14:34:11] Oh, we need one other parameter... whether versioning is strict or loose. [14:34:25] You familiar with distutils strict vs. loose version formats? [14:34:40] yeah I am [14:34:51] I generally just use loose [14:34:51] I think that's a per-distribution variable. [14:35:08] Yeah, but loose breaks with alpha/beta/candidate stuff. [14:35:16] true [14:35:23] so we could say everyone needs to use a strict version identifier [14:35:37] Yeah, but there are lots of non-strict packages now, like wx. [14:35:51] They have like a four-digit version identifier. [14:36:12] that StrictVersion won't parse? [14:36:16] Plus we really want to be able to put any existing Python distro into an egg. [14:36:42] I thought StrictVersion was a.b.c only [14:37:09] Hm. Okay, let's think of the failure cases for each of strict and loose, to see what's a better default. [14:37:30] Loose will falsely think a final version is older than an alpha/beta/candidate, which is bad. [14:37:51] Strict will fail to parse if it encounters something loose. [14:38:29] Hm. Maybe if we search the path and all encountered egg names can be parsed strictly, use strict [14:38:33] Otherwise loose. [14:38:44] That would have to include the requested version, of course. [14:39:45] What do you think? [14:40:51] makes sense, how do StrictVersion and LooseVersion compare to each other though? [14:41:32] I guess it's rather safe to assume that a particular package consistently uses one or the other [14:42:03] Right. [14:42:11] Of course, a package might switch. [14:42:29] in which case it should change its package name? :) [14:42:36] Good idea. 
:) [14:42:50] ** rdmurray has left IRC ("User disconnected") [14:43:00] Ultimately, I think version_info might need to be able to handle strings like ">=2.0; <3.0" [14:44:00] But we probably shouldn't mess with it for 1.0 [14:44:10] should probably end up in a different API [14:44:18] maybe this shouldn't accept a version spec at all [14:44:25] Yeah, I was just thinking that. [14:44:30] when you specify a version spec you want one return value [14:44:34] one or zero anyway [14:44:47] but the runtime needs an API that returns all that match by name [14:44:56] in order to implement the filtering version [14:45:03] Name and python version, actually. [14:45:08] true [14:45:14] but what if Python version doesn't matter? [14:45:21] I can't think of a use case for looking at eggs from another species of Python. :) [14:45:38] Only way it could not matter is if it's source-only. [14:45:38] I guess we can say that eggs must be specific to a major version cause they will have pyc's [14:46:44] yeah [14:48:31] Hm. [14:48:58] I guess the resource API can be independent of the distribution part. [14:49:14] Because we can delegate the get_data etc. to __loader__ [14:49:26] And if the __loader__ is the distribution, that works out nicely. [14:49:55] It does mean we won't support extraction for non-egg .zips, though. [14:50:20] Which I guess is a good thing because only eggs are reasonably guaranteed to have a unique name per version. [14:50:36] However, getting a string or stream will work with non-eggs. [14:50:52] And getting a filename will work with eggs and standard python modules. [14:52:30] Hm. More precisely I guess the __loader__ will not actually be the distribution object. [14:53:20] But that's implementation detail. 
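The strict-versus-loose fallback floated at 14:38:29 (use strict parsing only if every version found parses strictly, otherwise fall back to loose) could look something like the sketch below. The two key functions are simplified stand-ins written for this example, loosely modeled on the distutils version classes mentioned in the chat rather than imported from them, and all three names are hypothetical.

```python
import re

# Assumption: "strict" means N.N or N.N.N with an optional aN/bN pre-release tag.
_STRICT = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:([ab])(\d+))?$")

def strict_key(version):
    """Sort key for strict versions; raises ValueError on anything else."""
    m = _STRICT.match(version)
    if m is None:
        raise ValueError("not a strict version: %r" % version)
    major, minor, patch, tag, num = m.groups()
    release = (int(major), int(minor), int(patch or 0))
    # A pre-release sorts before the final release with the same numbers.
    pre = (0, tag, int(num)) if tag else (1,)
    return release + (pre,)

def loose_key(version):
    """Sort key that accepts anything, comparing components left to right."""
    parts = re.findall(r"\d+|[a-zA-Z]+", version)
    # Tag each component so numbers and letters never get compared directly.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]

def choose_key(versions):
    """Return strict_key if every version parses strictly, else loose_key."""
    try:
        for v in versions:
            strict_key(v)
    except ValueError:
        return loose_key
    return strict_key
```

The loose key reproduces the failure case from 14:37:30: `"1.0"` is a prefix of `"1.0a1"` component-wise, so the final release sorts as older than its own alpha.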
[14:54:25] So, we have find_distribution, resource_string(modulename,resource), resource_stream(modulename,resource,mode='b') [14:54:38] resource_filename(modulename,resource) [14:55:22] set_extraction_path(path=None,temp=False) [14:55:29] cleanup() [14:56:25] And there's the question of putting a desired distribution on sys.metapath. [14:56:50] Not to mention filtering possible distribution choices [14:57:15] I'm also thinking that for OS X, 'resource_filename()' might detect shared libs and do the patching on extract. [14:58:49] Should we support other PEP 302 extensions, like get_source? [15:03:48] probably not, I don't see a good reason to [15:04:06] resource_filename, that makes sense.. easy enough to detect a Mach-O file [15:04:19] What should we call the API module itself? [15:04:22] starts with '\xfe\xed\xfa\xce' [15:04:25] ** [apoirier] has left IRC (Read error: 113 (No route to host)) [15:04:34] Feed face? :) [15:04:36] yeah [15:04:45] * pje shakes his head [15:04:56] I guess we're not the only developers having fun designing. :) [15:04:57] :) [15:05:18] well fat binaries start with cafebabe [15:05:25] * pje is typing up descriptions of the API so far [15:05:28] LOL [15:06:07] but you don't see fat binaries in the wild unless you work at apple [15:08:47] Hm. Looking at the API, it seems to me that it should actually be a manager class. [15:09:30] I mean, for the extract locations and such. [15:09:50] A default one at the module level, but no reason you can't create an explicit instance. [15:10:24] So, the manager supplies iter_distributions, set_extraction_path, and cleanup. [15:10:25] that makes sense [15:10:42] The resource stuff just delegates to the __loader__ anyway [15:11:05] The manager creates the distro, so the distro knows its manager for purposes of determining the extract location. [15:11:15] Also makes it a wee bit easier to conduct tests. [15:11:54] definitely [15:12:15] And to do explicit management of separate plugin paths, for example. 
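The remark at 15:10:42 that "the resource stuff just delegates to the `__loader__` anyway" can be illustrated with the PEP 302 `get_data` hook. This is a rough sketch, not the API draft itself: it uses `module.__spec__.loader`, the present-day spelling of `__loader__`, and it omits the `mode` parameter and the extraction logic that `resource_filename` would need.

```python
import importlib
import io
import os

def resource_string(modulename, resource):
    """Return the bytes of a resource stored next to the named module."""
    module = importlib.import_module(modulename)
    loader = module.__spec__.loader            # modern equivalent of __loader__
    base = os.path.dirname(module.__file__)    # package dir, or a path inside a zip
    # Resource names use "/" regardless of platform (an assumption of this sketch).
    return loader.get_data(os.path.join(base, *resource.split("/")))

def resource_stream(modulename, resource):
    """Return a read-only binary stream over the same bytes."""
    return io.BytesIO(resource_string(modulename, resource))
```

Since `get_data` is implemented by both the ordinary file loader and zipimport, the same two functions would work for unpacked packages and for zipped ones, which matches the observation at 14:50:36 that strings and streams do not require extraction.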
[15:14:22] Anyway, the bootstrap loaders for extensions will just set their __file__ to resource_filename(__name__,"whatever.so") [15:14:25] And then reload() [15:14:51] They'll also have to call resource_filename on any .so's they depend on. [15:15:01] non-extension .so's or .dlls, I mean. [15:15:10] yea [15:15:21] For that matter, they'll need to call it for any files those .so's or .dll's depend on. [15:15:36] So, there will need to be a way to set that in the Extension in setup.py [15:15:53] on OS X, there is a difference between link-time and runtime loadable code (bundles vs. shared libraries) [15:16:14] .so is a MH_BUNDLE and shared libraries and frameworks are MH_DYLIB [15:16:19] * pje waits for the punchline [15:16:35] I'm just saying.. the dependencies are not .so, that's all [15:17:10] Ah. [15:17:14] .so what? ;) [15:17:42] Actually, this is a bit of a puzzlement. [15:17:58] Maybe what we should do is just punt and extract *everything*. [15:18:15] So, if you ask for one resource, you get 'em all. [15:18:22] that would be convenient but maybe not for applications [15:18:28] Why not? [15:18:56] well then you'd have all the python code unzipped too [15:19:13] We could skip those. [15:19:27] you're going to explicitly specify all the data files going in, so you can explicitly specify them coming out [15:20:25] how about just make data files (non-python-code) sit in a different location in the zip, and extract them all when the first is asked for? [15:20:52] That'll complicate the build. And it won't be compatible with normal PEP 302 zip loader. [15:21:06] We could limit eager extraction to .so/.dll/etc. [15:21:15] and .pyd [15:21:34] That wouldn't help with external libraries that expect to find data files, though. [15:21:45] Although, how those libraries would know where to look for them anyway is beyond me. [15:22:06] hmm [15:22:27] well in the case of frameworks on OS X you can ask for files relative to your own bundle [15:22:34] Ah. 
[15:22:43] (frameworks are opaque directory structures of files with a shared library inside) [15:23:16] I guess we could unpack all non-Python files within that specific package subdir [15:23:44] The one countercase I can think of for just extracting all data files is what if you have documentation embedded? [15:23:45] well there are approximately zero people that use the loader's get_data [15:24:01] so having an API that isn't precisely compatible with it doesn't bother me [15:24:39] But we *know* what files are "data"; they're anything that isn't .py/.pyc/.pyo [15:24:56] ok [15:25:49] maybe we put each kind of thing in a different place? [15:25:58] like python extensions go in one place, so if you extract one you extract them all [15:26:05] dependencies (dlls, etc) go in another [15:26:28] and anything else (data files, documentation, etc.) would have to be asked for? [15:26:31] No, you want the dependencies extracted with their extension [15:26:45] ok, so we keep those together [15:27:16] Hm. [15:27:35] Maybe we should just have a metadata file that lists eager extracts. [15:27:42] ok [15:27:59] By default, it would list any libraries or extensions built by distutils [15:28:22] But we provide a way for you to add stuff to it. [15:28:34] alright [15:29:10] Then, resource_filename extracts everything on that list if you try to extract anything on that list. [15:29:24] makes sense [15:29:31] and we should allow you to ask for a directory too [15:29:52] Make it a separate function, resource_directory [15:30:17] Or are you saying, designate a directory for eager extraction? [15:31:07] why make it a separate function? [15:31:20] you're asking for a filename, a directory has a filename, and you can't have a namespace clash between a directory and a file [15:31:50] Because .zip files don't really have directories. [15:31:56] Just names on a path. [15:32:10] I mean, you can tell them to record an entry for the directory, but it's not really the zip-ish thing to do. 
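The eager-extraction rule settled on at 15:29:10 (asking for anything on the eager list extracts everything on that list) might be sketched as below. The signature is an assumption made for this example: the real `resource_filename` takes a module name and resource, and the eager list would come from a metadata file inside the archive rather than a parameter.

```python
import os
import zipfile

def names_to_extract(requested, eager_names):
    """Asking for one eager resource pulls in the whole eager list."""
    if requested in eager_names:
        return sorted(set(eager_names))
    return [requested]

def resource_filename(archive_path, requested, eager_names, extract_dir):
    """Extract the requested entry (plus eager companions); return its real path."""
    with zipfile.ZipFile(archive_path) as zf:
        for name in names_to_extract(requested, eager_names):
            zf.extract(name, extract_dir)
    return os.path.join(extract_dir, *requested.split("/"))
```

Resources that are not on the list, such as embedded documentation, stay inside the archive until something asks for them by name.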
[15:32:51] So really, resource_directory is sort of like resource_filename(..,f) for f in matching_names [15:33:20] but that's just because we're using zip [15:34:24] Is there a reason you wouldn't know you were asking for a directory? [15:34:42] not really, but I don't see a reason to have a separate API either [15:34:45] E.g. PEAK's "pkgfile:" URLs always refer to files, not directories. [15:35:30] It just seems more complex to me to include directories in the general filename case here. [15:35:39] resource_string and resource_stream won't work with a directory. [15:35:40] in a lot of cases on OS X you have documents that are really opaque directory structures [15:35:51] Ah. [15:35:57] like rtfd is a directory containing an rtf and linked images, sounds, etc. [15:39:58] Okay, so I guess we'll have to support it. [15:40:23] also imagine you have html documentation that is read by a browser [15:40:48] you'll need to have the whole thing [15:41:08] I'm thinking about HTML documentation served by the app itself. [15:41:14] But anyway. [15:41:27] that is a good idea, but might be hard in practice (starting a thread, etc.) [15:41:59] I'm thinking stuff like Zope that has docs built in. [15:42:11] But not as an exclusive use case; your point is valid anyway. [15:42:58] it's actually useful to know *every* file that will be extracted [15:43:02] eager or otherwise [15:43:14] Eh? [15:43:16] because in the case of creating an OS X application you will extract them all ahead of time [15:45:01] Useful for who to know? The extract facility? distutils? What? [15:45:27] when you build an OS X application you don't need zip files cause you have an opaque directory structure [15:45:31] the zip is useful for compressing code [15:45:40] but if you're going to rip files out then you may as well have them out already [15:46:21] So you're saying it's useful for py2app? [15:46:25] yes [15:46:42] Well, why not just extract everything to start with for py2app then? 
[15:47:07] the compression is good when you can use it [15:50:19] Btw, take a look at this: http://oscar.objectweb.org/cache.html [15:51:29] so they extract all data? [15:51:59] I'm not clear on that. [15:52:15] the example has a data dir, but it's not populated [15:52:17] You'll notice they're actually using .jars [15:52:32] yeah the "embedded.jar" [15:52:45] I think that the data dir there is like your "cookie" idea. [15:52:51] ah [15:53:16] I don't remember OSGi having that capability, but it's been a while since I looked at it. [15:53:22] I'll have to read that spec [15:53:40] It was definitely part of my original inspiration for this idea. [15:54:10] It's like eggs + dependency resolution + startup/shutdown + service broker [15:55:05] That's what that state and startlevel stuff is for; bundles can have states of "installed, resolved, running", or some such. [15:55:23] Where resolved means that you've figured out where its dependencies are [15:55:39] i see [15:55:52] With enough sophistication of our importer/loader objects we could probably actually do a fair bit of the resolution part. [15:56:30] But there'd also have to be some more manifest info in the egg. [15:57:22] We can leave hooks for that, though, if the importer/loader allows you to tell it names of modules, mapped to the place it should get them from. [16:01:19] API draft at http://peak.telecommunity.com/DevCenter/PythonEggs [16:03:24] Just updated it with more info about resource_stream not being the same as an actual file... [16:12:24] What do you think of maybe 'package_resources' as the API module? [16:17:31] * pje updates the API again [16:23:59] I thought you were going to have a manager class that has these methods? [16:25:14] Look at it again. :) [16:25:30] ah [16:26:07] we have no API to inspect a Distribution [16:26:38] but Distribution has no API yet.. so :) [16:26:51] Yep. 
[16:27:03] See the bit at the bottom of the latest version [16:27:15] yeah [16:27:48] I also just made ResourceManager constructor allow setting both search and extract paths. [16:28:09] I was going to say something about that earlier but I forgot [16:28:22] So is this everything we need that isn't part of the Distribution object's API? [16:28:44] seems to cover it as far as I can see [16:30:35] * pje is adding notes about the distribution format [16:33:33] Hm. Does zipimport need a path to have a .zip extension? [16:33:55] Actually, it must not, or py2exe wouldn't work. [16:34:26] ** bear_ has left IRC ("Leaving") [16:36:58] Hm. You can subclass zipimporter; that's handy. [16:37:11] interesting [16:37:20] Thereby inheriting its bugs, however. :) [16:37:55] Anyway, it's not zipimport.c that detects zipfiles on sys.path, though. [16:40:43] Aha! [16:41:48] Bingo, I see how it works now. [16:42:03] It's not extension driven, it actually opens files on sys.path. [16:42:37] So, .egg will be usable with existing Python, as long as no extensions or other C-readable files are needed. [16:43:39] Okay, format notes are up. [16:49:14] If we subclass Distribution from zipimporter, and have resource managers cache distro path -> Distribution instance, we can take advantage of zipimport's zip directory caching. [16:56:50] * pje yawns [16:57:22] Guess we'll tackle the Distribution API next week. [16:57:45] Or more specifically, the metadata facilities both on the bdist_egg end and on the Distribution end. [16:58:34] Then after that we can consider version/dependency resolution and the like. [16:58:54] Mainly to make sure we have enough hooks for apps to use for that, not so much to solve the problem directly. [16:59:38] sounds like a good approach [16:59:44] And maybe add a 'require(name,version_info,path=None)' convenience function. [17:00:00] For e.g. Chandler bootstrapping itself and its dependencies. [17:00:13] Hm, maybe require_distribution. 
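The finding at 16:42:03, that zip detection on sys.path is driven by file contents rather than by the file extension, is easy to try out. The sketch below builds a throwaway zip archive with an `.egg` suffix and imports a pure-Python module from it; the archive and module names are invented for the demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile

def import_from_egg(egg_path, module_name):
    """Put a zip archive (of any file extension) on sys.path and import from it."""
    sys.path.insert(0, egg_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)

# Build a tiny archive whose name does not end in .zip.
tmp = tempfile.mkdtemp()
egg = os.path.join(tmp, "demo-1.0.egg")
with zipfile.ZipFile(egg, "w") as zf:
    zf.writestr("egg_demo_mod.py", "VALUE = 42\n")

mod = import_from_egg(egg, "egg_demo_mod")
loader_name = type(mod.__spec__.loader).__name__   # the zip import machinery
```

This only covers pure-Python contents; as the chat says, extensions and other files that C code must open still need to be extracted to real files first.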
[17:02:02] We're probably only one more session away from being able to code up a bdist_egg. [17:02:17] Maybe two before we can fully code up package_resources. [17:02:47] Or the other way around, I'm not sure. Obviously I'm getting too tired to continue this today. :) [17:03:38] Also on the to-do list will be figuring out how to handle platform-specifics like the Darwin header rewrites and the Windows file locking. [17:03:51] But I'm really really going to stop designing now. Really. :) [17:04:09] :) [17:04:30] sorry I haven't been able to give this my full attention today, been working too.. [17:04:43] No problem. [17:05:46] I'll try to have some kind of rough proposal for the metadata stuff for next time. [17:05:57] So you can just pick holes in it instead of having to invent from scratch. :) [17:06:12] cool [17:06:27] * pje yawns again [17:06:30] I've been thinking about what to use for metadata and it doesn't sound like we have a lot of great options in the stdlib [17:06:42] ConfigParser sucks, the MIME stuff is a pain in the ass.. maybe we should just use XML [17:06:50] * pje shudders [17:06:54] Heh. [17:07:05] I don't think we need to restrict to a single one. [17:07:19] I'm thinking that we have something like EGG-INFO directory that contains metadata files. [17:07:32] the problem with ConfigParser is that it's just strings and numbers, we probably want lists too [17:07:34] with 'egg-*' filenames reserved for the egg system itself. [17:07:46] You're thinking too generically, I think. [17:07:55] listing dependencies? [17:08:07] so we want to store that as a string and split it ourselves? [17:08:10] I'm thinking 'egg-eager-resources' contains just a text list of resource filenames, one per line [17:08:33] Keep in mind that some of these files the author will create themselves as part of setup.py-related files. [17:08:46] I was talking about the kind of metadata that we were discussing yesterday [17:09:00] Publishing metadata? 
Isn't that completely external to the egg? [17:09:18] yes but shouldn't some of it also be stored in the egg? (I depend on this that and the other) [17:09:39] for assembling eggsecutables :) [17:10:17] EGG-INFO/dependencies [17:10:26] One per line, name + version specifier [17:10:51] I suppose we could add a setup keyword though for autogeneration. [17:10:56] alright [17:11:15] Actually, setuptools already has some dependency support, come to think of it. [17:11:46] But it's kind of waiting for publishing metadata before it can automatically download-and-assemble. [17:12:16] Basically, a packager could spec dependencies in setup.py and include metadata URLs. [17:12:16] I'm not too concerned about the download/assemble/blah stuff yet.. I think we should make a release without that [17:12:49] I'm just saying that we could possibly base dependency specification off of setuptools dependency objects. [17:12:58] it's really useful as-is as an alternative to zip import, even if just used internally by py2app/py2exe [17:13:12] and it solves the "where's my data?" problem that some apps have [17:13:17] Yep. [17:13:17] some libraries anyway [17:14:05] we should have some way to say "i need these data files when packaged" or "i don't need this when packaged" [17:14:08] Anyway, first version we should only include metadata that is needed for bdist_egg and package_resources [17:14:14] agreed [17:14:28] Publishing is a catalog-sig discussion anyway, not distutils-sig. ;) [17:14:31] for example, with PyObjC, you have megs worth of tests that you probably don't care about [17:14:55] Using setuptools you can do that with "features" and --with-X/--without-X flags. [17:15:15] Although that might not be optimal, I suppose. [17:15:25] with PyObjC there are 13 packages and each of them has a "test" subpackage [17:15:35] in total it's over 8mb [17:15:38] Have you seen PEAK's setup.py? 
[17:15:40] mostly in objc/test [17:15:41] not recently [17:16:02] Here's the relevant part of it: [17:16:03] features = { [17:16:04] 'tests': Feature( [17:16:04] "test modules", standard = True, [17:16:04] remove = [p for p in packages if p.endswith('.tests')] [17:16:04] ), [17:16:15] that sounds like what we want [17:16:28] I'll put that into PyObjC 1.3 [17:16:34] the whole features thing [17:16:47] The setup.py global option help then reads: [17:16:48] Global options: [17:16:48] --with-tests include test modules (default) [17:16:48] --without-tests exclude test modules [17:16:56] great [17:17:05] cause it would be nice to make the binary package not include tests [17:17:11] cause nobody will ever run them outside of the build dir [17:17:27] One problem, though... [17:17:43] There needs to be a way to lock that to the distro mechanism. [17:17:56] Because you don't want eggs with the same name, but different features included. [17:18:21] So, setuptools' feature mechanism might need an extra flag. [17:18:43] The main problem I see is that some parts of distutils build processes won't clean up extra files. [17:19:10] So if you don't 'setup.py clean' and then change options, you're gonna make a mess. [17:19:43] Or if you 'setup.py build' and then 'setup.py --without-tests bdist_egg', you'd get the wrong stuff. [17:20:10] hmm [17:20:17] Probably the "right" way to fix this would be to store the selected features in the build dirs somewhere, and rebuild if the current options don't match. [17:20:17] yeah [17:20:25] Sort of like ./configure [17:20:31] I also have this sort of problem with py2app [17:20:51] there are three build modes, alias, semi-standalone, and standalone [17:21:04] if you switch build modes without cleaning, then it doesn't do the right thing [17:21:22] Would those be representable with features? [17:21:38] Guess not. 
[17:21:42] well the difference between semi-standalone and standalone is a feature, sort of [17:21:50] it's whether Python is included or not [17:21:58] alias is a different build mode entirely [17:22:06] could even be a different command [17:22:08] Anyway, sounds like the main thing is just to store flag files in ./build for these sort of things. [17:22:08] maybe should be [17:22:30] well my problem is with dist, not build actually [17:22:43] because when I build applications it's a merge operation not a remove and replace (for speed) [17:22:59] Ah. [17:23:18] it knows how to collect dependencies correctly, it doesn't blindly copy trees out of build [17:23:20] Then you could have a flag file within the app itself, since apps are dirs, right? [17:23:25] yeah [17:23:39] that's probably how I am going to solve it [17:23:44] I'm just saying I have a similar problem [17:24:10] In the case of features, we need to save only features changed from the default state. [17:24:27] Otherwise, defining a new feature while developing would force you to clean and rebuild. [17:25:24] and that can be annoying [17:25:55] in the case of bdist_mpkg or bdist_egg you will probably actually want to change the default [17:26:00] I guess all build_* commands would have to check features. [17:26:03] rather than explicitly have to say --without-crap [17:26:16] Right. [17:26:18] cause we will never want to create a binary distribution with tests [17:26:30] Ah crap. [17:26:57] That means I have to treat egg feature option defaults as non-default, for purposes of checking. [17:27:00] Ugh. [17:27:02] it doesn't really change anything that would need a rebuild, it's more of a "just don't copy this, please" [17:27:26] Okay, better idea... [17:27:39] in the case of bdist_mpkg it actually puts build files in separate places than "python setup.py build" does [17:27:41] List all known features in the file, with + or - to indicate state.
[17:27:57] Then, if a new feature is added, it wasn't in the old file, so you just go ahead and build. [17:28:15] Likewise if a feature was in the old file but not in the new. [17:28:30] You know that those are changes to setup.py, not option changes. [17:28:46] So, you only have to clean and rebuild during development if you rename a feature. [17:29:25] Hm, not even then, actually, since that's just a delete and an add [17:29:36] So only changes in which features are on or off force a clean. [17:29:55] adding a feature requires a build to be issued, removing a feature just constitutes not copying something into the distribution [17:30:12] I'm not even going down to that level. [17:30:22] "build" is cheap anyway, it knows when not to do stuff [17:30:25] I'm assuming that build commands will just run a clean command first. [17:30:31] If the options don't match. [17:31:13] By default, extension build directories aren't deleted by clean, so you don't slow down compilation. [17:32:21] Hm, I may have spoken too soon. [17:33:31] Oh great, it doesn't actually clean out .py files either. [17:33:45] Or data files. [17:33:51] What *does* it actually "clean"? [17:34:09] I have no idea [17:34:13] I always rm -rf build [17:34:21] Me too, actually. [17:34:55] Apparently it only removes temp.platform, not lib.platform [17:35:24] I guess that makes sense considering it doesn't support anything like featuers [17:35:26] features [17:35:52] all of its options control where stuff goes and whether you get .pyo or not [17:37:12] Hm. It definitely packages everything in the distro directories, though. [17:37:23] At least for bdist_wininst [17:37:34] I'll try bdist_zip [17:38:08] my bdist_mpkg uses its own schema for distro directories so it ends up with exactly what it wants [17:38:41] because it separates scripts, platlib, etc. into separate packages [17:39:12] Okay, it's dumb too. [17:39:32] So features aren't going to run a clean command, they'll have to wipe the build/lib dir. 
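The flag-file idea discussed above (every known feature listed with + or -, and only an on/off change forcing a clean) could look roughly like this. The file layout and the names `format_features`, `parse_features`, and `needs_clean` are assumptions made for the sketch; setuptools defines none of them.

```python
def format_features(features):
    """Serialise {name: enabled} as one '+name' or '-name' line per feature."""
    return "".join(
        ("+" if enabled else "-") + name + "\n"
        for name, enabled in sorted(features.items())
    )


def parse_features(text):
    """Inverse of format_features; blank lines are ignored."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            result[line[1:]] = line[0] == "+"
    return result


def needs_clean(old, new):
    """True only when a feature known to both builds changed state.

    Features that appear or disappear are treated as edits to setup.py,
    not as option changes, so they do not force a clean.
    """
    return any(old[name] != new[name] for name in old.keys() & new.keys())


previous = parse_features("+tests\n-docs\n")
print(needs_clean(previous, {"tests": True, "docs": False, "extras": True}))
print(needs_clean(previous, {"tests": False, "docs": False}))
```

A rename shows up as one deletion plus one addition, so it does not trigger a clean either, which is the conclusion reached in the log.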
[17:40:18] is there any way to just give features a way to spit out a manifest of the files they own? [17:40:27] Ugh. [17:40:29] No. [17:40:40] Distutils commands control what they own. [17:40:57] And some commands just copy whatever they find in the build directories. (obviously) [17:41:25] the smart ones, obviously :) [17:42:20] actually nevermind that, bdist_mpkg doesn't change build dirs except when building subprojects [17:42:22] Yep, only way to remove features w/current setuptools is to nuke the build directory. [17:42:24] it just changes the install dirs [17:43:07] well it wouldn't be terribly hard to have a way to just say "exclude *.test" [17:43:18] without trying to contort distutils to do something intelligent [17:43:26] Problem is that Feature objects are more sophisticated than that already. [17:43:45] They can either add or remove items. [17:43:47] well forgetting feature objects for a second, bdist_egg could know how to do this [17:43:50] Or both. [17:44:13] Hm. I guess. [17:44:29] Hm, I forget whether features can specify data files as well. [17:45:10] I mean the difference we're talking about here is whether the packager person says "exclude Test feature" or "exclude *.test" [17:45:19] the latter is more useful for everything but PEAK [17:47:49] I think you're missing the point; I was trying to avoid that degree of complexity [17:48:12] The Distribution already knows what packages to include or exclude. [17:48:18] Modules, in fact. [17:48:48] It's mapping that into filenames to put in the archive that's more of an issue. [17:49:02] so from a setup.py how do you say I want an Egg distribution that has the objc package but not the objc.test subpackage? [17:49:39] Already showed you that. [17:49:59] that's assuming that the tests are a feature [17:50:02] Features aren't the problem here; if we use the existing bdist_dumb or similar build mechanism, it just dumps whatevers in the build dir. 
[17:50:11] (whatever's) [17:50:41] I was trying to avoid having to rewrite all that stuff, because as written in distutils it doesn't know about packages, it just copies trees. [17:50:49] And byte-compiles them, if applicable. [17:50:57] byte compilation is another step [17:51:22] well I'm not afraid of any of this code because (a) I've rewritten distutils copy_tree because it sucks and (b) I've rewritten byte_compile because it sucks [17:52:10] Interesting -- the byte-compiling part pays attention to features. [17:53:15] I think I had to rewrite copy_tree because it had some problem with symlinks when doing an update=1 build.. or something [17:54:27] Ah. [17:54:36] And you didn't submit a patch? ;) [17:54:55] nah [17:55:05] I did think I submitted a bug about making run_setup(..) re-entrant [17:55:09] I don't think they did anything about it though [17:55:16] Hm, well, only byte-compiling respects features ATM. [17:56:19] the way that bdist_mpkg works for PyObjC is that it includes its dependencies (py2app in this case) by issuing a run_setup to py2app, so it builds the py2app metapackage and just includes it [17:56:28] but the problem was that py2app uses run_setup to pre-build its applications [17:56:39] so run_setup -> run_setup -> run_setup = Boom! [17:56:45] because it uses module-global variables [17:56:55] (didn't people learn not to do this in C?) [17:56:56] Oh, run_setup is well and truly fscked. [17:57:05] it's not *too* hard to fix [17:57:10] http://svn.red-bean.com/bob/py2app/trunk/src/bdist_mpkg/tools.py [17:57:35] rather than rewriting it I just preserve its module global variables [17:57:56] if they eventually fix it to not use those variables, my code will still work [17:58:11] Not bad. 
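The workaround described above, keeping run_setup's module-global variables intact across nested calls instead of rewriting it, can be illustrated generically. This is not the code at the linked URL; `fake_core`, its `current` attribute, and this toy `run_setup` are stand-ins invented for the sketch.

```python
import contextlib
import types


@contextlib.contextmanager
def preserved_globals(module, names):
    """Save the named module attributes and restore them on exit."""
    saved = {name: getattr(module, name) for name in names}
    try:
        yield
    finally:
        for name, value in saved.items():
            setattr(module, name, value)


# Stand-in for a module that keeps its state in globals (hypothetical).
fake_core = types.ModuleType("fake_core")
fake_core.current = None


def run_setup(label, nested=()):
    """Toy re-entrant runner: each call sees its own value of the global."""
    with preserved_globals(fake_core, ["current"]):
        fake_core.current = label
        for child in nested:
            run_setup(child)
        # Without the save/restore, the nested call would have clobbered this.
        return fake_core.current


print(run_setup("outer", nested=["inner"]))
```

Because the wrapper only saves and restores attributes, it keeps working if the wrapped module later stops relying on those globals.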
[17:59:22] it's going to be really exciting to just be able to download an egg and put it somewhere and it will work [17:59:49] it might actually obsolete bdist_wininst [17:59:51] and bdist_mpkg [18:00:00] I think it'll actually revolutionize packaging of upstream apps. [18:00:20] I mean, I wouldn't care about bundling stuff like fcgiapp, ZConfig, docutils, etc. for PEAK. [18:00:57] Over the long-term, it might make it easier for people to just use other libraries, and cut down on the "sumo" library tendencies that exist today. [18:01:33] yeah [18:01:38] Dunno if it would obsolete bdist_wininst, though. [18:01:47] well what if you can right-click an egg and just say "install" [18:01:49] Might make it worth overhauling at some point to just be an egg wrapper. [18:02:03] or if the document association for a .egg is to install it [18:02:05] I was thinking that double-clicking an egg without a __main__ should install it. [18:02:31] In other words, the egg launcher runs application eggs but installs non-application eggs in site-packages. [18:02:41] what about libraries that are also applications? [18:03:09] I guess you could have the right-click thing for those. [18:03:18] do we just tell them to make a separate egg for the application bits? [18:03:29] That's actually my inclination. [18:03:33] maybe we should make eggsecutables a separate extension [18:03:44] Why bother? [18:03:54] document associations [18:04:17] ISTM double-click = run or install, right-click = install [18:04:18] if there is one extension, you need one application that needs to be able to run eggs and install eggs [18:04:34] if there are two extensions, these can be separate apps [18:05:09] Actually, this is silly... just tell people to put eggs in site-packages. [18:05:17] We don't really need an install facility. [18:05:37] but then the egg runtime needs to be imported by a pth file [18:06:06] Nah, just "require()" [18:06:20] You'll want people to get in the habit of using "require()" anyway. 
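A rough sketch of what a 'require()' along these lines might do, under loud assumptions: eggs are named `Name-version.egg`, the version is matched as an exact string, and the function only adjusts sys.path without importing anything. The signature follows the 'require(name,version_info,path=None)' suggestion from earlier in the log; the matching rules are invented here.

```python
import os
import sys


def require(name, version_info, path=None):
    """Find '<name>-<version_info>.egg' in the search directories and put it
    on sys.path.

    Assumption for this sketch: version_info is an exact version string.
    """
    search = list(sys.path) if path is None else list(path)
    egg_name = "%s-%s.egg" % (name, version_info)
    for directory in search:
        candidate = os.path.join(directory, egg_name)
        if os.path.exists(candidate):
            if candidate not in sys.path:
                sys.path.insert(0, candidate)
            return candidate
    raise LookupError("no %s found in %r" % (egg_name, search))
```

An application start script would call something like `require("MyApp", "1.0")` before its first import from the egg; real version and dependency resolution is deliberately left out.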
[18:06:48] I like when people use import because then bytecode analysis can determine dependencies reliably [18:06:55] The egg launcher, alas, can't be written in Python itself. [18:07:00] for some definition of reliably [18:07:09] require() doesn't import, it just sets sys.path [18:07:25] well the egg launcher needs a smidgeon of code that isn't written in Python, of course [18:07:26] So you can still find imports, that just doesn't tell you what eggs you need. [18:07:33] but that's the bootstrap [18:07:59] python source code isn't a PE executable, you need machine code somewhere [18:08:05] Hm. We actually *do* need a different extension for eggsecutables. [18:08:29] That's not what I meant; it's that the launcher has to figure out *which* Python to run. [18:09:01] the py2app bootstrap looks in an XML file to determine which python runtime it should link to [18:09:02] Anyway, you don't want to have to have eggsecutables named with a version in them. [18:09:08] And you don't need to find them on a path. [18:09:11] that's true [18:09:14] also true [18:09:34] Actually, I'm pretty close to concluding that eggsecutables are a non-starter. [18:09:43] let' [18:09:45] A .pyw file is sufficient. [18:09:59] I'll go with that [18:10:13] If you want a true .exe or app, use py2exe or py2app to wrap it. [18:10:16] yes [18:10:21] that's what I was just going to say [18:10:35] So, you put your actual application in an .egg as MyApp-versionnumber.egg [18:10:50] And then have MyApp.pyw or MyApp.exe or whatever as the start script. [18:10:50] there is a use case for eggsecutables, at least, people have implemented something similar to that [18:11:23] some guy in the pygame community has created a launcher application that includes pygame and several other useful modules and allows you to use it to run small python packages [18:11:52] so that you don't have to distribute all of python, all your dependencies, etc. with your application. 
you just distribute it as an egg and say that you also need the egg launcher [18:12:39] Not sure what that gets you that py2exe and a pygame egg doesn't. [18:13:12] well if you py2exe an egg it will presumably include all of the dependencies with it so that it's standalone [18:13:25] or at least have a (default?) option to do so [18:13:36] Hm. [18:14:02] I don't really care so much about that use case, as it's something that can be easily solved with the egg runtime by someone else [18:14:08] I'm just saying it's out there [18:14:21] That actually sounds a lot like the RPM use case, for installing eggs in a platform location. [18:14:31] yeah it's kinda similar [18:14:49] A way to say, "I want these dependencies, but don't put them in my executable" [18:14:59] A list of "batteries not to be included" :) [18:15:44] yeah [18:16:02] but in that case, with the pygame launcher or whatever it's called, the dependencies not to be included is always * [18:16:09] it's up to the user to put together their zip file [18:16:28] which can use the pygame, PIL, wx, PyOpenGL, whatever that happens to be inside pygame launcher (a fixed set of requirements) [18:16:31] Well, that's a py2exe/py2app matter, not a bdist_egg matter. [18:16:37] So I'll leave that to the experts. :) [18:16:38] yes [18:16:50] so we punt on eggsecutables [18:17:09] PEP should include explanation, though. [18:17:13] eggsplanation. :) [18:17:17] lol [18:17:30] we'll need lots of eggsamples [18:17:46] Omelet you supply those. :) [18:18:37] the egg name is definitely going to stick (to the pan) [18:19:04] We'll see. [18:19:18] Guido keeps calling my examples in other documents "too cute" [18:19:58] well I think we can get away with the name "egg".. he who writes the code wins [18:20:02] But practically speaking, this PEP will live or die by the distutils-sig reaction, since it's not a core Python feature. 
[18:21:35] well I can almost guarantee that Twisted is going to like this since they are spitting up Twisted 2 into several packages [18:21:59] * pje has occasionally spit up Twisted, too :) [18:22:04] and I can ensure the Mac community will be behind it [18:22:45] All one of you? ;-) [18:22:49] j/k [18:23:19] I can make sure that twisted, py2app, pyobjc, pygame care about egg [18:23:28] and I'll be making eggs personally for PyOpenGL, PIL, and several other packages [18:23:46] Yeah, and I'll probably make eggs for PEAK, and all of Chandler's dependencies. [18:24:10] PIL may not eggcept it to the main source tree because they are still 1.5 compatible [18:24:10] Chandler is a, well, snake's nest of dependencies. :) [18:24:37] but if we can get some setup.py machinations that are 1.5 syntax compatible then it might be able to live in a try:except: [18:25:05] I'll let you worry about that one. :) [18:25:39] well if effbot likes it enough he will do it himself [18:25:58] What are the odds of *that*? :) [18:26:02] I gave him plugins for some new file formats and he accepted one of them and did the 1.5-ization himself [18:26:09] Ah. [18:26:18] so the latest PIL beta supports the Mac OS X .icns format [18:26:33] based upon Python 2.3 code I wrote, backported to 1.5 by effbot :) [18:26:34] So, how are we on EGG-INFO for metadata dir, and eager_resources for filename? [18:26:44] sounds good to me [18:27:04] possibly it should have a .txt? [18:27:08] I dunno [18:27:08] Yeah... [18:27:31] Actually, two files... native_libs and eager_resources [18:27:34] for Mac OS X and Windows it's nice to have extensions because it means it will open on double-click with something other than an annoying "what do I do with this?" dialog [18:27:44] native_libs is distutils-generated, and eager_resources is handmade. [18:27:49] That way, there's no edit cycle issue. [18:27:59] sounds good [18:28:18] Now, where do we put the directory in setup.py dir?
[18:28:19] where the implementation simply concatenates these two files and considers all lines to be eager [18:28:26] Yes, exactly. [18:28:50] what do you mean? [18:29:31] Where do you put eager_resources.txt? [18:29:43] And any other files bound for EGG-INFO? [18:30:04] And do we need a build directory for EGG-INFO, that merges with hand-specified files? [18:30:23] (merge = copies in source files + distutils-generated ones) [18:31:12] well distutils already understands a manifest.in, perhaps it should be a sibling of setup.py like that? [18:31:41] I'd rather a directory, because other apps may have EGG-INFO files. [18:31:56] ok, fine, then what about an EGG-INFO as a sibling of setup.py? [18:31:58] For example, if WSGI ever gets a deployment config format. [18:32:01] Sure. [18:32:20] that makes it cleaner especially once we have more than one egg-related metadata file [18:32:37] Come to think of it, we do need a build dir for EGG-INFO, because native_libs content will differ by platform. [18:33:00] true, that would be like the MANIFEST output though.. since you never write that by hand [18:33:21] So egg build will build its own egg info files in a build directory, then copy in user-supplied EGG-INFO. [18:33:41] yeah [18:34:14] in most cases people won't even need an EGG-INFO [18:34:18] PKG-INFO can also get stuck in there, so EGG-INFO/PKG-INFO (or whatever it's called) contains general metadata. [18:34:33] At least if we spec dependencies via setup.py, they won't. [18:34:47] yeah [18:35:19] OTOH, we could just have an egg_info argument to setup that lists files to include. [18:35:41] That might actually be simpler. [18:35:56] For the user, I mean, if the common case is that you only have one file if you have anything. [18:36:25] If it can be a glob, then you can always set egg_info=['EGG-INFO/*']. [18:36:46] Hm. In fact it could default to that. 
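The "concatenate the two files and treat every line as eager" rule from the discussion above might be sketched as follows. The member names assume the `.txt` suffix that was floated, and `eager_resources` is a hypothetical helper that reads straight from the zip rather than through any Distribution object.

```python
import zipfile

# Assumed member names inside the egg; the log had not finalised them.
EAGER_FILES = ("EGG-INFO/native_libs.txt", "EGG-INFO/eager_resources.txt")


def eager_resources(egg_path):
    """Return every non-blank line of the generated and hand-written lists.

    native_libs.txt is assumed to be written by the build, and
    eager_resources.txt by the author; either file may be absent.
    """
    names = []
    with zipfile.ZipFile(egg_path) as zf:
        members = set(zf.namelist())
        for member in EAGER_FILES:
            if member in members:
                text = zf.read(member).decode("utf-8")
                names.extend(
                    line.strip() for line in text.splitlines() if line.strip()
                )
    return names
```

Keeping the generated and hand-made lists in separate files means the build never has to rewrite something the author edits, which is the "no edit cycle issue" point made in the log.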
[18:37:38] It doesn't really add complexity since the distro object is going to have to know a glob to find the files by anyway. [18:39:03] that sounds good [18:39:09] especially w/ the default [18:40:18] * pje updates the Wiki page [18:40:34] Check it out... [18:40:42] Did I get everything we just went through? [18:41:30] (Not counting app/eggsecutable stuff, that is.) [18:41:55] That covers it as far as I can tell [18:42:08] I'm not totally hot on the 'package_resources' module name, but I don't have a better idea yet [18:43:37] Yeah, I didn't want to get cutesy like 'eggshell' or 'suck_eggs' :) [18:44:21] I was just thinking that less verbose might be nice [18:44:39] Maybe just 'resources'? [18:44:40] pkgdata is probably ok to use since the only module using that is one of mine and I don't think anyone ended up using it [18:44:48] Hm. [18:44:54] well there is a "resource" module [18:44:54] pkg_data might not be bad. [18:45:03] pkg_data works for me [18:45:22] 2 hits for pkg_data python on google [18:45:25] safe! [18:46:04] Only 2 for 'pkg_res' [18:46:10] Without "python" [18:46:39] 2 hits to (effectively) the same URL, even [18:46:49] The API does mention resources a lot more than "data". [18:47:13] "rsrc" is a more common abbreviation for resource though [18:47:26] pkg_rsrc has zero hits on google [18:47:34] And with good reason. :) [18:47:38] It's a crappy name. :) [18:47:41] * pje grins [18:47:54] Although pkgrsrc is admittedly worse. :) [18:48:06] it's WAY too close to pkgsrc [18:48:12] Perhaps pkg_resources is compact enough for your taste? [18:48:29] pkgsrc is the NetBSD packages mechanism/collection [18:48:39] * pje uses it [18:48:46] On Linux, actually. :) [18:48:49] hehe [18:48:56] I kid thee not. [18:49:04] Ty got me hooked on it. [18:49:14] I definitely prefer pkg_resources to package_resources [18:49:28] Okay, then we'll go with that for now. 
[18:49:36] I'm not sold on either, but the least worst by a large margin [18:49:57] "pkg" is already a common abbreviation in Python, so that's safe [18:50:22] pkg_resources has only 28 total Google hits. [18:50:34] And only 7 non-dupes. [18:50:42] anything under 2000 is easy to fix if we both blog about it [18:50:49] it would be fixed within 24 hours [18:50:50] Heh. [18:51:18] It's nice to be popular, isn't it? [18:51:45] More to the point, once it's so much as a PEP it'll be on top. [18:51:56] I'm not sure it's popularity but the way that blogs work totally countermines page ranking algorithms [18:51:56] Let alone if it becomes a stdlib module with docs. [18:52:14] because of meta-feeds and blogrolls [18:52:30] Ah, if only that were true. All my blog mentions of my wife's store haven't boosted her site's page rank any. :) [18:52:45] that's because your wife's store is not itself a blog [18:53:02] Neither will the pkg_resources doc be. [18:53:05] if your wife's store had an rss feed and submitted itself to the meta-feeds, it would increase hits [18:53:28] Hm. Interesting point. [18:54:00] Even if it was just store specials? Interesting. [18:54:04] it's actually semi-common for non-blogs to use movabletype or wordpress for their news page and submit to the metafeeds [18:54:15] store specials only may even be better [18:54:48] What are these metafeeds? Do you mean like bloglines? [18:55:33] technorati, weblogs.com, blo.gs, etc. [18:55:36] http://wiki.wordpress.org/UpdateServices [18:55:47] that's wordPress's listing, plus the meta-meta-feed pingomatic [18:57:04] Cool, thanks for the tip! [18:57:17] * pje is adding a 'require()' API to the Wiki [18:58:13] "celebrity press" is another big one.. 
I mean, of course, you're not going to get hits from gizmodo or boingboing, but you might be able to get an "industry" link or blogroll [18:58:30] it would be a good idea if your wife started a blog talking about the store or store-related stuff [18:58:54] you might be able to get a link from another blog like http://www.nyhotties.com/ which has a lot of readers [18:59:09] Hm, there may not be such an animal for that industry yet. But it's definitely worth looking into. [18:59:40] Okay, I've got a placeholder for 'require()' up, punting for now on the format of version-info [19:00:14] If we wanted to be really ambitious, we could now start fleshing out the Distribution API. [19:01:02] the only boingboing-like industry site I can think of re: your wife's store is http://fleshbot.com/ [19:01:07] which is owned by gawker media [19:01:29] who also do gizmodo [19:01:34] and wonkette [19:01:36] right? [19:01:40] yeah [19:01:50] Since metadata is now in a known location, we could have get_metadata(filename) to retrieve EGG-INFO files [19:02:55] Hm. Or maybe just metadata_string, metadata_stream, metadata_filename? [19:03:34] since it's metadata, let's ONLY have a string API [19:03:42] forget that other crap, we don't need it [19:04:14] What I'm thinking, though, is that we can define the resource_* stuff in terms of Distribution.get_*. [19:04:56] So, resource_filename(modname,resource) is actually a front for sys.modules[modname].__loader__.get_filename(pkgprefix+resource) [19:05:13] And ditto for the other APIs. [19:05:21] except when modname doesn't have a loader, in which case a separate code path happens [19:05:26] Right. [19:06:10] So, any Distribution built-in metadata support (like eager/native stuff) just delegates to self.get_string("EGG-INFO/eager_resources.txt") [19:06:20] but why even bother making those aliases for filename/stream for metadata? [19:06:23] Hm, actually, a has_resource(resourceName) would be good, too. 
[19:06:45] Not for metadata, just saying that you can use get_string/get_stream in the same way. [19:07:26] that could work, since those APIs are only used internally anyway [19:08:42] You know, I think I might've just found a problem with zipimporter.get_data. [19:08:56] what's that? [19:09:35] It doesn't treat the path as relative to the module or package [19:09:45] It treats it relative to the zipfile. [19:10:05] yeah but you ask for the __name__, so we can fix that [19:10:17] Yeah, we can. [19:10:52] It's just that that means all the crap I put in PEAK to support loading from a zipfile is moot. [19:11:10] I'm not sure if that's good or bad :) [19:11:32] but yeah I realized that problem with the zipimport stuff a long time ago [19:11:33] Well, I guess it's another good point for eggs, anyway. [19:11:56] we might have a convenience API that takes __name__ from _getframe [19:12:17] depending on how much we care about alternate implementations of python [19:12:24] I don't see a need for that. [19:12:38] It's strictly zipimporter.get_data that's broken. [19:12:54] but creating the pkg_resources object [19:12:59] And only in the context of using it directly. [19:13:10] you still need to specify __name__ [19:13:17] in the context of CPython we can use sys._getframe [19:13:25] We don't need it. [19:13:34] brb [19:14:28] The ResourceManager API always requires a __name__, and when you access the loader directly, you'll be using it for metadata like EGG-INFO/something. [19:14:38] So we never need to guess about __name__. [19:17:11] Also, I'm wrong about zipimporter being broken; it actually follows PEP 302. I just misunderstood PEP 302's spec for get_data. [19:20:37] So I guess I'll fix PEAK to follow PEP 302. :) [19:22:32] well what I was speaking of is making the RM API not need a __name__ using sys._getframe for convenience [19:22:37] but that convenience isn't really worth it [19:23:44] Anyway...
we can use zipimporter get_data as is for implementing both resource_string and resource_stream [19:23:58] yeah, because we have __name__ [19:24:00] And for reading metadata. [19:24:16] Actually __name__ isn't the important thing, it's __file__ [19:24:36] but __name__ gets you __file__ by way of sys.modules [19:24:55] Yeah. [19:25:06] Just saying that get_data wants a __file__-based name. [19:25:12] yes it most certainly does [19:25:23] it's just like the way os.path.dirname(__file__) works [19:25:30] except you ask for the data in a different way [19:25:40] Anyway... so Distribution will offer get_data and we can use it to read EGG-INFO [19:26:05] And it will offer get_filename, which is what will implement the hard parts of resource_filename. [19:26:22] Or maybe it should be get_extracted_filename. [19:28:52] either way works for me.. no real preference [19:29:06] get_extracted_ certainly discourages, which might be good [19:29:15] just by method name length [19:29:34] Yep. It's really intended for use by the ResourceManager APIs. [19:29:53] Hm, actually, come to think of it, it should require a ResourceManager parameter. [19:30:06] get_extracted_filename(resource_manager, path) [19:30:41] So that the stuff is extracted or cached in the right place. [19:31:32] it wouldn't be a method of the resource manager? [19:31:57] No, because the resource manager's job is file/path allocation, and maybe locking. [19:32:01] ok [19:32:09] Distribution's job to do the actual extract [19:32:19] And to know what resources are eager, too. [19:32:37] and the distribution doesn't already have an RM reference? [19:33:01] The question is should it be overridden if you explicitly request via a non-default RM? [19:33:23] if you wanted that, wouldn't you *get* the distro from a non-default RM? [19:33:25] If I explicitly create a resource manager on such-and-such a cache path, and then request a resource from it, shouldn't it use that path? [19:33:40] RM takes __name__, not distro.
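The chain described above (__name__ gives the module via sys.modules, the module gives __file__, and get_data wants a __file__-based path) can be sketched as a tiny resource_string. This is a guess at the shape of the convenience function, not the API from the Wiki draft, and it falls back to plain file access when the module has no usable loader.

```python
import os
import sys


def resource_string(modname, resource):
    """Return the bytes of `resource`, located next to the module `modname`.

    The path is built the same way os.path.dirname(__file__) would be used,
    then handed to the PEP 302 loader's get_data when one is available.
    """
    module = sys.modules[modname]
    path = os.path.join(os.path.dirname(module.__file__), resource)
    loader = getattr(module, "__loader__", None)
    if loader is not None and hasattr(loader, "get_data"):
        return loader.get_data(path)
    # Separate code path for modules without a loader: read from disk.
    with open(path, "rb") as handle:
        return handle.read()
```

For a package imported from a zip archive, the module's `__file__` begins with the archive path, which is the form of name zipimporter's get_data accepts.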
[19:33:45] ok [19:33:54] If you have the distro, you could pass a non-default RM. [19:34:01] I'm not sure if this actually makes sense, mind you. [19:34:25] It may be that if you ask the default RM for a resource, it should use whatever RM loaded that module. [19:35:10] But use of an explicit RM suggests you want it from *that* RM, darnit. [19:35:20] makes sense [19:35:32] so the RM should be a kwarg [19:38:04] That doesn't address the issue of whether resource_filename() should pass it or not, though. [19:38:44] Or, at the opposite extreme, should resource_filename() fail if the module wasn't loaded by the designated RM? [19:38:47] Nah, too strict. [19:39:03] Most of the time, you want to use the default RM to get resources. [19:40:13] So, probably resource_filename should *not* pass it, if it's the default RM. [19:40:54] Bleah. [19:41:04] yeah [19:41:14] I don't like this much, it seems to me to be saying that the convenience API shouldn't be glued to an RM. [19:41:14] who would use a non-default RM though? [19:41:37] A program might use a different RM for core eggs than for plugins. [19:42:13] For e.g. different "instances" of an app. [19:42:32] Like Eclipse has "workspaces", each can have its own plugins installed. [19:42:37] I think, anyway. [19:43:00] OTOH, there's not much to be gained by having something special. [19:43:07] Special cache, I mean. [19:43:16] yeah there is not much at all [19:43:20] You're not going to have conflicts, so it doesn't really mean anything. [19:43:34] It only has meaning in the context of data for eggs. [19:43:38] Egg cookies. :) [19:43:50] yeah [19:45:15] Maybe ResourceManager class should be undocumented, then. [19:45:30] We can always surface it if a use case emerges. [19:45:58] likewise get_extracted_filename() can be private. [19:49:39] yeah [19:49:53] makes sense to me [19:50:51] * pje is making the changes [19:52:46] Okay, removed RM class/info, added a bit of Distribution signature. 
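[Editor's note] The division of labour discussed above (the resource manager allocates file paths, the Distribution does the actual extract) might look roughly like the sketch below. The class bodies, the get_cache_path method, and the cache layout are hypothetical; only the get_extracted_filename(resource_manager, path) signature comes from the log. The write-to-a-temporary-name-then-move step follows the concern raised earlier in the day about concurrent apps extracting into the same cache.

```python
import os
import tempfile
import zipfile


class ResourceManager:
    """Hypothetical: decides where extracted files live, nothing more."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir

    def get_cache_path(self, archive, path):
        # One cache subdirectory per archive; layout is an assumption.
        target = os.path.join(
            self.cache_dir, os.path.basename(archive) + "-tmp", *path.split("/")
        )
        os.makedirs(os.path.dirname(target), exist_ok=True)
        return target


class Distribution:
    """Hypothetical: knows the archive and performs the extraction."""

    def __init__(self, archive):
        self.archive = archive

    def get_extracted_filename(self, resource_manager, path):
        target = resource_manager.get_cache_path(self.archive, path)
        if not os.path.exists(target):
            with zipfile.ZipFile(self.archive) as zf:
                data = zf.read(path)
            # Write under a temporary name, then move into place, so a
            # concurrent reader never sees a half-written file.
            fd, tmp_name = tempfile.mkstemp(dir=os.path.dirname(target))
            with os.fdopen(fd, "wb") as f:
                f.write(data)
            os.replace(tmp_name, target)
        return target
```

Passing the manager in explicitly means a caller who created a manager on a particular cache path gets files extracted under that path. The Windows locking issue mentioned in the log is not handled here.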
[19:53:49] We get 'archive' attribute for free if Distribution is a subclass of zipimporter. [19:53:57] And 'get_data', for that matter. [19:54:06] As well as all other PEP 302-required stuff. [19:55:06] So, we just need some methods to set name, version, Python version from archive name [19:55:48] And some other methods to load metadata from EGG-INFO. [19:56:08] Oh, and we should add an exists_resource(path) [19:56:33] Don't need it in the convenience API, but for metadata it's meaningful. [19:56:50] as opposed to asking for the resource and catching? [19:57:34] I suppose I see your point. [19:57:42] No... [19:57:51] zipimporter raises ZipImportError for every damn error. [19:58:04] So there's no way to distinguish from some other problem. [19:58:19] It's either that or give get_data an optional "default" argument. [19:59:16] maybe re-raise with a better error? [19:59:33] The default is probably better since it lets you just do get_data("EGG-INFO/native_libs.txt", "") [19:59:47] default is nice to have, definitely [20:01:38] Okay, added to the Wiki; skipped the existence check, although we'll have to have it internally. [20:02:10] I think that's about as far as I can go tonight without frying my brain. [20:02:38] :) [20:02:52] This looks really good, though. [20:03:14] There's hardly anything left that needs to go in the public API [20:03:20] totally [20:03:21] Just name/version stuff. [20:03:49] Privately, the distro needs to be able to build native+eager resource list, and various other odds and ends [20:04:10] And when we move up to cataloging level, dependency metadata would be nice. [20:04:18] But really, that's about it. [20:04:44] yeah [20:05:01] "So all we have to do, is build it..." [20:05:11] (Ever seen "Real Genius"? 
:) ) [20:05:17] haha yeah [20:05:27] I actually have a reproduction of a tshirt from that movie [20:05:37] * pje hums the theme from the Project Crossbow demo film [20:05:52] http://founditemclothing.com/t-shirts/toxic.html [20:06:06] Oh, the I :> Toxic Waste one? [20:06:08] yeah [20:06:52] * pje would be afraid of being lynched by environmentalists [20:07:18] Do they have a People Eating Tasty Animals? :) [20:07:19] I live in NYC, not so much of a problem [20:07:55] not that I know of [20:07:58] I can't believe we've really done it... [20:08:20] Python on the half-shell... :) [20:08:27] what, quarter-solved a problem python has had forever? :) [20:08:45] Quarter-solved? [20:08:52] we haven't written any code yet [20:08:59] Oh. Right. :) [20:09:54] But now we know what to write. [20:10:03] yea we do [20:10:23] Hm.. forgot something... [20:10:39] Need to put in a note about generating loader stubs for extensions. [20:11:03] yeah [20:11:10] not as hard as it sounds though [20:11:16] And we also need to be able to configure whether .py is included, or just compiled forms. [20:11:19] cause it's already done in py2app and py2exe [20:11:42] I would say just compiled, since we are depending on python version [20:12:24] You might want it for debug purposes, although I'm not sure you can *get* it currently with traceback stuff. [20:12:27] That might be another patch. [20:12:30] (for 2.5) [20:13:01] But I'm sure Python IDEs will want to support step-through/etc. of library code from "debug build" eggs. [20:13:09] that's true [20:13:13] ok [20:13:41] So, I would say it's an option to remove source, for if you're doing something closed-source or really space critical. [20:13:49] But text compresses well so I don't worry so much. [20:14:23] yeah it does [20:16:17] Okay, that's noted. So I think the to-do for bdist_egg is now more complete. It's a pretty extensive list, although it starts out with very little. 
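[Editor's note] The optional "default" argument for get_data proposed around 19:58 to 19:59 could be sketched as a thin wrapper like the one below. The helper name get_data_with_default and the sentinel are hypothetical. Which exception a loader raises for a missing file is also an assumption: current CPython documents OSError for zipimporter.get_data, while the log reports ZipImportError at the time, so the sketch catches both.

```python
import zipimport

_MISSING = object()  # sentinel so that None or "" can be real defaults


def get_data_with_default(loader, path, default=_MISSING):
    """Call loader.get_data(path); return `default` if the data is absent.

    Sketch only. Without a default, the loader's original error propagates,
    which matches plain PEP 302 get_data behaviour.
    """
    try:
        return loader.get_data(path)
    except (OSError, zipimport.ZipImportError):
        if default is _MISSING:
            raise
        return default
```

With this shape, reading optional metadata reduces to one call such as get_data_with_default(loader, "EGG-INFO/native_libs.txt", b""), with no separate existence check in the public API. The cost, as noted in the log, is that a missing file and some other read failure are not distinguished.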
[20:18:00] Both it and pkg_resources are going to be a real pain to actually write. [20:18:28] Do you have Python CVS commit privs? [20:18:29] pkg_resources not so much, bdist_egg definitely [20:18:31] no I do not [20:18:52] Well, pkg_resources is going to have to have locking crud for Windows. [20:19:01] But it is relatively constrained in scope. [20:19:05] ah right [20:19:12] that's not going to be me :) [20:19:29] bdist_egg will suck mainly because distutils' internal architecture isn't really suited for what it does. [20:20:01] So, we have to change Distribution, Feature, and clone up a bunch of stuff that's been done in other parts already. [20:20:04] yeah distutils sucks [20:20:23] OTOH, distutils sucks the way democracy does... :) [20:20:33] true [20:20:48] distutils is the worst way to build something, except for the alternatives. :) [20:21:31] Interesting thought, though, which is that the reason it sucks is that it has crisscrossing data and methods [20:21:42] I mean, you can add either new data or new methods, and you have to fix up all that apply. [20:21:43] and LOTS of it [20:21:56] Sounds like a good example of the sort of thing generic functions are good at. [20:22:32] But it'd be insane to try to replace the distutils, because there are so many platform hacks and bugfixes. [20:22:39] yup [20:22:49] You'd be restarting the whole bugfix process from scratch; not a pretty sight. [20:22:57] not in the least [20:23:17] at least we have something [20:23:27] imagine python as-is without distutils [20:23:36] Yeah, and once we've got bdist_egg, how many other formats do you need anyway? :) [20:23:54] less than zero, hopefully :) [20:24:02] Java tools just build jar variants like ear, war, etc. They're all just jar with more metadata. [20:24:51] It's mainly new *commands* you need for distutils, then. [20:25:22] Problem is that to add commands you have to hack on the Distribution object if you need more data. [20:25:38] Hmm... here's a thought... 
[20:26:00] Suppose you had Target objects you could just add, that were Make-like targets. [20:26:13] And 'build' would run 'build_targets' [20:26:23] Which would delegate the actual building to the targets. [20:26:45] Now, anybody can extend it by creating Target classes, without having to create any new commands. [20:27:11] Come to think of it, we could refactor Extension and some of the package stuff to treat those items as Targets. [20:27:21] Targets could have build and install parts to them. [20:27:49] We don't need any of this for eggs, but it might help w/porting Chandler' to the distutils for [20:27:59] chandler's build process to the distutils. [20:31:53] Well, I think I'm going to call it a night. [20:32:00] good idea :) [20:32:12] Before I completely redesign the distutils... [20:32:19] Oops, too late, already did. :) [20:32:22] which is tough *not* to do [20:32:48] Actually, I think the target idea can be used to gently refactor it to allow more extensibility without needing to hack distutils internals. [20:33:15] And gradually refactor its existing targets to just be targets with convenience APIs, but otherwise no different. [20:33:22] Anyway, I'd better actually go now. :) [20:33:24] See ya. [20:33:27] adios :) [20:33:40] ** pje has left IRC ("Client exiting") [20:59:07] ** Purdu3 has joined us [21:57:53] ** sprout has left IRC ("Snak 4.13 IRC For Mac - http://www.snak.com") [22:30:16] ** vlado has joined us [23:47:05] ** vlado has left IRC ("Leaving")
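[Editor's note] The Target idea floated at 20:26 to 20:27 (Make-like objects with build and install parts, driven by a single 'build_targets' step) can be sketched independently of the distutils. Everything below is hypothetical: the class names, method signatures, and the TextFileTarget example are illustrations of the shape of the idea, not an existing distutils API.

```python
import os
import shutil


class Target:
    """Hypothetical Make-like unit of work with build and install parts."""

    def __init__(self, name):
        self.name = name

    def build(self, build_dir):
        raise NotImplementedError

    def install(self, build_dir, install_dir):
        raise NotImplementedError


class TextFileTarget(Target):
    """Example target: generates one text file, then copies it on install."""

    def __init__(self, name, text):
        super().__init__(name)
        self.text = text

    def build(self, build_dir):
        path = os.path.join(build_dir, self.name)
        with open(path, "w") as f:
            f.write(self.text)
        return path

    def install(self, build_dir, install_dir):
        os.makedirs(install_dir, exist_ok=True)
        return shutil.copy(os.path.join(build_dir, self.name), install_dir)


def build_targets(targets, build_dir):
    """The single generic step: delegate the actual building to each target."""
    os.makedirs(build_dir, exist_ok=True)
    return [target.build(build_dir) for target in targets]
```

The point of the design is that new kinds of build work arrive as new Target subclasses, so the generic step never changes and no new command or Distribution attribute is needed.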