After discovering that my initial approach to PEP 402-style imports wasn't very robust, I implemented a better one. It later turned out, however, that the recursive approach which importlib.__import__() employs is not well suited to virtual packages. Because of that, I ended up integrating the iterative _gcd_import() written by P.J. Eby, the author of PEP 402. After a few minor changes it passes all the unit tests for importlib, but some details of virtual package imports still need to be worked out in the abstract before they can be implemented. In particular, it's not entirely clear whether virtual packages should be left in place or removed when the import of a module fails.
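To illustrate what "iterative" means here, below is a minimal sketch that imports a dotted name one prefix at a time, parents first, using a plain loop instead of recursion. This is not P.J. Eby's _gcd_import(); the helper name iterative_import is made up for this example, and importlib.import_module() would take care of the parents on its own, so the loop only makes the order explicit.

```python
import importlib
import sys


def iterative_import(name):
    """Import a dotted module name one component at a time, parents first."""
    parts = name.split(".")
    module = None
    for i in range(1, len(parts) + 1):
        prefix = ".".join(parts[:i])
        # Reuse a module that is already cached instead of importing it again.
        module = sys.modules.get(prefix)
        if module is None:
            module = importlib.import_module(prefix)
    return module


mod = iterative_import("xml.dom.minidom")
print(mod.__name__)
```

A loop like this gives a single place to decide what happens to each parent when a later step fails, which is the open question for virtual packages.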
GSoC ImportEngine for Python blog
Sunday, August 14, 2011
Tuesday, August 9, 2011
PEP 402
I realised I didn't say much about PEP 402 last time round, so here are some more details:
PEP 402 aims to address the long-standing problem of storing the contents of a Python package in several directories. A previous proposal to fix this issue was PEP 382, which proposed extensions to the existing *.pth mechanism available on the top-level Python path. PEP 402, however, describes a simplified solution where the requirement for package directories to contain an __init__.py file is lifted in some cases:
- when importing submodules from a directory which contains Python modules. E.g. if a directory Foo containing Bar.py is present on sys.path, one can import Foo.Bar just as if Foo also contained __init__.py, but cannot import Foo alone.
- when importing *submodules* from a directory with the same name as an already imported package, e.g. a standard library package.
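The first case can be tried out with a throwaway directory. The sketch below builds Foo/Bar.py without an __init__.py and imports Foo.Bar. Note the assumption: it runs on Python 3.3 and later because of PEP 420 namespace packages, which were adopted instead of PEP 402, so on those versions a plain import Foo succeeds as well, unlike under PEP 402.

```python
import importlib
import os
import sys
import tempfile

# Build a temporary directory containing Foo/Bar.py and no Foo/__init__.py.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "Foo"))
with open(os.path.join(root, "Foo", "Bar.py"), "w") as f:
    f.write("VALUE = 42\n")

# Make the directory importable and drop any stale finder caches.
sys.path.insert(0, root)
importlib.invalidate_caches()

bar = importlib.import_module("Foo.Bar")
print(bar.VALUE)
```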
I spent the last few days working on a proof-of-concept implementation. The changes outlined in the PEP are quite small, but identifying the correct places to inject new code was quite tricky. I had to read the importlib code in much more detail to find where to make the changes, but eventually I got the code to support simple use cases. I created a separate repository for this part of the project: https://bitbucket.org/jergosh/pep-402
Here's the commit with the initial implementation: https://bitbucket.org/jergosh/pep-402/changeset/2c60dc2d17f1
Sunday, July 31, 2011
The work on the PEP continues and, following Eric Snow's helpful suggestions, I posted it on the Import-SIG mailing list to get some more feedback.
Otherwise, I will be looking into implementing PEP 402 "Simplified Package Layout and Partitioning" during the remaining part of the program. This PEP describes functionality related to dividing packages into separately installed components, aka "namespace packages".
Thursday, July 21, 2011
Post-mid-term
This is just a quick note to say that my midterm evaluation was successful and I'm in fact slightly ahead of schedule. The remaining aims for the end of the program are:
- Controlling which modules can be loaded by white- and blacklisting
- Ensuring the ImportEngine class can be conveniently subclassed to extend/modify its behaviour
- Final testing and documentation (including the PEP)
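As a rough idea of what blacklisting could look like, here is a minimal sketch built on the standard meta path hook in current Python rather than on the ImportEngine class itself. The BlacklistFinder class is a made-up name for this example, and the choice of colorsys as the blocked module is arbitrary.

```python
import importlib
import importlib.abc
import sys


class BlacklistFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder that refuses to import blacklisted module names."""

    def __init__(self, blacklist):
        self.blacklist = set(blacklist)

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.blacklist:
            raise ImportError(f"import of {fullname!r} is blacklisted")
        return None  # not blacklisted: defer to the remaining finders


finder = BlacklistFinder({"colorsys"})
sys.meta_path.insert(0, finder)
# A cached module would bypass the finders, so drop it first.
sys.modules.pop("colorsys", None)
try:
    importlib.import_module("colorsys")
    blocked = False
except ImportError:
    blocked = True
finally:
    sys.meta_path.remove(finder)
print(blocked)
```

A whitelist would be the same check inverted: raise unless the name is in the allowed set.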
Monday, July 11, 2011
More progress
Having finished working on the code for now, I moved on to documentation. I started by drafting a short PEP which describes the proposed changes (I've actually had it written for a few days but I wanted to incorporate Nick's comments before making it public): Import Engine PEP XXX.
With the mid-term evaluation imminent, I revisited my proposal and made sure all the deliverables are in place. I ended up doing things in a different order, mainly because when I wrote the proposal I didn't quite understand how things are organised in the code, but it seems I got everything to work. The PEP draft and some miscellaneous functionality are a nice bonus on top of that. This means that during the second part of GSoC (assuming my evaluation goes smoothly), I should have time to make my code integrate well with importlib so that there's no code duplication etc., to polish the PEP, and to implement the remaining features.
Monday, July 4, 2011
Progress!
After a few distracting days I had several very productive ones and now all the features projected for the mid-term evaluation are in place (yippee!), leaving me enough time for documenting the work that has been done. This includes drafting an update to PEP 302, proposing the inclusion of the import engine functionality in the Python distribution. As I have no experience in writing PEPs, in the past few days I've been reading different PEPs to get a feel for the style they're written in and started drafting my own. The process is unfortunately likely to take a few more days since I'm much slower to write English than Python ;)
Saturday, June 25, 2011
In the past week or so I finished implementing the core functionality of my project, importing modules using isolated state. Loaders and importers now accept an optional engine parameter. If it's not supplied, they fall back on a GlobalImportEngine instance which uses global state.
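The fallback can be pictured roughly as follows. This is only a sketch of the pattern, not the project's code: the class bodies, the modules and path attributes, and the cache_module helper are all assumptions made up for illustration.

```python
import sys
import types


class ImportEngine:
    """Hypothetical container for isolated import state."""

    def __init__(self):
        self.modules = {}
        self.path = []


class GlobalImportEngine(ImportEngine):
    """Hypothetical engine backed by the interpreter-wide state."""

    def __init__(self):
        self.modules = sys.modules
        self.path = sys.path


_global_engine = GlobalImportEngine()


def cache_module(name, module, engine=None):
    """Stand-in for a loader step: record a module in the chosen engine."""
    if engine is None:
        engine = _global_engine  # no engine supplied: use global state
    engine.modules[name] = module
    return module


# With an explicit engine, sys.modules is left untouched.
isolated = ImportEngine()
cache_module("demo_mod", types.ModuleType("demo_mod"), engine=isolated)
print("demo_mod" in isolated.modules, "demo_mod" in sys.modules)
```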
I also discovered that some loaders (namely those for builtin and extension modules) call functions from the C implementation of import (imp.*), and those are hardcoded to inject modules into sys.modules. This is a problem, but thanks to the limitations imposed on this kind of module (one copy of the module per process), it should be fine to place them into the modules dictionary of an ImportEngine instance after they are imported by the imp module.
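In other words, the workaround amounts to sharing the one process-wide module object. A minimal sketch, using importlib.import_module() as a stand-in for the imp.* calls and a plain dict as a stand-in for the engine's modules dictionary (both are assumptions, as is the helper name):

```python
import importlib
import sys


def import_shared_into(name, modules):
    """Import via the global machinery, then share the module object."""
    module = importlib.import_module(name)  # ends up in sys.modules
    modules[name] = sys.modules[name]       # same object, not a copy
    return module


engine_modules = {}  # stand-in for an engine's modules dictionary
import_shared_into("math", engine_modules)
print(engine_modules["math"] is sys.modules["math"])
```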
Finally, I realised that the test structure is more complex because the global import state needs to be preserved. So there were a number of context managers and other tricks which I didn't change before, but I think I've got it right now.
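For reference, a context manager of that kind might look like the sketch below. It is not the helper from the importlib test suite; preserved_import_state is a name invented for this example, and it simply snapshots the global import state and restores it in place on exit.

```python
import contextlib
import sys


@contextlib.contextmanager
def preserved_import_state():
    """Snapshot the global import state and restore it on exit."""
    saved_modules = sys.modules.copy()
    saved_lists = {
        name: list(getattr(sys, name))
        for name in ("path", "meta_path", "path_hooks")
    }
    saved_cache = sys.path_importer_cache.copy()
    try:
        yield
    finally:
        # Restore in place so existing references to these objects stay valid.
        for name in list(sys.modules):
            if name not in saved_modules:
                del sys.modules[name]
        sys.modules.update(saved_modules)
        for name, value in saved_lists.items():
            getattr(sys, name)[:] = value
        sys.path_importer_cache.clear()
        sys.path_importer_cache.update(saved_cache)


with preserved_import_state():
    sys.path.insert(0, "/nonexistent/demo/path")
    sys.modules["fake_demo_module"] = object()

print("fake_demo_module" in sys.modules)
```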
As a side note, I had some personal trouble in the last few days but things should be back to normal now and I'll try to post more often here.
