[e-lang] Module naming and identification

Lex Spoon lex at lexspoon.org
Mon Apr 13 12:37:17 EDT 2009


On Apr 12, 2009, at 1:30 PM, Kevin Reid wrote:
> On Apr 9, 2009, at 20:40, ihab.awad at gmail.com wrote:
>> Kris Kowal and I have been thinking about modules in JavaScript a
>> lot. :) Our current proposal (which should be unsurprising to this
>> list) contains a loader which turns module identifiers into maker
>> functions, which can then be instantiated with some capabilities.
>> (For the current stage of our proposal, it is wrapped in a
>> "require()" function that uses a "sandbox" abstraction, but this
>> fact may be safely ignored because it is just a thin layer on top of
>> maker functions.)
>>
>> So far, we have left somewhat open the question of the syntax and
>> semantics of module identifiers. This document, with lots of help
>> from Kris and MarkM, is an attempt to address this question a bit:
>>
>>  http://docs.google.com/Doc?id=dfgxb7gk_61d47876fr
>>
>> Comments welcome as always.
>
> - The notion of naming modules with URIs is good. It would be better
> if there was a standardized URI-which-designates-the-data-with-this-
> hash -- this would fulfill the use case of "avoid code duplication,
> but don't specifically rely on anybody else's server or have to change
> the identifier to point at my server".

It's a really good issue to bring up, Ihab.  Some thought about this  
seemingly minor issue could make a big difference in how easy it is to  
write and use modules in the system.  I'd like to elaborate on some
design principles that your proposal seems to start from, but that
are implicitly under attack in the current thread.

First of all, it's important to distinguish binary distributions sent  
to the user, typically via a web browser, from source code, which is  
the stuff edited by humans and checked into repositories.  I'm going  
to focus on source code, as you seem to have.  The binary version is
important, though, especially for web deployment.

For source code, here are some constraints:

1. From general principles, modules should have dependencies that are
as light as possible.  Specifying a version number for a depended-on
module is a heavier dependency than a module writer really needs or
intends.  Specifying a hash is even further overkill in that
direction.  These are so constraining that developers will most likely
want some other way to specify the dependency and then have a tool
fill in the specific version or hash.  It would be better to design
that system right now.  (All this said, hash-based dependencies look
like an important part of web-based binary distribution; once the app  
starts up, you want to freeze most if not all the versions of the  
depended-on modules.)

2. At some point all the modules must be linked together to make a  
whole program.  I think of a "distribution" as a key input to this
step, where a distribution could be anything from a set of libraries  
checked into a lib directory up to a 1000-maintainer extravaganza such  
as Debian.  Notable for the present purposes is that the same module  
version, bit for bit, will be checked into different distributions.   
Further, there are likely different kinds of distributions that people  
will want to use in practice.  The selection of module versions,  
therefore, needs to be adaptable to a variety of kinds of  
distributions.  The design doc talks about "compatibility" and "better  
versions", which looks like the same idea.

3. In a larger program, the same module will end up being loaded via  
different dependency paths.  A key decision is whether to allow  
multiple versions of the module to load.  Both options actually look
important in practice, but if you have to pick one, it looks like
"just one version" is the better default.  This is because there are
cases where the objects created in one dependency path end up being  
passed to the module on the other dependency path; they had better be  
the same actual module if this is going to work.  Further, there are  
ways to fake the multiple-versions version, but not vice versa.
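
To make constraint 3 concrete, here is a minimal sketch of a loader
that defaults to "just one version", and of one way to fake the
multiple-versions behavior on top of it.  Everything named here
(makeLoader, the module identifiers, the shape of a maker function) is
made up for illustration and is not taken from the design doc.
Capabilities are left out, and cyclic dependencies are not handled.

```javascript
// Hypothetical loader: one cache per loader, so each identifier is
// instantiated at most once no matter how many paths lead to it.
function makeLoader(makers) {
  const instances = new Map(); // identifier -> the single instance

  function load(id) {
    if (!instances.has(id)) {
      const maker = makers[id];
      if (typeof maker !== "function") {
        throw new Error("no maker registered for module: " + id);
      }
      // The maker gets `load` so it can pull in its own dependencies.
      instances.set(id, maker(load));
    }
    return instances.get(id);
  }

  return load;
}

// Three made-up modules: two of them depend on "geom/point".
const makers = {
  "geom/point": () => {
    class Point {
      constructor(x, y) { this.x = x; this.y = y; }
    }
    return { Point };
  },
  "a/producer": (load) => {
    const { Point } = load("geom/point");
    return { origin: () => new Point(0, 0) };
  },
  "b/consumer": (load) => {
    const { Point } = load("geom/point");
    return { accepts: (p) => p instanceof Point };
  },
};

// One loader: both dependency paths reach the same Point class.
const load = makeLoader(makers);
const shared = load("b/consumer").accepts(load("a/producer").origin());

// Faking "multiple versions": a second loader has its own cache, so
// its "geom/point" is a different instance and the check fails.
const otherLoad = makeLoader(makers);
const mixed = otherLoad("b/consumer").accepts(load("a/producer").origin());

console.log(shared, mixed);
```

Here `shared` is true and `mixed` is false: the object made on one
dependency path is only recognized on the other path when both paths
got the same module instance.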

Based on these constraints, it looks like the path-like identifiers in  
the design doc are really good.  To further develop this design, it  
looks worthwhile to think about what the "linker" step will look like  
for different kinds of distributions. That is, if a module depends on  
a/b/c, how does the linker decide which version of c to satisfy this  
with?  This could be done with a linker script written in JavaScript,  
or it could be done by actually laying out the file system with the  
chosen versions of everything, or it could be done (dare to dream) by
checking out code from a master world-wide JavaScript
distribution.  I suspect all such linkers will be so straightforward
that everyone will wonder what the fuss was about with module identifiers;
I hope that's true, and there's no better way to prove it than to  
actually sit down and sketch out a few typical schemes.
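
As one example of such a scheme, here is a rough sketch of a linker
script for the simplest kind of distribution: a table that picks one
version per path-like identifier.  Modules name their dependencies by
path only, and the linker fills in the version and a hash.  The table
layout, the field names, and the link function are all assumptions
made up for this sketch; the hashing uses Node's crypto module, which
is also just a convenient assumption.

```javascript
const { createHash } = require("crypto");

// A hypothetical distribution: for each identifier, the one version
// its maintainers chose, what it requires, and its source text.
const distribution = new Map([
  ["app/main", {
    version: "0.1.0",
    requires: ["a/b/c", "util/text"],
    source: "/* main */",
  }],
  ["a/b/c", {
    version: "2.3.1",
    requires: ["util/text"],
    source: "/* c */",
  }],
  ["util/text", {
    version: "1.0.4",
    requires: [],
    source: "/* text */",
  }],
]);

// Walk the dependency graph from rootId and pin every module that is
// reached to the version (and content hash) the distribution chose.
function link(rootId, dist) {
  const pinned = new Map();

  function visit(id) {
    if (pinned.has(id)) return; // already pinned via another path
    const entry = dist.get(id);
    if (entry === undefined) {
      throw new Error("distribution has no module named: " + id);
    }
    const sha256 = createHash("sha256")
      .update(entry.source)
      .digest("hex");
    pinned.set(id, { version: entry.version, sha256 });
    entry.requires.forEach(visit);
  }

  visit(rootId);
  return pinned;
}

const manifest = link("app/main", distribution);
for (const [id, pin] of manifest) {
  console.log(id, pin.version, pin.sha256.slice(0, 12));
}
```

The source modules never mention a version; swapping in a different
distribution table changes what "a/b/c" resolves to without touching
them, and the resulting manifest is the frozen, hash-pinned form that
a binary distribution could ship.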

On a brief digression, I must lament that there don't seem to be any
JavaScript distributions that focus on compatibility rather than
completeness.  Am I overlooking any?  Linux distributions have drifted
toward shipping sets of components that are known to work together,
but distributions for languages (Perl, TeX, Ruby, etc.) seem to all
favor completeness.  I believe a compatibility-focused distro for a
language would make sense and would greatly help people working
within that language, much the way having Linux distros makes Linux
much better for Linux users.

Lex


