[e-lang] Module naming and identification
Lex Spoon
lex at lexspoon.org
Mon Apr 13 12:37:17 EDT 2009
On Apr 12, 2009, at 1:30 PM, Kevin Reid wrote:
> On Apr 9, 2009, at 20:40, ihab.awad at gmail.com wrote:
>> Kris Kowal and I have been thinking about modules in JavaScript a
>> lot. :) Our current proposal (which should be unsurprising to this
>> list) contains a loader which turns module identifiers into maker
>> functions, which can then be instantiated with some capabilities.
>> (For the current stage of our proposal, it is wrapped in a
>> "require()" function that uses a "sandbox" abstraction, but this
>> fact may be safely ignored because it is just a thin layer on top of
>> maker functions.)
>>
>> So far, we have left somewhat open the question of the syntax and
>> semantics of module identifiers. This document, with lots of help
>> from Kris and MarkM, is an attempt to address this question a bit:
>>
>> http://docs.google.com/Doc?id=dfgxb7gk_61d47876fr
>>
>> Comments welcome as always.
>
> - The notion of naming modules with URIs is good. It would be better
> if there was a standardized URI-which-designates-the-data-with-this-
> hash -- this would fulfill the use case of "avoid code duplication,
> but don't specifically rely on anybody else's server or have to change
> the identifier to point at my server".

It's a really good issue to bring up, Ihab. Some thought about this
seemingly minor issue could make a big difference in how easy it is to
write and use modules in the system. I'd like to elaborate on some
design principles that the proposal seems to start from, but that
are implicitly under attack in the current thread.

First of all, it's important to distinguish binary distributions sent
to the user, typically via a web browser, from source code, which is
the stuff edited by humans and checked into repositories. I'm going
to focus on source code, as you seem to have. The binary version is
important, though, especially for web deployment.

For source code, here are some constraints:

1. From general principles, modules should have dependencies that are
as light as possible. Specifying a version number for a depended-on
module is a heavier dependency than a module writer really needs or
intends. Specifying a hash is even further overkill in that
direction. These are so constraining that most likely developers will
want some other way to specify the dependency and then have a tool
fill in the specific version or hash. It would be better to design
that system right now. (All this said, hash-based dependencies look
like an important part of web-based binary distribution; once the app
starts up, you want to freeze most if not all the versions of the
depended-on modules.)

2. At some point all the modules must be linked together to make a
whole program. I think of a "distribution" as being a key input to this
step, where a distribution could be anything from a set of libraries
checked into a lib directory up to a 1000-maintainer extravaganza such
as Debian. Notable for the present purposes is that the same module
version, bit for bit, will be checked into different distributions.
Further, there are likely different kinds of distributions that people
will want to use in practice. The selection of module versions,
therefore, needs to be adaptable to a variety of kinds of
distributions. The design doc talks about "compatibility" and "better
versions", which looks like the same idea.

3. In a larger program, the same module will end up being loaded via
different dependency paths. A key decision is whether to allow
multiple versions of the module to load. Both options look important
in practice, but if you have to pick one, "just one version" looks
like the better default. This is because there are
cases where the objects created in one dependency path end up being
passed to the module on the other dependency path; they had better be
the same actual module if this is going to work. Further, there are
ways to fake the multiple-versions version, but not vice versa.
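
To make constraint 3 concrete, here is a minimal sketch of a loader
that instantiates each module identifier at most once, so an object
created along one dependency path is still recognized along another.
The names makeLoader and makers and the module identifiers are
hypothetical and not taken from the design doc; only the term "maker
function" comes from the proposal. The sketch ignores cyclic
dependencies, and using a second loader is just one assumed way to
fake the multiple-versions behavior.

```javascript
// Hypothetical sketch: a loader that maps module identifiers to maker
// functions and instantiates each identifier at most once.
function makeLoader(makers) {
  const instances = new Map(); // identifier -> module instance

  function load(id) {
    if (!instances.has(id)) {
      const maker = makers[id];
      if (typeof maker !== "function") {
        throw new Error("no maker for module " + id);
      }
      // The only capability handed to a maker here is the loader itself.
      instances.set(id, maker(load));
    }
    return instances.get(id);
  }
  return load;
}

// Made-up modules: lib/a and lib/b both depend on geom/point.
const makers = {
  "geom/point": function () {
    function Point(x, y) { this.x = x; this.y = y; }
    return {
      Point: Point,
      isPoint: function (p) { return p instanceof Point; },
    };
  },
  "lib/a": function (load) {
    const point = load("geom/point");
    return { origin: function () { return new point.Point(0, 0); } };
  },
  "lib/b": function (load) {
    const point = load("geom/point");
    return { accepts: function (p) { return point.isPoint(p); } };
  },
};

// One loader: both paths reach the same geom/point instance.
const load = makeLoader(makers);
const shared = load("lib/b").accepts(load("lib/a").origin());

// A second loader builds a second geom/point instance, so a Point
// made under the first loader is not recognized by the second.
const other = makeLoader(makers);
const separate = other("lib/b").accepts(load("lib/a").origin());
```

With a single loader, shared is true; with the second loader, separate
is false, which is exactly the failure that loading two copies of a
module invites.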

Based on these constraints, it looks like the path-like identifiers in
the design doc are really good. To further develop this design, it
looks worthwhile to think about what the "linker" step will look like
for different kinds of distributions. That is, if a module depends on
a/b/c, how does the linker decide which version of c to satisfy this
with? This could be done with a linker script written in JavaScript,
or it could be done by actually laying out the file system with the
chosen versions of everything, or it could be done (dare to dream) by
checking out code from a master, world-wide JavaScript
distribution. I suspect all such linkers will be so straightforward
that everyone will wonder what the fuss was about with module identifiers;
I hope that's true, and there's no better way to prove it than to
actually sit down and sketch out a few typical schemes.
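
As one such sketch, here is roughly what a linker script written in
JavaScript might look like when a distribution is just a table that
picks one version per path-like identifier. Everything here is an
assumption for illustration: the function link, the shape of the
distribution and repository objects, the "id@version" keys, and the
module names and version strings are all made up and are not part of
the design doc.

```javascript
// Hypothetical linker sketch: modules name only path-like identifiers;
// the distribution decides which version satisfies each identifier.
function link(rootId, distribution, repository) {
  const linked = {}; // identifier -> { version, source }
  const pending = [rootId];

  while (pending.length > 0) {
    const id = pending.pop();
    if (Object.prototype.hasOwnProperty.call(linked, id)) {
      continue; // already linked via another dependency path
    }
    const version = distribution[id];
    if (version === undefined) {
      throw new Error("distribution has no version for " + id);
    }
    const entry = repository[id + "@" + version];
    if (entry === undefined) {
      throw new Error("repository is missing " + id + "@" + version);
    }
    linked[id] = { version: version, source: entry.source };
    entry.requires.forEach(function (dep) { pending.push(dep); });
  }
  return linked;
}

// Made-up repository: the same module versions, bit for bit, are
// available to every distribution.
const repository = {
  "app/main@1": { requires: ["a/b/c", "util/log"], source: "/* ... */" },
  "a/b/c@1.0": { requires: [], source: "/* ... */" },
  "a/b/c@1.1": { requires: ["util/log"], source: "/* ... */" },
  "util/log@2": { requires: [], source: "/* ... */" },
};

// Two made-up distributions that differ only in their choice for a/b/c.
const stable = { "app/main": "1", "a/b/c": "1.0", "util/log": "2" };
const testing = { "app/main": "1", "a/b/c": "1.1", "util/log": "2" };

const linkedStable = link("app/main", stable, repository);
const linkedTesting = link("app/main", testing, repository);
```

The module source never changes between the two runs; only the
distribution table does. Each identifier is linked once no matter how
many dependency paths reach it, which matches the "just one version"
default from constraint 3.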

As a brief digression, I must lament that there don't seem to be any
JavaScript distributions that focus on compatibility rather than
completeness. Am I overlooking any? Linux distributions have drifted
toward being sets of compatible components, but distributions for
languages (Perl, TeX, Ruby, etc.) seem to all favor completeness. I
believe a compatibility-focused distro for a language would make sense
and would greatly help people working within that language, much the
way having Linux distros makes Linux much better for its users.

Lex