At 06:43 PM 1/24/00 , Mark S. Miller wrote:
>Bug 6: Too Much Yaccing
>
>To be covered in the next message. This bug is currently fatal for the use
>of Javasoft's Java-2. The only known work-around is to use a Java-1.1.
>More later.
The problem starts with BYacc/J (Berkeley Yacc for Java) generating a parser in Java whose table size exceeds Java's limits on legal method size. It seems a Java-source data initialization like
static private final short[] yytable = { ... };
gets compiled into a method(!!), the class's static initializer, whose bytecodes initialize this table. This is a problem that had been "fixed" before. Once upon a time, Marc Stiegler wrote an E program to take the yacc output and break it into several files. As tweaked by me, the result was two new classes, each of which contained only one initialized data element. I also turned the transformation into an ELib program (named EYaccFixer) to avoid a cyclic build dependency.
Javasoft's Java-2 seems to have gotten stricter at enforcing the spec, or perhaps they've changed the spec. In any case, it seems we were still violating the spec, in that each individual data initialization was still too large, so it's good we were forced to fix it.
Status: What we now do (in the unreleased next version) is 1) EYaccFixer parses the data initializations it extracts from byacc/j's output, turns them into arrays of shorts, and serializes them to a file we include in e.jar as a resource. 2) The transformed parser initializes these arrays by opening the resource as a stream and deserializing them from it.
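For anyone who wants to see the shape of that approach, here is a minimal sketch in present-day Java. It is not the actual EYaccFixer code: the class name ParserTablesSketch, the methods writeTables and readTables, and the table contents are all made up for illustration, and it assumes plain Java object serialization is used for the arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.Arrays;

// Hypothetical sketch; names are illustrative, not taken from EYaccFixer.
final class ParserTablesSketch {

    // Build-time step: serialize the already-parsed tables to a stream
    // (in a real build this would be a file that ends up in the jar).
    static void writeTables(OutputStream out, short[]... tables) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeInt(tables.length);   // how many arrays follow
            for (short[] table : tables) {
                oos.writeObject(table);    // short[] is serializable as-is
            }
        }
    }

    // Run-time step: read the arrays back instead of using a huge
    // array initializer in the parser's source.
    static short[][] readTables(InputStream in) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(in)) {
            int count = ois.readInt();
            short[][] tables = new short[count][];
            for (int i = 0; i < count; i++) {
                tables[i] = (short[]) ois.readObject();
            }
            return tables;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a large yacc table; the real contents come from byacc/j.
        short[] yytable = new short[100_000];
        yytable[99_999] = 7;

        // An in-memory buffer stands in for the resource file here.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeTables(buffer, yytable);

        short[][] loaded = readTables(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(Arrays.equals(yytable, loaded[0]));
    }
}
```

In the parser itself, the InputStream would presumably come from something like Class.getResourceAsStream with the resource's name (whatever name the build gives it), rather than from an in-memory buffer. The point of the design is that the table data never appears in any method's bytecode, so its size no longer matters to the compiler or verifier.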
Btw, after several folks were so enthused about using Antlr instead, it's really a shame to be going out of our way at this point to maintain the yacc-generated parser. If this inspires anyone to have a go at it, now would be a great time, as the Antlr folks have just announced a new improved v2.7.0. I would be ready to start integrating the result back in after FC'00, so let's say the beginning of March. Any takers?
Bug 7: Perls Before Twine
After fixing the above problem, I figured everything would finally just work under Java-2. After all, E used to work under Java-2 -- I remember testing it. Apparently this was before we integrated in the Perl5 regular expression engine from OROMatcher. Turns out the version we grabbed back then wasn't Java-2 compatible. No problem. They've posted a new version which is. I grabbed it, plugged it in, and the problem went away. If only they were all like this.
Btw, this is another case of it's-a-shame-to-be-maintaining-the-wrong-solution. The right solution is the gnu regex package. It would be great if someone rebuilt our Perl5 quasi-parser out of the gnu regex package. March, again, will be a good time for me to reintegrate this. Any takers?
Cheers,
--MarkM