Belated Bug Report & Re: OROMatcher News

Mark S. Miller markm@caplet.com
Thu, 01 Jun 2000 15:32:48 -0700


At 12:08 PM 5/23/00, Dan Bornstein wrote:
>[...] Salient excerpt:
>
>     ORO Java Software 
>
>     In June, OROMatcher 2.0, [...]
>     The software will be released under the Apache license, which is a
>     BSD-style license [...]

Thanks.  This looks good.  We've previously discussed switching to gnu.regex, 
since it is open-source and OROMatcher wasn't.  But now, since

1) we're already using OROMatcher, 
2) this license is much less restrictive than the LGPL that covers the 
     gnu.regex competition, 
3) incorporation into Jakarta means OROMatcher should remain well supported, 
4) source availability will allow us to add the feature to OROMatcher that 
     we need to fix a bug in the rx quasi-parser,

I'd say we should just stick with OROMatcher.  Are there any compelling 
arguments for switching to gnu.regex that I'm missing?

Note: Since the source isn't yet available, we can't yet fix the bug 
mentioned in #4 above.  The bug?  The rx quasi-parser must be able to tell 
exactly which open-parens in the pattern start a counted (capturing) nested 
pattern.  Given all the escapes and parse-conditioning flags in the Perl5 
regex language, we'd either have to rewrite an accurate Perl5 regex parser 
from the spec, which is too hard, or obtain this information from the parser 
we're wrapping, which guarantees that we can't get out of sync with it.  The 
latter should be trivial, but the existing OROMatcher interface denies us 
that information.

In the meantime, we are using some simpler rules that work for almost all 
regular expressions that people actually write.
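
For illustration only, here is a minimal sketch of what "simpler rules" of 
this kind might look like.  It is not the actual rx quasi-parser code; the 
class and method names (GroupOpenFinder, capturingOpens) are hypothetical.  
It assumes that an open-paren is counted unless it is backslash-escaped, 
sits inside a [...] character class, or begins a (?...) construct.  It 
deliberately ignores parse-conditioning flags such as /x, so a paren inside 
an extended-mode comment would be miscounted, which is the sort of case 
where rules like these go wrong.

```java
import java.util.ArrayList;
import java.util.List;

public class GroupOpenFinder {
    /**
     * Heuristically returns the indices of the open-parens in a Perl5-style
     * pattern that start a counted (capturing) group.  Assumption: no flags
     * such as /x are in effect.
     */
    public static List<Integer> capturingOpens(String pattern) {
        List<Integer> result = new ArrayList<>();
        boolean inClass = false;
        int n = pattern.length();
        for (int i = 0; i < n; i++) {
            char c = pattern.charAt(i);
            if (c == '\\') {
                i++;                      // skip the escaped character
            } else if (inClass) {
                if (c == ']') {
                    inClass = false;      // end of the character class
                }
            } else if (c == '[') {
                inClass = true;
                // A ']' directly after '[' or '[^' is a literal, not a close.
                if (i + 1 < n && pattern.charAt(i + 1) == '^') i++;
                if (i + 1 < n && pattern.charAt(i + 1) == ']') i++;
            } else if (c == '(') {
                // "(?" introduces a non-counted construct such as (?:...).
                boolean special = i + 1 < n && pattern.charAt(i + 1) == '?';
                if (!special) {
                    result.add(i);
                }
            }
        }
        return result;
    }
}
```

For the pattern (?:a)(b), this reports only the second open-paren (index 5) 
as starting a counted group.  Getting the same answer from the wrapped 
parser itself would remove the need for any such guessing.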

(I thought I'd already posted a bug report on this, but I just searched the 
archive & couldn't find it.)

I'd sent Daniel Savarese (the ORO open source coordinator) email about this
ages ago, and received no response.  Shortly, we'll be able to just fix it 
ourselves, and submit the change for incorporation into the main line.


         Cheers,
         --MarkM