[ksh93-integration-discuss] [osol-code] extremely bad performance of Solaris regex
Roland Mainz
roland.mainz at nrubsig.org
Wed Oct 31 18:56:42 PDT 2007
Jens Elkner wrote:
[Please keep ksh93-integration-discuss at opensolaris.org in the CC: field
for AST-related topics since not all people there are subscribed to all
other opensolaris-org lists, too]
> On Wed, Oct 31, 2007 at 01:04:55PM -0400, Glenn Fowler wrote:
> > > Jens Elkner wrote:
> > > >                regex.c                       RegexTest.java  MHz
> > > > Solaris sparc  13.12u 0.01s 0:13.17 99.6%    5569 ms         1503
> > > > Solaris x86    6.28u 0.00s 0:06.29 99.8%     2571 ms         2813
> > > > Linux i686     0.704u 0.004s 0:00.70 100.0%  5587 ms         2079.593
> >
> > I found RegexTest.java but it takes as input a file of patterns/string "regex.txt"
>
> Ahh - good.
>
> > and it looks like that file is user-specified
>
> Yes - the RE is hardcoded, and as input file (first arg to each prog) one
> may choose any file [which yields the worst case, aka no match]. For the
> numbers above I've chosen an email from my SUN contact, which contains
> matching grant info wrt. FUL aka EDU - so I guess not really public stuff.
>
> elkner.q ~/tmp > wc xt
> 6617 6883 427011 xt
>
> Anyway, I produced the numbers again with this file, for those who want to
> fetch it:
> http://iws.cs.uni-magdeburg.de/~elkner/regex/mk.log
>
> elkner.q ~/tmp > wc /tmp/mk.log
> 2633 27847 325535 /tmp/mk.log
>
>                regex.c                       RegexTest.java  MHz
> Solaris sparc  12.14u 0.00s 0:12.25 99.1%    5324 ms         1503
> Solaris x86    5.85u 0.00s 0:05.86 99.8%     2190 ms         2813
> Linux          0.676u 0.020s 0:00.69 100.0%  4447 ms         2079.593
Mhhh... it would be nice to get separate numbers for LC_ALL=C and
LC_ALL=en_US.UTF-8 ... and maybe for LC_ALL=zh_CN.GB18030, too (since
some GNU stuff has hardcoded hacks for UTF-8 to speed things up (while
ignoring anything related to "standards" ... ;-( ))
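
A minimal sketch of how such a per-locale measurement could be taken in C,
assuming the input lines are already in memory. The function name
time_regex_in_locale, the struct locale_run, the use of clock(), and the
REG_EXTENDED | REG_NOSUB flags are illustrative assumptions; none of them
are taken from the regex.c discussed in this thread.

```c
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>
#include <time.h>

/* Result of one timed run (hypothetical struct, not from regex.c). */
struct locale_run {
    size_t matches;   /* number of lines the pattern matched */
    double cpu_secs;  /* CPU time spent in the regexec() loop, via clock() */
};

/*
 * Hypothetical helper: switch the process to `locale_name`, compile
 * `pattern` as a POSIX extended RE, and match it against every line.
 * Returns 0 on success, -1 if setlocale() rejects the locale,
 * -2 if the pattern does not compile.
 * The process locale stays switched on return.
 */
int time_regex_in_locale(const char *locale_name, const char *pattern,
                         const char *const *lines, size_t nlines,
                         struct locale_run *out)
{
    regex_t re;
    clock_t start, stop;
    size_t i;

    assert(locale_name != NULL && pattern != NULL && out != NULL);

    /* setlocale() returns NULL when it cannot honour the request;
       report that instead of silently measuring the previous locale. */
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;

    /* Compile only after the switch, so the RE is built for this locale. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -2;

    out->matches = 0;
    start = clock();
    for (i = 0; i < nlines; i++) {
        if (regexec(&re, lines[i], 0, NULL, 0) == 0)
            out->matches++;
    }
    stop = clock();
    out->cpu_secs = (double)(stop - start) / CLOCKS_PER_SEC;

    regfree(&re);
    return 0;
}
```

Calling it once each with "C", "en_US.UTF-8" and "zh_CN.GB18030" on the same
pattern and input would give the three figures to compare. A -1 return means
the locale could not be selected on that machine, so that figure should be
skipped rather than reported.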
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 7950090
(;O/ \/ \O;)