[ksh93-integration-discuss] Re: Some comments/questions about ksh93_20060831...

Roland Mainz roland.mainz at nrubsig.org
Fri Sep 8 13:50:38 PDT 2006


Glenn Fowler wrote:
[CC:'ing Werner Fink (to take a look at the GNU/Linux issue here) and
Ienup Sung (because it's i18n-related)]
> On Sun, 3 Sep 2006 01:34:30 +0200 Roland Mainz wrote:
> > I have a few questions/comments about ksh93_20060831:
> > 1. On SuSE Linux x86 all multibyte characters, even German "öäü" are
> > broken in "gmacs" and "emacs" editing ... ;-((
> > (the code has been compiled with % (LANG=C LC_ALL=C CC="gcc
> > -I${PWD}/arch/linux.i386/src/lib/libast" CCFLAGS="-fPIC
> > -fno-strict-aliasing -g -pipe -Wall -Wno-missing-braces
> > -Wno-unknown-pragmas -Wno-parentheses -Wno-uninitialized
> > -D_map_libc=1" ./bin/package make) 2>&1 | tee -a buildlog.log # ... I
> > hope the "-I${PWD}/arch/linux.i386/src/lib/libast"-hack isn't causing
> > this problem).
> 
> dgk and I looked into this
> the problem is that the gnu/linux <ctype.h> isprint() macro seems to ignore
> 8-bit char locale specifics -- the few we tested printed the same
> isprint() results for 0..255 for LC_ALL=C and LC_ALL=de
> so for the 3 German characters "öäü"
> isprint() returns 0, and this throws ksh into a
> "display unprintable character" mode
> 
> dgk changed the code to use iscntrl() with some more logic to do the
> right thing even when isprint() doesn't

Ok... thanks! :-)
... but note that I hit the problem with LANG=en_US.UTF-8 and
LC_ALL=en_US.UTF-8 - that's a multibyte locale while "de" is usually a
link to "de_DE.ISO8859-1" or "de_DE.ISO8859-15" (which are both
single-byte locales in both Solaris and Linux) ... ;-(
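
For reference, the difference between the two cases can be checked with
a small test program. This is only a rough sketch, not the ksh93 code:
the function names count_printable_8bit() and printable_mb() are made up
for illustration. The first counts how many byte values 0..255 isprint()
accepts in a given locale (so the result for "C" can be compared with a
single-byte locale such as "de"), the second decodes multibyte input
with mbrtowc() and asks iswprint() instead, which is what a UTF-8 locale
needs because bytes >= 0x80 are not characters on their own there.

```c
#include <ctype.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Count the byte values 0..255 that isprint() accepts in the named
 * locale. Returns -1 if the locale cannot be selected. The previous
 * LC_CTYPE setting is restored before returning. */
int count_printable_8bit(const char *locale_name)
{
    char saved[256];
    const char *cur = setlocale(LC_CTYPE, NULL);
    int c, n = 0;

    /* Copy the name: the returned pointer may be reused by setlocale(). */
    snprintf(saved, sizeof saved, "%s", cur ? cur : "C");
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    for (c = 0; c < 256; c++)
        if (isprint(c))
            n++;
    setlocale(LC_CTYPE, saved);
    return n;
}

/* Return 1 if every character of s is printable according to the
 * current LC_CTYPE, decoding multibyte sequences first; return 0 on a
 * non-printable character or an invalid/incomplete sequence. */
int printable_mb(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);

    memset(&st, 0, sizeof st);
    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return 0;
        if (n == 0)          /* embedded NUL; cannot happen within strlen() */
            break;
        if (!iswprint((wint_t)wc))
            return 0;
        s += n;
        left -= n;
    }
    return 1;
}
```

Comparing count_printable_8bit("C") with the value for an installed
single-byte locale should show whether isprint() honours the locale's
8-bit characters, while calling printable_mb("öäü") after
setlocale(LC_CTYPE, "en_US.UTF-8") exercises the multibyte case (the
locale names here are examples and must exist on the test machine).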

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)


