[ksh93-integration-discuss] Re: Some comments/questions about ksh93_20060831...

Roland Mainz roland.mainz at nrubsig.org
Tue Sep 12 14:25:07 PDT 2006


Glenn Fowler wrote:
> 
> On Fri, 08 Sep 2006 22:50:38 +0200 Roland Mainz wrote:
> > Glenn Fowler wrote:
> > [CC:'ing Werner Fink (to take a look at the GNU/Linux issue here) and
> > Ienup Sung (because it's i18n-related)]
> > > On Sun, 3 Sep 2006 01:34:30 +0200 Roland Mainz wrote:
> > > > I have a few questions/comments about ksh93_2006083:
> > > > 1. On SuSE Linux x86 all multibyte characters, even german "öäü" are
> > > > broken in "gmacs" and "emacs" editing ... ;-((
> > > > (the code has been compiled with % (LANG=C LC_ALL=C CC="gcc
> > > > -I${PWD}/arch/linux.i386/src/lib/libast" CCFLAGS="-fPIC
> > > > -fno-strict-aliasing -g -pipe -Wall -Wno-missing-braces
> > > > -Wno-unknown-pragmas -Wno-parentheses -Wno-uninitialized
> > > > -D_map_libc=1" ./bin/package make) 2>&1 | tee -a buildlog.log # ... I
> > > > hope the "-I${PWD}/arch/linux.i386/src/lib/libast"-hack isn't causing
> > > > this problem).
> 
> > Ok... thanks! :-)
> > ... but note that I hit the problem with LANG=en_US.UTF-8 and
> > LC_ALL=en_US.UTF-8 - that's a multibyte locale while "de" is usually a
> > link to "de_DE.ISO8859-1" or "de_DE.ISO8859-15" (which are both
> > single-byte locales in both Solaris and Linux) ... ;-(
> 
> this is where I have a disconnect with utf8 vs 8bit vs 7bit locales
> you set LC_ALL to utf8 for a ksh
> you mouse snarf the 8bit german chars above and paste into ksh edit mode
> how does the edit sw know that the snarfed text is 3 8bit chars
> and not some utf8 encoding?

In the normal scenario the user logs in to a graphical environment (for
example CDE, where the "xdm" variant is called "dtlogin"; it has a
dropdown for selecting the language+encoding, which is then set in the
LANG and LC_* variables for the login session that follows), and all
applications started from there run in the same locale. In that case
pasting any text (including multibyte characters) from one terminal into
another terminal just copies bytes around. This is how we do all our
testing: we log in to CDE or KDE with a specific locale set, and all
applications then have the same LANG and LC_* variables set.
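
To illustrate why the shared locale matters, here is a minimal sketch in
Python (not anything ksh93 itself does; the helper name interpret_paste
is made up for this example). The pasted bytes carry no encoding tag, so
the receiving side can only decode them according to its own locale:

```python
def interpret_paste(raw: bytes, encoding: str) -> str:
    """Decode pasted bytes the way a reader using `encoding` would.

    Undecodable bytes become U+FFFD so a mismatch is visible
    instead of raising an exception.
    """
    return raw.decode(encoding, errors="replace")


# The German characters from the report: o-umlaut, a-umlaut, u-umlaut.
text = "\u00f6\u00e4\u00fc"

latin1_bytes = text.encode("iso8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")        # two bytes per character

# Same locale on both sides: copying the bytes is lossless.
same_latin1 = interpret_paste(latin1_bytes, "iso8859-1")
same_utf8 = interpret_paste(utf8_bytes, "utf-8")

# Source in ISO8859-1, destination in UTF-8: the bytes are not valid UTF-8.
latin1_read_as_utf8 = interpret_paste(latin1_bytes, "utf-8")

# Source in UTF-8, destination in ISO8859-1: every byte becomes a character.
utf8_read_as_latin1 = interpret_paste(utf8_bytes, "iso8859-1")
```

As long as both terminals run in the same locale the copy is exact; the
two mismatched cases are the ones that need a conversion step somewhere
between source and destination.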

A more complicated scenario occurs when multiple graphical applications
run in different locales:
Originally X11 invented the COMPOUND_TEXT method, which carries the
characters in their native format (the locale of the source application)
together with hints about the encoding the data are in. The destination
application (or better: the toolkit libraries) is then responsible for
converting the data into the locale of the destination application.
Unfortunately some of today's toolkits like GTK+/Gnome completely ignore
COMPOUND_TEXT (AFAIK because the authors were too dumb to understand the
concept behind COMPOUND_TEXT; sorry for this harsh comment, but when I
read the XFree86+Gnome archives and the rants about COMPOUND_TEXT I
really do not understand why it was so difficult to use) and "reinvented
the wheel" by introducing a new text property called "UTF8" (where both
source and destination application convert the text from/to UTF-8).
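
The "both sides convert from/to UTF-8" idea can be sketched roughly like
this in Python. This only shows the general conversion scheme, not the
Xlib or toolkit API; the function names export_selection and
import_selection are hypothetical:

```python
def export_selection(raw: bytes, source_encoding: str) -> bytes:
    """Source side: convert from the source application's locale
    encoding to UTF-8 for interchange."""
    return raw.decode(source_encoding).encode("utf-8")


def import_selection(utf8_data: bytes, dest_encoding: str) -> bytes:
    """Destination side: convert the UTF-8 interchange data to the
    destination application's locale encoding.

    Characters the destination encoding cannot represent become '?'.
    """
    return utf8_data.decode("utf-8").encode(dest_encoding, errors="replace")


# Source application runs in an ISO8859-15 locale and selects the three
# German characters (o-umlaut, a-umlaut, u-umlaut).
text = "\u00f6\u00e4\u00fc"
source_bytes = text.encode("iso8859-15")

wire = export_selection(source_bytes, "iso8859-15")

# Destination applications running in different locales.
pasted_utf8 = import_selection(wire, "utf-8")
pasted_latin1 = import_selection(wire, "iso8859-1")
pasted_ascii = import_selection(wire, "ascii")
```

Each side only has to know its own locale's encoding; the destination
never needs to be told which locale the source was running in.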

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)


