Re: abiword in german?

From: <melodramus_at_online.de>
Date: Mon May 17 2010 - 06:38:17 CEST

On Mon, 17 May 2010 08:36:41 +1000
Martin Sevior <msevior@gmail.com> wrote:

> Hi melodramus,
>
> Sorry I don't understand your bug report. Our code to decipher the
> various levels of environment variables is in:
>
> src/wp/ap/gtk/ap_UnixPrefs.cpp::::overlayEnvironmentPrefs(void)
>
> Can you suggest a patch that would get things right for you?
>
> Cheers
>
> Martin

sorry twice, for you not understanding my bug report and for me not
understanding the code in the mentioned file. it seems to be a hacked
interim solution. however, i found the error:

on my system only LC_ALL is set. but the call setlocale(LC_MESSAGES,
NULL) sits behind an #if that tests preprocessor macros, not my
environment: it needs both LC_MESSAGES and UNDEF to be defined at
compile time. UNDEF is apparently never defined, so the call is dropped
from the code. abiword tries getenv("LANG") instead, which is also not
successful, and falls back to en_US. there we have it! if abiword had
called setlocale(LC_MESSAGES, NULL) at runtime, it would have gotten
"de_DE.UTF-8", which works just fine!
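
to show what i mean, here is a minimal sketch of the runtime query in
plain c++ (no glib). the function name message_locale_from_environment
is made up for this mail and is not abiword code. LC_MESSAGES comes from
POSIX, so this assumes a POSIX system:

```cpp
#include <clocale>
#include <string>

// Hypothetical helper, not part of AbiWord: asks the C library which
// locale it selected for messages after reading the environment.
std::string message_locale_from_environment()
{
    // "" means: initialise every category from LC_ALL / LC_* / LANG.
    // If the requested locale is not installed, this call fails and
    // the previously active locale stays in effect.
    std::setlocale(LC_ALL, "");

    // A null pointer means: query only, change nothing.
    const char *name = std::setlocale(LC_MESSAGES, nullptr);
    return name ? name : "C";
}
```

with only LC_ALL=de_DE.UTF-8 exported and that locale installed, i would
expect this to hand back that name. if the locale is not installed, the
query simply reports whatever was active before.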

critique of the code, as far as i understood it:

    if (!m_bUseEnvLocale)
        return; // nothing to do...

i did not track this down and am not sure what this is for (windows?)

#if 1
(???)

    // TODO use various POSIX env variables
    // TODO (such as LANG and LC_*) to compute
    // TODO a name in our locale namespace
    // TODO (see .../src/wp/ap/xp/ap_*_Languages.h)
    (...)

    // make a copy of the current locale so we can set it back
    char *old_locale = g_strdup(setlocale(LC_ALL, NULL));

as far as i see you store the default 'C' locale for whatever
purpose.

    // this will set our current locale information
    // according to the user's env variables
    setlocale(LC_ALL, "");
    (...)

now you set the locale from the environment properly but don't save it.
i'd expect the step above to come after this one.

    // now, which of the categories should we use?
    // we used to use LC_CTYPE, but decided that LC_MESSAGES
    // was a better idea (most likely, all of LC_* are the same)

depends, what for? all the LC_* are used for different purposes. don't
know why you need to decide between them here. see:

http://www.gnu.org/software/libc/manual/html_node/Locale-Categories.html

        
    const char * szNewLang = "en-US"; // default to US English
#if defined (LC_MESSAGES) && defined (UNDEF) // raphael
// #if defined (LC_MESSAGES)
    char * lc_ctype = g_strdup(setlocale(LC_MESSAGES, NULL));

why is this #if'ed and thus decided at compile time? i don't understand
the special purpose of this strategy. there is an easy way to check at
runtime: if only some categories are set, the rest are set to 'C' (or
possibly NULL) by default. this is a good hint that the system is badly
configured. my tip is: check if a category is set to 'C' or NULL. if so,
fall back to en_US and show a warning dialog. don't fiddle with a bad
setting. if 'C' was chosen intentionally, falling back to en_US does no
harm. but accepting 'C' right away is a bad choice because the user may
not notice the problem.
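
the check i have in mind could look roughly like this. it is only a
sketch: looks_unconfigured and choose_ui_locale are names i invented,
the warn_user flag stands in for the warning dialog, and treating
"POSIX" and the empty string like 'C' is my own addition:

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: true if the name returned by
// setlocale(category, NULL) suggests nothing was configured.
bool looks_unconfigured(const char *locale_name)
{
    return locale_name == nullptr
        || *locale_name == '\0'
        || std::strcmp(locale_name, "C") == 0
        || std::strcmp(locale_name, "POSIX") == 0; // alias for "C"
}

// Hypothetical helper: falls back to en_US for an unconfigured
// category and tells the caller to warn the user about it.
std::string choose_ui_locale(const char *locale_name, bool &warn_user)
{
    warn_user = looks_unconfigured(locale_name);
    return warn_user ? "en_US" : locale_name;
}
```

the caller would pass in the result of setlocale(LC_MESSAGES, NULL) and
show the dialog whenever warn_user comes back true.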

i see a misunderstanding of what happens with the environment values.
the logic behind them is straightforward. LANG is only the
lowest-priority fallback: it fills all categories with a default before
a selected number of categories are overridden by LC_* variables. this
is all done inside setlocale(LC_ALL, "") and is of no interest to the
app. the variable LANG is of no interest to the app. forget about it.

LC_ALL comes first in the hierarchy and overrides all others. only if
it is not set are the other LC_* variables considered. if some
categories are left unset, as said, LANG is considered. if LANG is also
not set, the default locale 'C' is chosen (or possibly NULL on weird
systems) for all unset categories. that's it.
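
written down as code, the lookup order for one category is just the
following. this is a sketch that models the environment as a std::map
so it can be tried without touching the real one. Env and
effective_locale are made-up names, and the real work is of course done
inside setlocale(LC_ALL, ""):

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the process environment.
using Env = std::map<std::string, std::string>;

// Resolves one category (e.g. "LC_MESSAGES") in the order
// LC_ALL, then the category's own variable, then LANG, then "C".
// An empty value counts as unset.
std::string effective_locale(const Env &env, const std::string &category)
{
    const std::string order[] = {"LC_ALL", category, "LANG"};
    for (const std::string &key : order) {
        auto it = env.find(key);
        if (it != env.end() && !it->second.empty())
            return it->second;
    }
    return "C";
}
```

on my system the map would contain only LC_ALL, so every category
resolves to its value and LANG never comes into play.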

there is only one more thing to consider: if there is a mixed
configuration, setlocale(LC_ALL, NULL) returns a string containing
*all* category settings separated by semicolons (at least with glibc).
only use that string for restoring the previous locale.
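
for the save-and-restore part, something like the following would do.
again only a sketch with a class name (ScopedLocale) of my own. the one
important detail is to copy the returned string right away, because a
later setlocale() call may overwrite the buffer it points to:

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: switches a category to a temporary locale and
// restores the previous setting when the object goes out of scope.
class ScopedLocale {
public:
    ScopedLocale(int category, const char *temporary)
        : category_(category)
    {
        // Query first and copy the result before changing anything.
        const char *current = std::setlocale(category, nullptr);
        saved_ = current ? current : "C";
        std::setlocale(category, temporary);
    }

    // The saved string is opaque; it is only passed back unchanged.
    ~ScopedLocale() { std::setlocale(category_, saved_.c_str()); }

    ScopedLocale(const ScopedLocale &) = delete;
    ScopedLocale &operator=(const ScopedLocale &) = delete;

private:
    int category_;
    std::string saved_;
};
```

the saved string is never parsed here, which is the point: whatever
format the library uses for a mixed configuration, it goes back in
exactly as it came out.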

#else
        char * lc_ctype = getenv("LANG");
        if (lc_ctype) lc_ctype = g_strdup(lc_ctype);

as said, drop this completely!

        else lc_ctype = g_strdup("en_US");

why lc_ctype if you first tried to assign LC_MESSAGES to it? decide
what you want! LC_CTYPE is for character classification and conversion,
LC_MESSAGES is for displayed messages. what do you want? the code is
really hard to grasp. should it possibly be rewritten?

#endif
        // locale categories seem to always look like this:
        // two letter for language (lowcase) _ two letter country code (upcase)
        // ie. en_US, es_ES, pt_PT
        // which goes to the Abiword format:
        // en-US, es-ES, pt-PT
    (...)

also, the locale names may have extensions, like on my system, if
different charsets are supported. i know about two styles:

de_DE.UTF-8 // on my system, chosen from the glibc README
de_DE@UTF-8 // seems to be more common

from the code i see that at least one style is known. but i don't
understand why you manipulate the locale string. is there a need for
this? better not fiddle with that at all. just put what is provided
into other calls and see how they react. they should react just fine
because they know better how to treat locale strings.
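
if the string really has to be mapped to the en-US form from the comment
above, a sketch that at least tolerates the extensions could look like
this. to_abiword_lang is a name i made up, and i am assuming the POSIX
layout language[_territory][.codeset][@modifier] for the input:

```cpp
#include <string>

// Hypothetical helper: turns a locale name such as "de_DE.UTF-8"
// into the hyphenated "de-DE" form. Anything after the first '.' or
// '@' (codeset or modifier) is dropped.
std::string to_abiword_lang(const std::string &locale_name)
{
    // find_first_of returns npos if neither character occurs, and
    // substr(0, npos) then keeps the whole string.
    std::string base = locale_name.substr(0, locale_name.find_first_of(".@"));

    // Replace the language/territory separator, if there is one.
    std::string::size_type sep = base.find('_');
    if (sep != std::string::npos)
        base[sep] = '-';
    return base;
}
```

names without a territory, like plain "C", pass through unchanged, so
the caller still has to decide what to do with them.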

i'm not a c++ programmer and don't want to struggle more with abiword.
sorry! but my tip is that you rewrite it.

MeloDramus <melodramus@online.de>
Received on Mon May 17 06:38:15 2010