Cross-compiled ncurses displays UTF-8 content incorrectly

Question

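I have an embedded system with a vt52u (Unicode) terminal emulator that displays UTF-8 characters correctly. However, programs I cross-compiled (such as vim and python) fail to display Unicode characters correctly on this terminal when they use the ncursesw library.

I have searched, but nothing online seems to help...

The Bash shell has these variables set: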

LANG=en_US.UTF-8
LD_LIBRARY_PATH=/mnt/sd/lib  # all libraries, cross compiled, are located here
TERM=vt52u
TERMCAP='vt52u|vt52 with UTF-8:am:eo:rs=\Ee\Eb0\Eco:is=\EE\Ee:nl=^j:sr=\Ei:bl=^g:ta=^i:ho=\EH:cr=^m:le=\ED:nd=\EC:do=\EB:up=\EA:ta=^i:nw=^m:xn:cm=\EY%+ %+ :it#8:co#75:li#24:sc=\Ej:rc=\Ek:vi=\Ef:ve=\Ee:so=\Eb0\Ec4:se=\Eb0\Eco:mh=\Eb8\Eco:mr=\Ebo\Ec0:me=\Eb0\Eco:cl=\EH\EJ:cd=\EJ:ce=\EK:km:ku=^p:kd=^n:kr=^f:kl=^b:kb=^h:'
LOCALE=

I know the Unicode terminal itself works, because I can run the following program and it displays UTF-8 characters perfectly.

#!/bin/env python
#coding=UTF-8

charset = [
0x2205, 0x2629, 0x00B2, 0x2663, 0x2666, 0x00B1, 0x221A, 0x266B,
0x2190, 0x2524, 0x2500, 0x2534, 0x253C, 0x251C, 0x2193, 0x2191,
0x00B0, 0x2665, 0x00AE, 0x2660, 0x00B7, 0x00A4, 0x00A4, 0x00A4,
0x00D7, 0x00B5, 0x2126, 0x252C, 0x250C, 0x2510, 0x2514, 0x2518
]

for i in charset:
    print unichr( i ).encode("UTF-8"),
    if i == 0x2191: print
print


bash-4.3#./testunicode
bash-4.3# ∅ ☩ ² ♣ ♦ ± √ ♫ ← ┤ ─ ┴ ┼ ├ ↓ ↑
bash-4.3# ° ♥ ® ♠ · ¤ ¤ ¤ × µ Ω ┬ ┌ ┐ └ ┘

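But when I try to display box-drawing characters with curses (ncursesw) in a python program, or edit a UTF-8 text file with vim (which is linked against ncursesw), the screen shows invalid-Unicode boxes followed by extra symbols such as ~T~B, ~T~L, and many other variants. So for some reason ncursesw is emitting invalid UTF-8 characters...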
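One way to read those extra symbols, assuming they follow vim's convention of showing an unprintable byte in the range 0x80-0x9F as ~ plus a letter: ~T would be byte 0x94, ~B would be 0x82 and ~L would be 0x8C. Those happen to be the trailing bytes of the UTF-8 encodings of │ (E2 94 82) and ┌ (E2 94 8C), which would suggest the bytes are valid UTF-8 that is being handled one byte at a time, as if the locale were single-byte. A minimal sketch of that decoding follows; the helper names and the assumed 0xE2 lead byte are illustrative and not taken from the original output.

```python
def tilde_to_byte(token):
    """Map a '~X' token to a byte, assuming ~@..~_ stand for 0x80..0x9F."""
    if len(token) != 2 or token[0] != "~":
        raise ValueError("expected a token like '~T', got %r" % (token,))
    value = 0x80 + (ord(token[1]) - 0x40)
    if not 0x80 <= value <= 0x9F:
        raise ValueError("token %r is outside the ~@..~_ range" % (token,))
    return value


def reassemble(tokens, lead_byte=0xE2):
    """Prepend an assumed lead byte and decode the result as UTF-8."""
    raw = bytearray([lead_byte] + [tilde_to_byte(t) for t in tokens])
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Print code points rather than glyphs so this runs on any terminal.
    for pair in (["~T", "~B"], ["~T", "~L"]):
        print("%s -> U+%04X" % ("".join(pair), ord(reassemble(pair))))
```

If the reassembled code points come out as box-drawing characters, the problem is more likely the locale seen by ncursesw than the bytes the program wrote.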

I know the linking is correct and there are no conflicting ncurses libraries; only ncursesw is present.

I configured the ncurses-5.9 library for compilation like this:

./configure --prefix=/mnt/sd --without-cxx --without-cxx-binding --without-ada --without-manpages --without-progs --without-tests --without-curses-h --with-build-cc=gcc --with-shared --without-normal --without-debug --without-profile --without-gpm --without-dlsym --without-sysmouse --build=i686-linux --host=arm-linux-gnueabi --without-pthread --enable-widec --with-fallbacks=vt52 --with-terminfo-dirs=/etc/terminfo --disable-big-core --enable-termcap --with-termpath=/Data/termcap

As an example, the following program works fine on my desktop Linux ~VT102, but fails when run on the embedded system's vt52u:

#!/bin/env python
# coding=UTF-8
board=[
"","",
u"   ┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐",
u"   │   │   │   │   │   │   │   │   │   │   │   │   │   │   │   │   │",
u"   │   │ ├───→───→───→ │   │   │   │   │   │ ├───→───→───→ │   │   │" ]

board = ( i.encode("utf-8") for i in board )

import curses
import locale
locale.setlocale( locale.LC_ALL,"" )

screen = curses.initscr()
curses.noecho() # no echoing
curses.cbreak() # no keyboard buffering

# Setup the playing board screen. 
for i,s in enumerate( board ): screen.addstr( i,0,s )
screen.refresh()

import time
time.sleep(2)

# Return everything to normal
curses.echo()
curses.nocbreak()
curses.endwin()

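Edit: I found a possible cause of this error: when glibc was cross-compiled, it did not get correct paths to the /mnt/sd/share/i18n/charmap or /mnt/sd/share/i18n/locales directories. It only has partial paths; Make or configure prefixed them with a wrong '/' instead of the target system's actual root path, /mnt/sd. When the files cannot be found, glibc apparently falls back to the "C" or "POSIX" locale and ignores the environment variables.

I'm not sure whether I need to recompile glibc and hand-edit the makefile, or whether there is a way to set the paths manually after the system is built... Any ideas?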
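To check on the target whether the locale really is falling back to "C", a small script along these lines could be run there. It is only a sketch: the function name is made up for illustration, and it relies on Python's standard locale module, which wraps the C library's setlocale() and nl_langinfo().

```python
import locale


def effective_codeset():
    """Return (locale_loaded, codeset) for the current environment."""
    try:
        # "" asks the C library to pick the locale from LANG / LC_* variables.
        locale.setlocale(locale.LC_ALL, "")
        loaded = True
    except locale.Error:
        # Raised when the C library cannot load the requested locale.
        loaded = False
    # The character encoding the C library believes is in effect.
    return loaded, locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    loaded, codeset = effective_codeset()
    print("locale loaded: %s, codeset: %s" % (loaded, codeset))
```

With LANG=en_US.UTF-8, a working setup would be expected to report UTF-8. A failed load, or an ASCII codeset such as ANSI_X3.4-1968, would be consistent with glibc not finding its locale data, in which case ncursesw would have no reason to treat the input as multibyte.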

character-encoding utf-8 embedded terminal-emulator ncurses cross-compiling locale glibc

1 Answer
