Discussion:
[Bug localedata/23502] New: gconv(UTF-8 to GB18030)
286000435 at qq dot com
2018-08-10 04:56:27 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=23502

Bug ID: 23502
Summary: gconv(UTF-8 to GB18030)
Product: glibc
Version: 2.17
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: 286000435 at qq dot com
CC: libc-locales at sourceware dot org
Target Milestone: ---

This is a bug in gconv (UTF-8 to GB18030).
It affects glibc 2.17 and later (>= 2.17).
The following code does not produce the correct result.

#include <iconv.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print len bytes of buf as space-separated hex. */
void print_buf(const char* buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i) {
        printf("%02X ", (unsigned char)buf[i]);
    }
    printf("\n");
}

int main(int argc, char** argv)
{
    char s[] = "早見純";
    iconv_t cd = iconv_open("GB18030", "UTF-8");
    /* iconv_open returns (iconv_t)-1 on failure */
    if (cd == (iconv_t)-1) {
        printf("iconv_open fail\n");
        return 1;
    }
    size_t inbytesleft = sizeof(s) - 1;
    char dst[6 * sizeof(s)] = { 0 };
    size_t outbytesleft = sizeof(dst) - 1;
    char* inbuf = s;
    char* outbuf = dst;

    if (iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft) ==
        (size_t)-1) {
        printf("iconv fail: %d\n", errno);
    }
    iconv_close(cd);
    size_t n = strlen(dst);
    /* size_t needs %zu, not %u */
    printf("inbytesleft: %zu, outbytesleft: %zu, dst len: %zu\n", inbytesleft,
           outbytesleft, n);
    print_buf(dst, n);
    return 0;
}


This produces the following erroneous output:
iconv fail: 84
inbytesleft: 4, outbytesleft: 72, dst len: 6
D4 E7 D2 8A BC 83


But I get the correct result with glibc 2.12:
inbytesleft: 0, outbytesleft: 69, dst len: 8
D4 E7 D2 8A BC 83 FE 52
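
For readers interpreting the failing run: on Linux, errno 84 is EILSEQ. Below is a minimal sketch of turning the errno values documented for iconv(3) into readable reasons, so that a report prints more than a bare number. The helper name iconv_error_reason is hypothetical and not part of glibc.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper (not a glibc function): map the errno values
   that iconv(3) documents to a short human-readable reason.
   Any other value falls back to strerror(). */
const char *iconv_error_reason(int err)
{
    switch (err) {
    case EILSEQ:
        return "EILSEQ: invalid multibyte sequence in the input";
    case EINVAL:
        return "EINVAL: incomplete multibyte sequence at end of input";
    case E2BIG:
        return "E2BIG: not enough room in the output buffer";
    default:
        return strerror(err);
    }
}
```

In the test program above, this could replace the bare number, for example printf("iconv fail: %s\n", iconv_error_reason(errno));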
--
You are receiving this mail because:
You are on the CC list for the bug.
286000435 at qq dot com
2018-08-10 04:59:03 UTC

lance <286000435 at qq dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Severity|normal |critical
--
286000435 at qq dot com
2018-08-10 05:08:27 UTC

lance <286000435 at qq dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Summary|gconv(UTF-8 to GB18030) |gconv(GB18030 to UTF-8)
--
286000435 at qq dot com
2018-08-10 10:39:15 UTC

lance <286000435 at qq dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Summary|gconv(GB18030 to UTF-8) |gconv(UTF-8 to GB18030)
--
286000435 at qq dot com
2018-08-10 10:39:50 UTC

lance <286000435 at qq dot com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Summary|gconv(UTF-8 to GB18030) |gconv(GB18030 to UTF-8)
--
286000435 at qq dot com
2018-08-10 05:12:38 UTC

--- Comment #1 from lance <286000435 at qq dot com> ---
This is a bug in gconv (GB18030 to UTF-8).
--
schwab@linux-m68k.org
2018-08-10 12:04:01 UTC

Andreas Schwab <***@linux-m68k.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Severity|critical |normal
--
schwab@linux-m68k.org
2018-09-05 14:04:25 UTC

Andreas Schwab <***@linux-m68k.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|UNCONFIRMED |WAITING
Last reconfirmed| |2018-09-05
Ever confirmed|0 |1

--- Comment #2 from Andreas Schwab <***@linux-m68k.org> ---
What is the exact contents of the array s?
--
286000435 at qq dot com
2018-09-12 08:55:06 UTC

--- Comment #3 from lance <286000435 at qq dot com> ---
(In reply to Andreas Schwab from comment #2)
> What is the exact contents of the array s?

Here is the new code, which also prints the contents of s:

#include <iconv.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print len bytes of buf as space-separated hex. */
void print_buf(const char* buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; ++i) {
        printf("%02X ", (unsigned char)buf[i]);
    }
    printf("\n");
}

int main(int argc, char** argv)
{
    char s[] = "早見純";
    /* Dump the input bytes, including the terminating NUL */
    print_buf(s, sizeof(s));
    iconv_t cd = iconv_open("GB18030", "UTF-8");
    /* iconv_open returns (iconv_t)-1 on failure */
    if (cd == (iconv_t)-1) {
        printf("iconv_open fail\n");
        return 1;
    }
    size_t inbytesleft = sizeof(s) - 1;
    char dst[6 * sizeof(s)] = { 0 };
    size_t outbytesleft = sizeof(dst) - 1;
    char* inbuf = s;
    char* outbuf = dst;

    if (iconv(cd, &inbuf, &inbytesleft, &outbuf, &outbytesleft) ==
        (size_t)-1) {
        printf("iconv fail: %d\n", errno);
    }
    iconv_close(cd);
    size_t n = strlen(dst);
    /* size_t needs %zu, not %u */
    printf("inbytesleft: %zu, outbytesleft: %zu, dst len: %zu\n", inbytesleft,
           outbytesleft, n);
    print_buf(dst, n);
    return 0;
}

The contents of array s are:
E6 97 A9 E8 A6 8B E7 B4 94 EE A0 97 00
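
The byte dump above can be decoded by hand to see which code points the input really contains. Below is a minimal sketch of a UTF-8 decoder for that purpose. The helper name utf8_decode is hypothetical, and the decoder is deliberately simplified (it does not reject overlong forms or surrogates). Applied to the dump, the first nine bytes give U+65E9, U+898B and U+7D14 (早見純), and the last three bytes, EE A0 97, give U+E817. That code point lies in the Private Use Area (U+E000 to U+F8FF), which may be why only three characters are visible in the string literal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at p,
   reading at most len bytes. Stores the code point in *cp and
   returns the number of bytes consumed, or 0 if the sequence is
   malformed or truncated.
   Simplified: overlong encodings and surrogates are not rejected. */
size_t utf8_decode(const unsigned char *p, size_t len, uint32_t *cp)
{
    size_t n, i;
    uint32_t c;

    if (len == 0)
        return 0;

    if (p[0] < 0x80) {                  /* 0xxxxxxx: ASCII */
        *cp = p[0];
        return 1;
    } else if ((p[0] & 0xE0) == 0xC0) { /* 110xxxxx: 2 bytes */
        n = 2; c = p[0] & 0x1F;
    } else if ((p[0] & 0xF0) == 0xE0) { /* 1110xxxx: 3 bytes */
        n = 3; c = p[0] & 0x0F;
    } else if ((p[0] & 0xF8) == 0xF0) { /* 11110xxx: 4 bytes */
        n = 4; c = p[0] & 0x07;
    } else {
        return 0;                       /* stray continuation or invalid lead */
    }

    if (len < n)
        return 0;                       /* truncated sequence */

    for (i = 1; i < n; ++i) {
        if ((p[i] & 0xC0) != 0x80)      /* must be 10xxxxxx */
            return 0;
        c = (c << 6) | (p[i] & 0x3F);
    }
    *cp = c;
    return n;
}
```

Calling utf8_decode in a loop over the 12 input bytes, advancing by the returned length each time, walks the four characters in order.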