VB.NET - Converting char from CP437 encoding to UTF-8 encoding always yields the same character code, thus not the same character
Problem
I'm trying to convert a character and/or byte array from CP437 encoding to UTF-8 (Encoding.UTF8). The problem is that no matter what I try, the code yields the same character code, and since the two encodings have different sets of characters mapped to their character codes, the resulting char is not the same.
As an example, I'm trying to convert the character with char code 3 in CP437 (a heart: ♥) to UTF-8, and I still want it to be the same character. When converting to UTF-8 it still uses char code 3, which results in a control character called ETX (see UTF-8's codepage layout for a list of characters).
My attempts
Here are some of my attempts:
(General code)
    Public Shared ReadOnly CP437 As Encoding = Encoding.GetEncoding("IBM437")
    Public Shared ReadOnly BytesToConvert As Byte() = New Byte(3 - 1) {3, 4, 5} 'Characters: ♥, ♦, ♣.

    'Decodes the bytes with the given encoding and shows the resulting string and the raw bytes.
    Public Sub DebugEncodedArray(ByVal Bytes As Byte(), ByVal Encoding As Encoding)
        Dim ResultingString As String = Encoding.GetString(Bytes)
        MessageBox.Show( _
            String.Format("Encoding: {1}{0}" & _
                          "String: ""{2}""{0}" & _
                          "Bytes: {{{3}}}{0}", _
                          Environment.NewLine, _
                          Encoding.EncodingName, _
                          ResultingString, _
                          String.Join(", ", Bytes)), _
            "Debug", MessageBoxButtons.OK, MessageBoxIcon.Information _
        )
    End Sub
Using Encoding.Convert():
    Dim ConvertedBytes As Byte() = Encoding.Convert(CP437, Encoding.UTF8, BytesToConvert)
    DebugEncodedArray(ConvertedBytes, Encoding.UTF8)
Using a StreamWriter, writing to a MemoryStream with a specific encoding:
    Using MStream As New MemoryStream(16)
        Using Writer As New StreamWriter(MStream, CP437)
            Writer.Write(CP437.GetChars(BytesToConvert))
        End Using
        Dim UTF8Bytes As Byte() = Encoding.Convert(CP437, Encoding.UTF8, MStream.ToArray())
        DebugEncodedArray(UTF8Bytes, Encoding.UTF8)
    End Using
Writing to a file, then reading it back and converting the bytes (not optimal, but this is what I need the code for):
    File.WriteAllText("C:\Users\Vincent\Desktop\test.txt", CP437.GetString(BytesToConvert), CP437)
    Dim FileBytes As Byte() = File.ReadAllBytes("C:\Users\Vincent\Desktop\test.txt")
    Dim UTF8Bytes As Byte() = Encoding.Convert(CP437, Encoding.UTF8, FileBytes)
    DebugEncodedArray(UTF8Bytes, Encoding.UTF8)
Results
All of the above attempts give the same result:
And if I pass CP437 to DebugEncodedArray() instead of Encoding.UTF8:
Expected result
The result I'm expecting is:
    Dim UTF8Bytes As Byte() = Encoding.UTF8.GetBytes("♥♦♣")
    DebugEncodedArray(UTF8Bytes, Encoding.UTF8)
Any clues on what I'm doing wrong?
The low range of CP437 is contextual. I think you have proven that 1-31 and 127 are going to need a simple lookup, because .NET is interpreting them in the control code context and not in the graphical context - i.e. ◙ (0xA) is decoded as \n, not as the equivalent Unicode code point for the graphic.
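The lookup suggested above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the module name CP437GraphicsSketch, the function DecodeCP437Graphical and the LowGlyphs table are hypothetical names introduced here, and the glyph table follows the conventional graphical rendering of CP437 bytes 1-31 and 127. It also assumes the source file is saved as UTF-8 and that "IBM437" is available (on .NET Core and later this usually requires registering CodePagesEncodingProvider first).

```vbnet
Imports System
Imports System.Text

Module CP437GraphicsSketch
    ' Hypothetical lookup table: graphical glyphs for CP437 bytes 1 through 31, in order.
    Private Const LowGlyphs As String = "☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"
    ' Graphical glyph for CP437 byte 127.
    Private Const DeleteGlyph As Char = "⌂"c

    Private ReadOnly CP437 As Encoding = Encoding.GetEncoding("IBM437")

    ' Decodes CP437 bytes, treating 1-31 and 127 as graphics rather than control codes.
    Public Function DecodeCP437Graphical(ByVal Bytes As Byte()) As String
        ' CP437 is a single-byte encoding, so Chars has one entry per input byte.
        Dim Chars As Char() = CP437.GetChars(Bytes)
        For i As Integer = 0 To Bytes.Length - 1
            Dim b As Byte = Bytes(i)
            If b >= 1 AndAlso b <= 31 Then
                Chars(i) = LowGlyphs(b - 1)
            ElseIf b = 127 Then
                Chars(i) = DeleteGlyph
            End If
        Next
        Return New String(Chars)
    End Function

    Sub Main()
        Dim Decoded As String = DecodeCP437Graphical(New Byte() {3, 4, 5})
        Dim UTF8Bytes As Byte() = Encoding.UTF8.GetBytes(Decoded)
        ' Prints E2-99-A5-E2-99-A6-E2-99-A3, the UTF-8 bytes of "♥♦♣".
        Console.WriteLine(BitConverter.ToString(UTF8Bytes))
    End Sub
End Module
```

Bytes from 32 to 126 and from 128 to 255 are left to the regular CP437 decoder, so only the contextual range is overridden. Note that this treats every byte in that range as a graphic, so a CP437 text that uses 0xA or 0xD as real line breaks would need those bytes excluded from the lookup.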