VB.NET - Converting char from CP437 encoding to UTF-8 encoding always yields the same character code, thus not the same character


Problem

I'm trying to convert a character and/or a byte array from CP437 encoding to UTF-8 (Encoding.UTF8). The problem is that no matter what I try, the code always yields the same character code, and since the two encodings map different characters to those codes, the resulting char is not the same.

As an example, I'm trying to convert the character with char code 3 from CP437 (a heart: ♥) to UTF-8, and I still want it to be the same character. After converting to UTF-8 it still uses char code 3, which results in a control character called ETX (see UTF-8's codepage layout for a list of characters).
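
To make the symptom concrete, here is a minimal console sketch (not part of my actual project). It assumes .NET Framework, where "IBM437" is available without registering a code page provider; the module name Cp437Symptom is made up for illustration.

```vbnet
Imports System
Imports System.Text

Module Cp437Symptom
    Sub Main()
        ' Assumption: .NET Framework. On .NET Core / .NET 5+ the code page
        ' encoding provider has to be registered before this call works.
        Dim cp437 As Encoding = Encoding.GetEncoding("IBM437")

        ' Byte 3 is decoded as the control character U+0003 (ETX),
        ' not as the heart glyph U+2665.
        Dim decoded As String = cp437.GetString(New Byte() {3})
        Console.WriteLine("Decoded code point: U+{0:X4}", AscW(decoded(0)))

        ' UTF-8 encodes U+0003 as the single byte 03, so the value never changes.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(decoded)))

        ' The heart glyph itself needs three UTF-8 bytes: E2-99-A5.
        Dim heart As String = ChrW(&H2665).ToString()
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetBytes(heart)))
    End Sub
End Module
```

So the byte value stays at 3 because the decoded character is already U+0003 before any UTF-8 encoding happens.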


My attempts

Here are some of my attempts:

(General code)

    ' Needed at the top of the file:
    Imports System.IO
    Imports System.Text
    Imports System.Windows.Forms

    Public Shared ReadOnly CP437 As Encoding = Encoding.GetEncoding("IBM437")
    Public Shared ReadOnly BytesToConvert As Byte() = New Byte(3 - 1) {3, 4, 5} 'Characters: ♥, ♦, ♣.

    ' Decodes the bytes with the given encoding and shows the string and raw bytes.
    Public Sub DebugEncodedArray(ByVal Bytes As Byte(), ByVal Encoding As Encoding)
        Dim ResultingString As String = Encoding.GetString(Bytes)
        MessageBox.Show( _
                String.Format("Encoding: {1}{0}" & _
                              "String: ""{2}""{0}" & _
                              "Bytes: {{{3}}}{0}", _
                              Environment.NewLine, _
                              Encoding.EncodingName, _
                              ResultingString, _
                              String.Join(", ", Bytes)), _
            "Debug", MessageBoxButtons.OK, MessageBoxIcon.Information _
        )
    End Sub

Using Encoding.Convert():

    Dim ConvertedBytes As Byte() = Encoding.Convert(CP437, Encoding.UTF8, BytesToConvert)
    DebugEncodedArray(ConvertedBytes, Encoding.UTF8)


Using a StreamWriter, writing to a MemoryStream with a specific encoding:

    Using MStream As New MemoryStream(16)
        Using Writer As New StreamWriter(MStream, CP437)
            Writer.Write(CP437.GetChars(BytesToConvert))
        End Using

        Dim UTF8Bytes As Byte() = Encoding.Convert(CP437, Encoding.UTF8, MStream.ToArray())
        DebugEncodedArray(UTF8Bytes, Encoding.UTF8)
    End Using


Writing to a file, reading it back, and then converting the bytes (not optimal for what I need the code for):

    File.WriteAllText("c:\users\vincent\desktop\test.txt", CP437.GetString(BytesToConvert), CP437)

    Dim FileBytes As Byte() = File.ReadAllBytes("c:\users\vincent\desktop\test.txt")
    Dim UTF8Bytes As Byte() = Encoding.Convert(CP437, Encoding.UTF8, FileBytes)

    DebugEncodedArray(UTF8Bytes, Encoding.UTF8)


Results

All of the above attempts give the same result:

[Image: UTF-8 result]

And if I pass CP437 to DebugEncodedArray() instead of Encoding.UTF8:

[Image: CP437 result]


Expected result

The result I'm expecting is:

    Dim UTF8Bytes As Byte() = Encoding.UTF8.GetBytes("♥♦♣")
    DebugEncodedArray(UTF8Bytes, Encoding.UTF8)

[Image: expected UTF-8 result]

Any clues on what I'm doing wrong?

The low range of CP437 is contextual. I think you have proven that bytes 1-31 and 127 are going to need a simple lookup, because .NET interprets them in their control-code context rather than their graphical context. For example, ◙ (0xA) is decoded as \n, not as the Unicode code point of the equivalent graphic.
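
A minimal sketch of such a lookup, assuming .NET Framework (where "IBM437" is available without registering a code page provider). The names Cp437Graphics, LowGlyphs and DecodeCp437Graphical are hypothetical. The table lists the Unicode code points of the glyphs that CP437 displays for bytes 1-31, and 127 is handled separately.

```vbnet
Imports System
Imports System.Text

Module Cp437Graphics
    ' Unicode code points of the CP437 graphical glyphs; index = byte value.
    ' Index 0 (NUL) is a placeholder and is never used for substitution.
    Private ReadOnly LowGlyphs As Integer() = New Integer() { _
        &H0, &H263A, &H263B, &H2665, &H2666, &H2663, &H2660, &H2022, _
        &H25D8, &H25CB, &H25D9, &H2642, &H2640, &H266A, &H266B, &H263C, _
        &H25BA, &H25C4, &H2195, &H203C, &HB6, &HA7, &H25AC, &H21A8, _
        &H2191, &H2193, &H2192, &H2190, &H221F, &H2194, &H25B2, &H25BC}

    ' Hypothetical helper: decodes CP437 bytes, treating 1-31 and 127 as graphics.
    Function DecodeCp437Graphical(ByVal bytes As Byte()) As String
        Dim cp437 As Encoding = Encoding.GetEncoding("IBM437")
        ' CP437 is a single-byte encoding, so chars and bytes line up one to one.
        Dim chars As Char() = cp437.GetChars(bytes)
        For i As Integer = 0 To bytes.Length - 1
            Dim b As Integer = bytes(i)
            If b >= 1 AndAlso b <= 31 Then
                chars(i) = ChrW(LowGlyphs(b))
            ElseIf b = 127 Then
                chars(i) = ChrW(&H2302) ' House glyph
            End If
        Next
        Return New String(chars)
    End Function

    Sub Main()
        Dim text As String = DecodeCp437Graphical(New Byte() {3, 4, 5})
        Dim utf8Bytes As Byte() = Encoding.UTF8.GetBytes(text)
        ' Prints E2-99-A5-E2-99-A6-E2-99-A3, the UTF-8 bytes of the three suit glyphs.
        Console.WriteLine(BitConverter.ToString(utf8Bytes))
    End Sub
End Module
```

Note that this always picks the graphical meaning, so real line breaks or tabs stored as CP437 bytes 10, 13 or 9 would also be turned into glyphs. Whether that is wanted depends on where the bytes come from.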

