Monday, 24 September 2007

Encoding disaster in ASP.NET

Yesterday I created a few pages containing Russian text and uploaded them to my hosting provider, which is located in the States.

The Russian letters turned into Kryakozyabras (weird text with umlauts). Such things happen when Russian letters are read with the wrong single-byte encoding, typically a Western European code page (cutting off the 8th bit produces similar garbage, only in plain ASCII).

Further study revealed the following facts:
The problem is only with the static text; dynamically added text appears fine.
The problem is with User controls and regular pages; master pages work fine.
The problem happens with all foreign hosting servers.

What was really weird:
Viewing the file in the built-in viewer of my FTP client showed that the file was fine.
The response encoding and the charset were both UTF-8.
The hex editor showed me that the umlauts were actually 2-byte, but different from Russian 2-byte letters.

And the answer is..
.. Visual Studio was saving my files in the default system encoding, that is, Windows-1251. While that was fine for the runtime on my PC, it certainly wasn't for the remote server. So, for all the talk about internationalization, Visual Studio doesn't encode its own files properly by default. What's even worse, I couldn't find a global setting to change that -- I had to resave each file manually.
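
For anyone who wants to see the effect without a hosting account, here is a minimal Python sketch of what probably happened, plus a rough stand-in for the manual resaving. It assumes the remote server read the Windows-1251 bytes as a Western single-byte code page (Latin-1 here, which matches Windows-1252 for the Cyrillic letter range) and then sent the result out as UTF-8. The function names `simulate_mojibake` and `resave_as_utf8` are made up for this illustration, and the BOM choice is my assumption, not something I verified against Visual Studio.

```python
from pathlib import Path


def simulate_mojibake(text: str) -> str:
    """Encode text as Windows-1251, then misread the bytes as Latin-1.

    Assumption: this mimics a server whose default code page is Western
    European reading a file that was saved in Windows-1251.
    """
    return text.encode("cp1251").decode("latin-1")


def resave_as_utf8(path: Path, source_encoding: str = "cp1251") -> bool:
    """Rewrite one file as UTF-8 with a BOM; return True if it was changed.

    Files that already decode cleanly as UTF-8 are left alone, so running
    this twice does not garble a file that was converted earlier.
    """
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (or plain ASCII): nothing to do
    except UnicodeDecodeError:
        pass
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8-sig"))  # "utf-8-sig" prepends a BOM
    return True


original = "Привет"
garbled = simulate_mojibake(original)

print(garbled)                              # Ïðèâåò
print(original.encode("utf-8").hex(" "))    # 2 bytes per letter, starting d0/d1
print(garbled.encode("utf-8").hex(" "))     # 2 bytes per letter, starting c3
```

Both strings come out as two bytes per character in UTF-8, just different ones, which is consistent with what the hex editor showed. To convert a whole site, something like `for p in Path("site").rglob("*.aspx"): resave_as_utf8(p)` should do, where `site` is a placeholder folder name; try it on a copy first.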
