Upgrading Character Encoding
Audience
This section is only relevant to OpenX administrators who are upgrading from OpenX 2.4 or earlier to OpenX 2.6. If you are installing OpenX, or you are performing a different upgrade, you do not need to read this.
Additionally, if all manager users (formerly known as Admin/Agency accounts) in your OpenX installation have been using the interface in English (or another language with a pure ASCII character set) then this change will not affect you.
The problem with encoding
In OpenX 2.4 and earlier, language packs were provided for a number of languages. Each language pack used its own character encoding and set the user's browser to use that encoding. (In 2.0.11 and earlier, some language packs did not specify an encoding and relied on the browser to choose one.)
The encoding used by the browser was, in turn, the encoding of the data stored in user-entered fields in the database.
In OpenX 2.6 we have had to change the encoding of these language packs to UTF-8. This was required by, among other things, the change to users and accounts: items can now be edited by different users who may have selected different languages. Without the switch to UTF-8, any data created by one user and then modified by another with different language settings would have become corrupted.
We have now converted all OpenX language packs to UTF-8 encoding, so it no longer matters if multiple users with different languages edit each other's data.
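The corruption described above can be reproduced outside OpenX. The following is a minimal Python sketch, not OpenX code: the sample text and variable names are assumptions chosen for illustration. It stores Czech text as iso-8859-2 bytes, reads the same bytes back as windows-1251 (as a user with the Russian (cp1251) language pack would have), and then shows that a UTF-8 round trip preserves the text.

```python
# Illustrative sketch only; none of these names come from OpenX.
original = "Žluťoučký kůň"  # sample Czech text (an assumption for this example)

# Bytes as they would be stored when the Czech language pack set iso-8859-2.
stored = original.encode("iso-8859-2")

# A second user whose language pack sets windows-1251 interprets the same bytes.
misread = stored.decode("windows-1251")

# If that user saves the record, the misread characters are what gets written back.
corrupted = misread != original

# With UTF-8 used by every language pack, the bytes mean the same thing to everyone.
utf8_roundtrip = original.encode("utf-8").decode("utf-8")

print("read under the wrong encoding:", misread)
print("read under UTF-8:", utf8_roundtrip)
```

The stored bytes never change in this example; only the encoding used to interpret them does, which is why a single shared encoding removes the problem.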
What this means
For the majority of users the upgrade and data conversion should be completely automatic.
In cases where automatic detection is not possible, we do not modify any user data. However, we have provided a tool that can be used to change the encoding of user data after the upgrade.
Do I need to do anything?
Probably not. The upgrader should have converted any applicable data in your database into UTF-8 encoding.
Appendix
The upgrade system will apply the following conversion rules based on the manager's selected language:
| Account's language | Encoding converted from |
|---|---|
| Chinese (big5) | big5 |
| Czech | iso-8859-2 |
| French | iso-8859-15 |
| Hebrew | windows-1255 |
| Hungarian | iso-8859-2 |
| Korean | EUC-KR |
| Polish | iso-8859-2 |
| Portuguese | iso-8859-15 |
| Russian (cp1251) | windows-1251 |
| Russian (koi8r) | koi8-r |
| Chinese (gb2312), 2.0 upgrades only | gb2312 |
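
The rules in the table above can be sketched as a simple lookup followed by a re-encode. This is a hypothetical Python illustration, not the actual OpenX upgrader: the names `SOURCE_ENCODINGS`, `LEGACY_2_0_ENCODINGS`, and `to_utf8` are invented here, and the behaviour of leaving data untouched when no rule matches is an assumption based on the statement that user data is not modified when automatic detection is not possible.

```python
# Hypothetical sketch of the conversion rules; not taken from the OpenX source.
SOURCE_ENCODINGS = {
    "Chinese (big5)": "big5",
    "Czech": "iso-8859-2",
    "French": "iso-8859-15",
    "Hebrew": "windows-1255",
    "Hungarian": "iso-8859-2",
    "Korean": "euc-kr",
    "Polish": "iso-8859-2",
    "Portuguese": "iso-8859-15",
    "Russian (cp1251)": "windows-1251",
    "Russian (koi8r)": "koi8-r",
}

# Rule listed in the table as applying to 2.0 upgrades only.
LEGACY_2_0_ENCODINGS = {
    "Chinese (gb2312)": "gb2312",
}


def to_utf8(raw: bytes, language: str, upgrading_from_2_0: bool = False) -> bytes:
    """Re-encode raw bytes to UTF-8 using the rule for the account's language."""
    encoding = SOURCE_ENCODINGS.get(language)
    if encoding is None and upgrading_from_2_0:
        encoding = LEGACY_2_0_ENCODINGS.get(language)
    if encoding is None:
        # Assumption: with no matching rule, the data is left exactly as it was.
        return raw
    return raw.decode(encoding).encode("utf-8")


# Usage: text stored under the Russian (cp1251) language pack.
legacy_bytes = "Привет".encode("windows-1251")
converted = to_utf8(legacy_bytes, "Russian (cp1251)")
print(converted.decode("utf-8"))
```

Languages that are absent from the table, such as English, fall through the lookup and are returned unchanged, which matches the note in the Audience section that pure ASCII installations are unaffected.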
