Argh... Unicode in HTML problems
We are trying to make our webapp more international, and currently support English, Spanish, Portuguese, German, French, and a bunch of other languages. ISO-8859-1 seems to be a good enough code set for these languages.
We will soon have to support Greek. o_O It's a completely different code set, ISO-8859-7.
So, my question is, should we change code sets based on user locale, or just go with UTF-8? We're running all Java, so all Strings are internally Unicode.
If I try to switch to UTF-8, lots of characters that used to render fine turn into a square character or a question mark. Is this just because my machine can't display those characters? Do I have to convert all Strings into HTML entities like so? ` Or can I just display them normally like so? –
So confusing...
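To make the HTML entity option concrete, here is a minimal sketch of what "convert everything into entities" could look like in Java. The class name `HtmlEntityDemo` and the helper `toNumericEntities` are made up for illustration and are not from any library. It only rewrites non-ASCII characters as numeric character references and does not escape `&`, `<` or `>`.

```java
class HtmlEntityDemo {
    // Hypothetical helper: replaces every non-ASCII code point with a
    // decimal numeric character reference such as &#8211;.
    // ASCII characters pass through unchanged (no escaping of & < >).
    static String toNumericEntities(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u2013 is an en dash, \u03b1 is a Greek small alpha
        System.out.println(toNumericEntities("a \u2013 \u03b1")); // prints "a &#8211; &#945;"
    }
}
```

Output like this is plain ASCII, so it should survive whatever charset the page is served in, at the cost of bigger and less readable markup.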
10 Comments
hey i read this book, it had a section on unicode, not just technical details but an overview, it might help you work out some of the answers yourself
http://www.joelonsoftware.com/articles/Unicode.html
so far ive never had to deal with it, the funny thing is, i actually still deal with IBM EBCDIC here, the less famous father of ASCII
I know that if you don't have the proper character sets on your system then yes you would see a lot of [] [] [] []
hm so, is that good then, when you see those squares? maybe...
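On squares versus question marks, one thing that can be checked from Java: a question mark often means the character was lost when the String was encoded into a charset that has no slot for it, while a square usually means the character arrived intact but the font has no glyph. A rough sketch of the first case, assuming Greek text gets pushed through ISO-8859-1 (the class and method names here are made up for illustration):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Encode to bytes and decode again with the same charset.
    // String.getBytes substitutes the charset's replacement byte
    // ('?' for ISO-8859-1) for characters it cannot represent.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    // True if every character in text can be represented in the charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The Greek word for "Greek", written with escapes so the
        // source file's own encoding does not matter.
        String greek = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac";

        System.out.println(canEncode(greek, StandardCharsets.ISO_8859_1)); // false
        System.out.println(roundTrip(greek, StandardCharsets.ISO_8859_1)); // ????????
        System.out.println(roundTrip(greek, StandardCharsets.UTF_8).equals(greek)); // true
    }
}
```

So if the page shows question marks, the data was probably damaged somewhere on the server side; if it shows squares, the bytes may well be fine and the viewing machine is just missing a font.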
if it's UTF-8, and you're viewing in English, the change should be transparent
IIRC the reason for UTF-8 was to allow the extended encodings while allowing English to still line up ASCII style
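A quick way to see the "lines up ASCII style" part for yourself: plain English text produces exactly the same bytes in UTF-8 as in US-ASCII, and only characters outside ASCII take more than one byte. This is just a sketch, and the class and method names are invented for the example:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8AsciiDemo {
    // True if the UTF-8 bytes of text are identical to its US-ASCII bytes.
    static boolean sameAsAscii(String text) {
        return Arrays.equals(
                text.getBytes(StandardCharsets.UTF_8),
                text.getBytes(StandardCharsets.US_ASCII));
    }

    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(sameAsAscii("Hello"));   // true
        System.out.println(utf8Length("Hello"));    // 5
        System.out.println(sameAsAscii("\u03b1"));  // false (Greek alpha)
        System.out.println(utf8Length("\u03b1"));   // 2
        System.out.println(utf8Length("\u2013"));   // 3 (en dash)
    }
}
```

That is why existing English pages should look the same after a switch to UTF-8, as long as the page actually declares UTF-8 as its charset.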
ask the brothers who code www.watchtower.org. They're super international. Even intergalactic planetary.
nice, they made something called MEPS-16 which should work great! thanks
:(
but how do they translate the WEBSITE, (i'm not talking about the Mags and Literature)
LOL
hahaha
yikes thank goodness i haven't had to deal with this yet, sorry bud