Character encoding and character sets are not that difficult to understand, but many people blithely stumble through the world of programming without knowing what to actually do about them, or say, “Ah, it’s a job for those internationalization experts.”
source: http://htmlpurifier.org/docs/enduser-utf8.html#findcharset