[nycphp-talk] SimpleXML - UTF8
Adrian Noland
anoland at indigente.net
Sat Oct 17 17:10:08 EDT 2009
I have this handy function I pulled from somewhere else. Does it help?
Apologies if the actual characters don't come across in the email.
/**
 * Scrubs additional HTML entities that are not in PHP's
 * get_html_translation_table().
 * Currently bug #34577 in the bugs.php.net database.
 * $a1 lists characters that commonly appear in the listing description
 * and are not escaped.
 * $a2 maps most of them, index by index, to an accepted format, the
 * correct HTML entity, or a blank.
 *
 * @param string $string string to scrub
 * @return string the cleaned string
 */
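public static function xmlStringScrub($string) {
    $a1 = array("�","�","�","�", "�","�", "�", "�", "�", "�",
        "�","�","�","�","�", "�", "�");
    $a2 = array(".","-","•","", "'","'", '"', '"', "-", "-", ",",
        "^",",","","€", "®", "™");
    // Convert every character htmlentities() knows into its entity.
    $string = htmlentities($string, ENT_QUOTES);
    // Swap each leftover character in $a1 for the $a2 entry at the same index.
    $string = str_replace($a1, $a2, $string);
    // utf8_encode() treats its input as ISO-8859-1 and returns UTF-8.
    $string = utf8_encode($string);
    return $string;
}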
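
For comparison, here is a minimal sketch of the same idea for newer PHP versions, where utf8_encode() is deprecated (as of PHP 8.2). It is not from the original post. The function name xmlSafeUtf8 and the $sourceEncoding parameter are made up for illustration, and it assumes the input really is Windows-1252 and that the mbstring extension is loaded. Instead of a hand-kept replacement table, it converts the bytes to UTF-8 first and then escapes only the XML special characters.

```php
<?php
// Hypothetical helper, not part of the original post.
// Assumes $input is encoded as $sourceEncoding (Windows-1252 by default)
// and that the mbstring extension is available.
function xmlSafeUtf8(string $input, string $sourceEncoding = 'Windows-1252'): string
{
    // Re-encode first, so curly quotes, dashes, etc. become real UTF-8 characters.
    $utf8 = mb_convert_encoding($input, 'UTF-8', $sourceEncoding);

    // Escape only the XML special characters (&, <, >, and both quote types).
    return htmlspecialchars($utf8, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Usage: the bytes 0x93, 0x94, 0x96 and 0x99 are the Windows-1252 curly
// quotes, en dash and trademark sign.
$clean = xmlSafeUtf8("\x93Hi\x94 \x96 5 < 6 \x99");

$xml = new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><listing/>');
$xml->addChild('description', $clean);
echo $xml->asXML();
```

Whether this fits depends on where the text comes from. If the source is already UTF-8, the conversion step would corrupt it, so the encoding has to be known up front.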