[nycphp-talk] Determine the text language
Carlos A Hoyos
cahoyos at us.ibm.com
Fri Nov 9 09:36:36 EST 2007
> How can I use a regular expression to determine the language of a text,
> i.e. whether the selected text is English, Arabic, Hebrew, etc.?
You can't use a regular expression to determine language, or at least not
a very simple one. Each language has certain particularities, such as
characteristic letter combinations, and if you test enough of these
statistically you can get an accurate determination.
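To illustrate the statistical idea, here is a rough sketch in Python (not PHP, and not the method used by any particular tool). It counts three-letter combinations in a text and compares them against per-language profiles. The sample sentences, the function names and the choice of cosine similarity are all my own assumptions for illustration; a real detector would build its profiles from much larger text collections.

```python
from collections import Counter

# Tiny illustrative samples (an assumption for this sketch, not real corpora).
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and then the dog "
               "sleeps in the sun while the fox runs through the forest",
    "spanish": "el rápido zorro marrón salta sobre el perro perezoso y luego "
               "el perro duerme al sol mientras el zorro corre por el bosque",
}

def trigrams(text):
    """Count overlapping three-character sequences in lowercased text."""
    padded = " " + " ".join(text.lower().split()) + " "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def similarity(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sum(c * c for c in a.values()) ** 0.5
    norm_b = sum(c * c for c in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# One trigram profile per language, built from the samples above.
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}

def guess_language(text):
    """Return the language whose profile is most similar to the text."""
    target = trigrams(text)
    return max(PROFILES, key=lambda lang: similarity(target, PROFILES[lang]))

print(guess_language("the dog is in the forest"))
print(guess_language("el perro corre por el bosque"))
```

With profiles this small the guess is only reliable for text that resembles the samples; the more text you test, the better the statistics get.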
I'm not aware of any PHP tools (but watch me be corrected on this list ;-).
I suggest you look at the language guessing tool here:
http://languid.cantbedone.org/ It's not in PHP, but you should be able to
invoke it from the command line or rewrite it in PHP.
Carlos Hoyos