[nycphp-talk] iterating through a multibyte string
Ben Sgro
ben at projectskyline.com
Thu Jan 14 16:58:11 EST 2010
Mike,
I've seen the exact opposite argument, that the family of functions
str_* are faster than regex.
And, um, PHP is implemented in C, so isn't all the work done in C at the
end of the day?
The str_* methods will be optimized, the (dynamic) regex will not.
I'm confused by your logic ...
- Ben
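
One way to settle the regex vs. string-function question is to time both
on the same input. The following is only a rough sketch, not code from the
thread: the helper names chars_regex, chars_mb and time_it are made up for
illustration, the mb_* calls assume the mbstring extension is loaded, and
the scalar type hints assume PHP 7 or later. Results will vary with PHP
version, string length and content, so treat any single run with caution.

```php
<?php
// Hypothetical helpers for illustration; none of these appear in the thread.

// Split a UTF-8 string into characters with a single regex call.
function chars_regex(string $s): array {
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
}

// Split the same string with the multibyte string functions
// (assumes the mbstring extension is available).
function chars_mb(string $s): array {
    $out = [];
    $len = mb_strlen($s, 'UTF-8');
    for ($i = 0; $i < $len; $i++) {
        $out[] = mb_substr($s, $i, 1, 'UTF-8');
    }
    return $out;
}

// Run $f on $s $n times and return the elapsed wall-clock seconds.
function time_it(callable $f, string $s, int $n = 200): float {
    $start = microtime(true);
    for ($i = 0; $i < $n; $i++) {
        $f($s);
    }
    return microtime(true) - $start;
}

// Sample input mixing 1-, 2- and 3-byte UTF-8 characters.
$sample = str_repeat('naïve café 日本語 ', 20);
printf("preg_split: %.4fs\n", time_it('chars_regex', $sample));
printf("mb_substr:  %.4fs\n", time_it('chars_mb', $sample));
```

Both helpers should return the same array for valid UTF-8 input, so the
comparison is at least like for like.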
Michael B Allen wrote:
> On Wed, Jan 13, 2010 at 2:24 PM, Chris Snyder <chsnyder at gmail.com> wrote:
>
>> On Wed, Jan 13, 2010 at 1:54 PM, Dan Cech <dcech at phpwerx.net> wrote:
>>
>>
>>> but still can't beat preg_split, most likely because of the overhead
>>> involved in overwriting $rest on every pass through the loop.
>>>
>> Regex wins a speed test. Nice thread!
>>
>
> The more work done in C the faster it will be. So even though there is
> the overhead of compiling the regular expression, it frequently turns
> out to be faster. This is why I always use regex for tokenizers when
> parsing complex data (I did this for a Creole Wiki markup
> interpreter).
>
> Note that you can also convert the string to something like UCS-2BE
> using iconv and then each character will occupy exactly 2 bytes in the
> resulting string allowing you to iterate over it in the mostly
> conventional way. This might turn out to be faster, but I'm not sure.
>
> Mike
>
>
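
The UCS-2BE idea from the quoted message could look roughly like the
sketch below. This is an illustration under stated assumptions, not Mike's
code: chars_via_ucs2 is a hypothetical name, the input is assumed to be
valid UTF-8, and it is assumed to contain only characters from the Basic
Multilingual Plane, because UCS-2 cannot represent anything outside it.
Combining sequences would still come back as separate units.

```php
<?php
// Hypothetical helper for illustration; not code from the thread.
// Assumes $utf8 is valid UTF-8 restricted to the Basic Multilingual Plane.
function chars_via_ucs2(string $utf8): array {
    $wide = iconv('UTF-8', 'UCS-2BE', $utf8);
    if ($wide === false) {
        throw new InvalidArgumentException('input could not be converted to UCS-2BE');
    }
    $chars = [];
    // After conversion every character occupies exactly 2 bytes,
    // so plain byte offsets can be used to step through the string.
    for ($i = 0, $n = strlen($wide); $i < $n; $i += 2) {
        $unit = substr($wide, $i, 2);
        // Convert each 2-byte unit back to UTF-8 for the caller.
        $chars[] = iconv('UCS-2BE', 'UTF-8', $unit);
    }
    return $chars;
}

print_r(chars_via_ucs2('aé日'));
```

Converting every unit back to UTF-8 adds one iconv call per character,
which may well cancel out any gain from the fixed-width loop; if only
code point values are needed, unpack('n*', $wide) on the converted string
would avoid that step.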