[nycphp-talk] filter_input misconceptions
Gary Mort
garyamort at gmail.com
Thu May 22 17:35:25 EDT 2014
On 05/22/2014 05:02 PM, Anthony Ferrara wrote:
> We learned **years** ago that this doesn't work. Magic_quotes was
> removed because of that. Register_globals was removed. Etc.
Yeah, and then filter.default was added back in, AND in such a way that
it can't be changed from within a script. I guess "we" didn't really
learn that it doesn't work. :-)
>
> The short of it, working around that setting is not going to work.
> Many people tried working around magic_quotes. Fail. The only actual
> solution is to set it to not do anything.
Except that with filter_input you can actually get access to the
original unfiltered data without having to do all sorts of checks. Just
call filter_input($type, $varName, $filter) with whatever filter you
want, including the raw filter (FILTER_UNSAFE_RAW), which doesn't do
anything.
Unfortunately, at the moment there are cases where
filter_input(INPUT_GET, ...) will not return data and $_GET[] will return
data. You actually have to call:
if (!filter_has_var(INPUT_GET, $varName)) { isset($_GET[$varName]); }
$var = filter_input(INPUT_GET, $varName, $filter);
While that is not elegant, I don't know of any instance where doing the
above fails to retrieve the correct data while using $var =
$_GET[$varName] does retrieve the correct data. Can you come up with a
concrete example?
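
For reference, the two lines above can be wrapped up as a small helper. This
is only a sketch: the name read_input() is made up for illustration, and
whether touching the superglobal is actually required will depend on the
server configuration. It simply mirrors the steps described above for GET
and POST.

```php
<?php
// Hypothetical helper (not part of PHP): read one request variable through
// the filter API, first touching the matching superglobal as described above.
function read_input(int $type, string $name, int $filter = FILTER_UNSAFE_RAW)
{
    if (!filter_has_var($type, $name)) {
        // Touch the superglobal so PHP populates it if it has not done so
        // yet. The result of isset() is deliberately ignored.
        if ($type === INPUT_GET) {
            isset($_GET[$name]);
        } elseif ($type === INPUT_POST) {
            isset($_POST[$name]);
        }
    }

    // null means the variable is not present; false means the filter
    // rejected the value.
    return filter_input($type, $name, $filter);
}
```

Usage would be something like $var = read_input(INPUT_GET, 'myvar'); which
returns the raw value because FILTER_UNSAFE_RAW is the default argument.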
>
> Anthony
>
> On Thu, May 22, 2014 at 4:44 PM, Gary Mort <garyamort at gmail.com> wrote:
>> It seems there are some misconceptions about the filter_* API. Recently I was
>> contacted by a colleague when his website went off kilter. All of a
>> sudden all the variables had extra HTML encoding characters in them, and
>> then, since they were encoded a second time when displayed, they would have
>> even more.
>>
>> This was on a server I had worked on a few months previously, and this was
>> not happening then. So I took a look at the configuration and discovered that
>> filter.default had been changed. It turned out that PHP on the server had
>> recently been upgraded from the CentOS repositories and the default settings
>> had changed.
>> http://us3.php.net/manual/en/filter.configuration.php#ini.filter.default
>>
>> Looking into it a bit more, I experimented with different options for the
>> various settings which control the creation of the superglobal variables,
>> after which I decided that in the future it is better to use
>> $var = filter_input(INPUT_GET, 'myvar', FILTER_UNSAFE_RAW); than $_GET
>> [though of course, if you are in a framework which provides an interface to
>> the HTTP variables, it is better to use the framework to be consistent].
>> Note FILTER_UNSAFE_RAW: this is not a security decision, it is a stability
>> decision.
>>
>> First off, if you use $_GET then you are ALSO using the filter API.
>> All superglobal variables are populated by passing them through the default
>> filter. By design, the default filter is FILTER_UNSAFE_RAW; however, there is
>> no way to change this from within your PHP code. It can only be set before the
>> execution of the PHP script [either in the php.ini file or, with Apache, you
>> can set a custom ini variable in .htaccess].
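
Since the setting can't be changed at run time, about all a script can do is
look at it. A minimal sketch, assuming the filter extension is loaded; both
function names are made up for illustration, and the check ignores
filter.default_flags:

```php
<?php
// Returns the configured filter.default value, or null when the setting
// is not available (for example, the filter extension is not loaded).
function configured_default_filter(): ?string
{
    $value = ini_get('filter.default');
    return $value === false ? null : $value;
}

// True when the superglobals should hold the data as the client sent it.
// Assumption: a missing setting is treated as "no default filter applied".
function superglobals_are_raw(): bool
{
    $value = configured_default_filter();
    return $value === null || $value === 'unsafe_raw';
}

if (!superglobals_are_raw()) {
    // A real application might log this instead of printing it.
    echo 'filter.default is set to: ', configured_default_filter(), "\n";
}
```

This only reports the situation. It is no substitute for asking for the
filter you want explicitly on each read.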
>>
>> Presuming my colleague gave me the correct information on where the upgrade
>> came from, it seems that the latest CentOS PHP packages instead use
>> FILTER_SANITIZE_FULL_SPECIAL_CHARS.
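
The double encoding my colleague saw can be reproduced without touching the
server configuration. This sketch uses filter_var() to stand in for what the
default filter would do to an incoming value; the sample string is made up.

```php
<?php
// The value as the client sent it (illustrative sample).
$sent = 'Tom & Jerry <b>';

// What a superglobal would hold if filter.default applied this filter.
$stored = filter_var($sent, FILTER_SANITIZE_FULL_SPECIAL_CHARS);

// The application then escapes on output, as it always did.
$displayed = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

echo $stored, "\n";    // Tom &amp; Jerry &lt;b&gt;
echo $displayed, "\n"; // Tom &amp;amp; Jerry &amp;lt;b&amp;gt;
```

The second line is what ends up in the page source, so the visitor sees the
literal entities instead of the original characters.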
>>
>> Since you [and by you I mean anyone publishing PHP code where they can't
>> control the server configuration it will be executed on] can't get away from
>> it being used, the best you can do is force it to do exactly what you want.
>> I.e., if you want raw data, use filter_input and FILTER_UNSAFE_RAW so you make
>> sure to get what you expect to get, and not something set by the server.
>>
>> In addition, the global variables $_GET, $_SERVER, $_ENV, $_SESSION,
>> $_COOKIE, $_REQUEST, and $_POST simply can't be trusted. Only filter_input
>> will give you access to the true data for 4 out of those 7 variables. It
>> does not give you access to $_SESSION, and while it does give you access to
>> INPUT_POST, there are a couple of edge cases where it will not provide the
>> post data.
>>
>> The php.ini settings filter.default, track_vars, and variables_order can all
>> change what is stored in the superglobals.
>> http://us3.php.net/manual/en/filter.configuration.php#ini.filter.default
>> http://us3.php.net/manual/en/ini.core.php#ini.track-vars
>> http://us3.php.net/manual/en/ini.core.php#ini.variables-order
>>
>> Without doing detailed checking of the various combinations from inside the
>> code, there is no way to tell which of those variables actually has data and
>> what filtering has already been done to them. In addition, $_SERVER may or
>> may not include both $_SERVER and $_ENV variables [running Google App
>> Engine's dev server they will be combined; when you are on the actual
>> production server they are not].
>>
>> With auto_globals_jit, the $_SERVER and $_ENV variables may or may not even
>> be available.
>> http://us3.php.net/manual/en/ini.core.php#ini.auto-globals-jit
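
One way to avoid caring whether a value landed in $_SERVER or $_ENV is to
ask for it in both places and fall back to getenv(). This is a sketch only:
server_or_env() is a made-up name, and how each SAPI fills INPUT_SERVER and
INPUT_ENV is something I have not checked everywhere.

```php
<?php
// Hypothetical helper: look a name up in the server variables, then the
// environment variables, without reading $_SERVER or $_ENV directly.
function server_or_env(string $name): ?string
{
    foreach ([INPUT_SERVER, INPUT_ENV] as $type) {
        $value = filter_input($type, $name, FILTER_UNSAFE_RAW);
        if (is_string($value)) {
            return $value;
        }
    }

    // Last resort: ask the process environment directly.
    $value = getenv($name);
    return $value === false ? null : $value;
}
```

For example, server_or_env('REQUEST_URI') returns the value from whichever
source has it, or null if none of them do.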
>>
>> Thanks to request_order, the order of variables in $_REQUEST can be anything
>> - so if I want a specific combination of possibilities, I retrieve that
>> combination rather than hope that request_order was not changed from the
>> default 'GPC'.
>> http://us3.php.net/manual/en/ini.core.php#ini.request-order
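
Retrieving a specific combination can look roughly like this. The function
name first_input() and its default source list are made up for illustration;
the caller decides which sources count and which one wins.

```php
<?php
// Hypothetical helper: return the value from the first listed source that
// has the variable. Sources are checked in the order the caller gives them,
// so the result does not depend on request_order.
function first_input(string $name, array $types = [INPUT_POST, INPUT_GET]): ?string
{
    foreach ($types as $type) {
        if (filter_has_var($type, $name)) {
            $value = filter_input($type, $name, FILTER_UNSAFE_RAW);
            // Array values (e.g. name[]=1) are not handled in this sketch.
            return is_string($value) ? $value : null;
        }
    }
    return null;
}
```

Listing INPUT_POST first means a posted value wins over a query string value
with the same name, whatever the server has in request_order.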
>>
>> $_POST is an even more interesting case. If you disable $_POST by
>> setting variables_order to 'EGCS' to exclude posted data, then filter_input
>> will not return any data for posted variables. auto_globals_jit provides an
>> even odder edge case. With just-in-time creation enabled, $_POST will not
>> be created when PHP is running unless you try to access something from the
>> array. This also affects whatever internal structure filter_input() uses,
>> and filter_input() does not trigger the creation of post variables. So if
>> you call filter_input() after calling something like
>> isset($_POST['anynonexistantvariable']), it works. If you call it before, it
>> does not.
>>
>> To safely deal with post data you need to parse it from php://input,
>> and php://input is inconsistent in that, depending on compile options, it
>> may be possible to read it multiple times, or it may be deleted after being
>> read.
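
A rough sketch of reading the body once and parsing it by hand, which keeps
it independent of the variable-registration settings. Both function names
are made up. It assumes an application/x-www-form-urlencoded body (PHP does
not expose multipart/form-data bodies through php://input) and it does not
handle name[] style arrays.

```php
<?php
// Decode an application/x-www-form-urlencoded body into a flat array.
// If a name appears more than once, the last value wins.
function parse_form_body(string $raw): array
{
    $fields = [];
    foreach (explode('&', $raw) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as a trailing "&"
        }
        $parts = explode('=', $pair, 2);
        $name  = urldecode($parts[0]);
        $value = isset($parts[1]) ? urldecode($parts[1]) : '';
        $fields[$name] = $value;
    }
    return $fields;
}

// Read the request body once and keep it, since a second read of
// php://input may come back empty on some builds.
function request_body(): string
{
    static $body = null;
    if ($body === null) {
        $body = (string) file_get_contents('php://input');
    }
    return $body;
}

// Usage inside a request handler (not run here):
// $post = parse_form_body(request_body());
```

Caching the body in a static variable is what guards against the
read-it-only-once case described above.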
>>
>>
>>
>>
>> _______________________________________________
>> New York PHP User Group Community Talk Mailing List
>> http://lists.nyphp.org/mailman/listinfo/talk
>>
>> http://www.nyphp.org/show-participation