How to Prevent Attacks With Proper Input Handling (Part 1)

Input handling is a key aspect of secure web design.  But what makes a good data validation and sanitization engine? The implementation depends greatly on the language and framework your site is built on.  However, best practices across IT security topics maintain that “whitelisting,” or “strict checking,” is the more secure way to validate.  The Open Web Application Security Project (OWASP) is an online community that produces freely available articles, methodologies, documentation, tools, and technologies in the field of web application security.  Below are some excerpts from their advisories on input validation.  After the quotes from OWASP, this article will use the terms “strict checking” and “accept-list” to refer to whitelisting, and “block-list” to refer to blacklisting.

OWASP Input Validation Cheat Sheet

Input validation is performed to ensure only properly formed data is entering the workflow in an information system, preventing malformed data from persisting in the database and triggering malfunction of various downstream components. Input validation should happen as early as possible in the data flow, preferably as soon as the data is received from the external party.

Input Validation should not be used as the primary method of preventing XSS, SQL Injection and other attacks which are covered in respective cheat sheets but can significantly contribute to reducing their impact if implemented properly.

It is a common mistake to use black list validation in order to try to detect possibly dangerous characters and patterns like the apostrophe ' character, the string 1=1, or the <script> tag, but this is a massively flawed approach as it is trivial for an attacker to bypass such filters.

White list validation is appropriate for all input fields provided by the user. White list validation involves defining exactly what IS authorized, and by definition, everything else is not authorized.

If it’s well structured data, like dates, social security numbers, zip codes, email addresses, etc. then the developer should be able to define a very strong validation pattern, usually based on regular expressions, for validating such input.

If the input field comes from a fixed set of options, like a drop down list or radio buttons, then the input needs to match exactly one of the values offered to the user in the first place.

The primary means of input validation for free-form text input should be:

  • Normalization: Ensure canonical encoding is used across all the text and no invalid characters are present.
  • Character category whitelisting: Unicode allows whitelisting categories such as “decimal digits” or “letters” which not only covers the Latin alphabet but also various other scripts used globally (e.g. Arabic, Cyrillic, CJK ideographs etc).
  • Individual character whitelisting: If you allow letters and ideographs in names and also want to allow apostrophe ' for Irish names, but don’t want to allow the whole punctuation category.
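The character-category approach quoted above can be sketched in a few lines of PHP. This is only an illustration, assuming the intl extension for Unicode normalization; the function name, the field being validated, and the allowed categories are hypothetical choices, not an OWASP-published implementation.

<?php
// Hypothetical helper: validate a free-form "name" field using Unicode
// character categories rather than a list of forbidden characters.
function is_valid_name(string $input): bool
{
    // Normalize to a canonical Unicode form first (requires ext-intl).
    $normalized = Normalizer::normalize($input, Normalizer::FORM_C);
    if ($normalized === false) {
        return false; // invalid byte sequence / broken encoding
    }

    // Accept-list: letters in any script (\p{L}), combining marks (\p{M}),
    // spaces, hyphens, and the apostrophe for names like O'Brien.
    // Anything not matched here is rejected.
    return preg_match("/^[\p{L}\p{M}' \-]{1,100}$/u", $normalized) === 1;
}

var_dump(is_valid_name("Seán O'Brien"));      // true
var_dump(is_valid_name("<script>alert(1)"));  // false
?>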

But why are accept-lists specifically better than block-lists?

It is very hard to compile a complete list of everything that you do not want to allow in the input data.  However, somewhere in your application flow you must already be analysing the input to determine what data has been submitted and how to route it.  This means you must already have a complete list of which GET and POST parameters your application requires; otherwise the application would not function.  That fact makes it easy to compile a complete list of expected parameters, so anything unexpected can be filtered out.  One wrinkle is that links to your site may be altered by a third party.  For example, Facebook Ads Manager uses conversion tracking so you can measure the effectiveness of your ads.  To do this, Facebook adds a GET parameter and value pair to any ad URLs it posts in users’ feeds, and a Facebook JavaScript snippet added to the site <head> then tracks those additions to the URL.  So you must also take these third-party additions to your URL into account when accept-listing, as the sketch below suggests.
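As a rough illustration, an accept-list of expected GET keys can be held in a simple array. The parameter names below, including fbclid (the kind of click-tracking parameter Facebook appends), are hypothetical examples rather than a complete list for any real application.

<?php
// Illustrative accept-list of GET parameter names this application expects.
// Third-party additions you knowingly tolerate (e.g. Facebook's click ID)
// belong on the list too; anything else gets dropped.
$allowedGetParams = ['page', 'category', 'sort', 'fbclid'];

$unexpected = array_diff(array_keys($_GET), $allowedGetParams);

foreach ($unexpected as $key) {
    // Filter out anything that was never asked for (or reject the whole
    // request, as discussed below).
    unset($_GET[$key]);
}
?>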

But what harm can malformed GET and POST data actually do?

Some may say: who cares if you allow GET parameters that are not on the accept-list?  If the software doesn’t handle those elements of the array, they shouldn’t be dangerous. That is the wrong way to think about it.  The example below shows how variables that are not on the accept-list can be included in an HTTP GET request that attempts to invoke a function call in PHP.  If an accept-list is used to explicitly drop the request early in the process, the request is rendered ineffective.

Example:

Here is an article from 2019 about ThinkPHP being exploited in the wild (CVE-2018-20062).
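To make the risk concrete, here is a purely illustrative sketch of how such a request looks once PHP has parsed it. The parameter names echo the general shape of the exploit requests described in write-ups of that campaign, not the exact published payload.

<?php
// Hypothetical hostile query string: none of these keys are on the
// application's accept-list, yet PHP parses them into an array exactly
// as it does for legitimate parameters.
$hostileQuery = 'function=call_user_func_array&vars[0]=system&vars[1][]=id';

parse_str($hostileQuery, $params);
var_dump($params);
// $params now holds 'function' => 'call_user_func_array' plus a nested
// 'vars' array of arguments. Routing code that blindly maps such values
// onto a function call is one request away from remote code execution;
// an accept-list check would reject the request before that code ever ran.
?>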

If a GET parameter does not belong on your application’s accept-list, the application should drop the request and respond with an HTTP 500 server error, or redirect the request to the homepage. Use an explicit Location header so the client can see it has been redirected.  Validate at the very start of the request-handling process: the response will be snappy, and it will be clear that you are not taking any guff.  The attackers will go play somewhere else.
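A minimal sketch of that early guard, assuming a single front-controller entry point and reusing the illustrative accept-list from above; adjust the list and the chosen response to your own routes.

<?php
// Run this before any routing, templating, or database work.
$allowedGetParams = ['page', 'category', 'sort', 'fbclid'];

if (count(array_diff(array_keys($_GET), $allowedGetParams)) > 0) {
    // Option 1: refuse outright with a server error and stop processing.
    http_response_code(500);
    exit;

    // Option 2 (instead of the above): redirect to the homepage with an
    // explicit Location header so the client can see it has been bounced.
    // header('Location: /', true, 302);
    // exit;
}
?>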

Malformed GET and POST data is how SQL injection attacks happen.
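As the OWASP excerpt above notes, input validation reduces the impact of SQL injection but is not the primary defense; parameterized queries are. A minimal PDO sketch, with hypothetical credentials, table, and column names:

<?php
// The user-supplied value is bound as data, never concatenated into the
// SQL string, so a payload like "1 OR 1=1" cannot change the query's
// structure. Accept-list validation earlier in the flow is an additional
// layer, not a replacement for this.
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'app_user', 'secret');
$stmt = $pdo->prepare('SELECT id, title FROM articles WHERE category = :category');
$stmt->execute([':category' => $_GET['category'] ?? '']);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
?>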

So, it seems obvious that strict checking of GET and POST parameter and value pairs is the most secure way to validate input coming into your web server.

So why aren’t all web-applications developed like this?

One reason applications do not always validate request input early in the process flow is that many online code examples and tutorials show how to validate input at the last possible step, just before the data is inserted into the database.  While this does teach young developers that data needs to be validated, it does not follow best practices within the greater context of web-application development.  From there, if a dev-ops team lead does not enforce better security practices, this style of coding persists into enterprise application development.  Also, none of the popular PHP frameworks come with this security-minded approach to input validation built in.  That should change.

Keep in mind that properly validating input with strict checking and routing rejected requests to 500 error pages displays a more active security posture, and would-be attackers may give up trying to attack your site.  This could result in less malicious traffic, which means fewer log entries of malicious page requests to review, saving your threat detection team some effort.
