Security, templates and XSS prevention


Ensuring that you don't fall victim to cross-site scripting and other injection attacks takes vigilance, an eye for detail, and above all some good naming habits.

Why

A small mistake can open the floodgates to attacks on your website. A common consequence of such mistakes is a cross-site scripting (XSS) vulnerability.

Assumptions

Readers are assumed to be comfortable writing or editing templates.

How

If we have the following request:

https://example.com/page/1234?foo=bar

For the dispatch rule:

{page, [ "page", id ], controller_page, []}

Then in the template we can access the query arguments:

Hello {{ q.foo|escape }} your id is {{ q.id|escape }}.
Controller page said the resource id is {{ id }}.

You see that q.foo and q.id are both escaped before they are shown, but id is not.

This is because q.foo and q.id both contain unfiltered content supplied by the user. This content could even be a complete script tag that transfers the user's information to some other server or modifies content on our site. To prevent this we apply the escape filter to the content so that input like:

https://example.com/page/1234?foo=%3Cscript%3Ealert%28%27a%27%29%3B%3C%2Fscript%3E

will not add the following script tag to the HTML:

<script>alert('a');</script>

Instead, the escape filter transforms it into the following safe HTML text:

&lt;script&gt;alert(&#39;a&#39;);&lt;/script&gt;
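
To make the transformation concrete, here is a minimal sketch in Python of what an escape step does to the request above. It is not Zotonic's implementation: the function name escape_html and its replacement table are assumptions chosen to reproduce the output shown here, and the real escape filter may handle more cases.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical helper, for illustration only; not Zotonic's escape filter.
# The ampersand must be replaced first, otherwise the entities produced
# by the later replacements would be escaped a second time.
_REPLACEMENTS = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
    ("'", "&#39;"),
]


def escape_html(text: str) -> str:
    """Replace the characters that are special in HTML with entities."""
    for char, entity in _REPLACEMENTS:
        text = text.replace(char, entity)
    return text


url = ("https://example.com/page/1234"
       "?foo=%3Cscript%3Ealert%28%27a%27%29%3B%3C%2Fscript%3E")

# parse_qs percent-decodes the values, so this is the raw user input.
raw_foo = parse_qs(urlsplit(url).query)["foo"][0]
safe_foo = escape_html(raw_foo)

print(raw_foo)   # <script>alert('a');</script>
print(safe_foo)  # &lt;script&gt;alert(&#39;a&#39;);&lt;/script&gt;
```

The browser renders the escaped string as visible text instead of executing it, which is the whole point of escaping at output time.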

But why was id not escaped? The id provided by controller_page is the result of mapping q.id to a resource id using the function m_rsc:rid/2. The output of this function is either a number or undefined, so it needs no further escaping when used.
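
Why such a mapping is safe can be sketched in a few lines of Python. This is a simplified model, not the real m_rsc:rid/2: the function to_resource_id and the KNOWN_NAMES table are hypothetical, and None stands in for Erlang's undefined. The only point is that whatever goes in, the result is an integer or nothing.

```python
from typing import Optional

# Hypothetical table of unique resource names, for illustration only.
KNOWN_NAMES = {"page_home": 1}


def to_resource_id(value: object) -> Optional[int]:
    """Map arbitrary input to an integer id, or None if it is not one."""
    if isinstance(value, bool):
        return None  # bool is a subclass of int in Python; reject it
    if isinstance(value, int):
        return value
    if isinstance(value, str):
        text = value.strip()
        if text.isascii() and text.isdigit():
            return int(text)
        return KNOWN_NAMES.get(text)
    # Lists, dicts and anything else are never a resource id.
    return None


print(to_resource_id("1234"))                         # 1234
print(to_resource_id("<script>alert('a')</script>"))  # None
print(to_resource_id({"title": "<script>"}))          # None
```

Because the return type cannot carry markup, the value can be echoed without an escape filter.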

Special care

There are more functions that can give unfiltered user input back to the template. Examples are m.req, m.identity (for example, email addresses, which can contain HTML-unsafe characters), and Exif or other metadata in the medium records.

A good habit is to prefix the names of all variables that hold untrusted content with a q, like this:

{% with id|default:q.id as qid %} {# q.id is unsafe #}
{% with m.rsc[id|default:q.id].id as id %} {# The m.rsc lookup ensures a safe mapping #}

And NEVER something like this:

{% with id|default:q.id as id %} {# DO NOT DO THIS #}

Why not? In templates we have the shorthand:

{{ id.title }}

And templates might be rendered using API calls, which can pass structured data for the query arguments like the following JSON:

{ "id": { "title": "<script>...</script>" } }

If id were bound directly to q.id, the shorthand above would echo the script from this structured value. This is clearly not something you want in your templates.
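
The risk with structured query arguments can be modeled in Python. This is a simplified stand-in for a template engine's property lookup, not how Zotonic resolves values: the function lookup_property and the RESOURCES table are hypothetical. It shows why a variable bound straight to a query argument must not be treated as a resource id.

```python
import json

# Hypothetical sanitized resource store, for illustration only.
RESOURCES = {1234: {"title": "A safe title"}}


def lookup_property(value, prop):
    """Naive lookup: treat ints as resource ids, dicts as plain data."""
    if isinstance(value, dict):
        return value.get(prop)  # attacker-controlled data is echoed as-is
    if isinstance(value, int) and not isinstance(value, bool):
        return RESOURCES.get(value, {}).get(prop)
    return None


# Query arguments as they might arrive from an API call.
q = json.loads('{ "id": { "title": "<script>...</script>" } }')

unsafe_id = q["id"]  # bound directly to user input: DO NOT DO THIS
print(lookup_property(unsafe_id, "title"))  # <script>...</script>
print(lookup_property(1234, "title"))       # A safe title
```

Mapping the query argument to a real resource id first closes this hole, because a dictionary never maps to an id.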

Why is m.rsc safe?

As a rule of thumb, all data inside m.rsc is safe to echo in the templates. This is because all data stored in resource records is sanitized. The sanitization removes or escapes all dangerous content. For example, malicious content such as scripts, and iframes without a white-listed domain, is removed from HTML body texts. All HTML attributes that might contain scripts are removed or cleaned, and CSS is also cleaned up to remove external content references.

An exception is any property whose name ends in _unsafe, such as myprop_unsafe. This content passes through the sanitizer as-is, without any modification.