Signing HTML blocks
http://ajdiaz.me/doc/2010/04111-signing-html-blocks.txt
Version: 2010-04-11


A couple of years ago I was talking with my colleagues at the time about
security on some websites. We were not talking about SSL (which is, by the
way, more popular now), because SSL only works at the connection level. With
SSL you can guarantee that the communication is reliable (in terms of
authenticity) and that the endpoint server is actually who it claims to be.

But SSL hides a shameful secret, a design limitation which can eventually
lead to a big security problem. This neglected detail is so evident that
no one thinks very much about it: “SSL doesn’t guarantee you anything
about the content that you are viewing”.

We can build a thought experiment. Let’s suppose that a big e-commerce
web site which accepts payments from its customers wants to fire an
employee. That employee is a well-qualified programmer with access to the
site source code. Before being fired, he modifies the source code to add a
very small piece of code (buried in millions of lines of e-commerce code)
which changes just one little thing: the action of the payment HTML form
now sends credit card data to an anonymous web service running in some
weird country.

Now, let’s do another exercise in imagination. Suppose you are an
unsuspecting user who loves the products of our company. You buy a couple
of goods, and you probably pay with your credit card… Oops! Back up a
moment… Now your credit card data is stored in a probably not very safe
database on a server located in our Weird Country, ready to be sold to
anyone who can pay for that kind of information (and I can assure you that
they aren’t good people).

In this case SSL is green: it is the real server, over a trusted
connection. But SSL doesn’t help us to avoid the crime. That’s the reason
why we will eventually need content signing.

Thinking about this problem, I came up with a way to make this easier to
implement. The core of the idea is the attribute data-signature. This
attribute can be used on any HTML5 block, and it holds a signature of the
HTML representation of all children of the block which has the attribute.
So, for example, in the following code:

<div id="content" class="myclass_for_stylish"
data-signature="eWVzIG1hcnRoYSwgdGhpcyBpcyBub3QgYSByZWFsIHNpZ25hdHVyZQo=">
  <!-- This is a normal comment -->
  <p>Some paragraph here</p>
</div>

The signature is valid for the HTML <p>Some paragraph here</p>. We don’t
need to sign the comment (nothing important should be stored there). The
signature algorithm is irrelevant right now. We will come back to that
point a few paragraphs below.
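
As a minimal sketch of what "the HTML representation of all children, minus
comments" could mean in practice, the following Python walks a page and
returns the string that would be signed for each data-signature block. The
names SignedBlockExtractor and signed_payloads are hypothetical, and two
choices are assumptions not fixed by the idea itself: surrounding whitespace
is stripped, and the markup is assumed to be well formed.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SignedBlockExtractor(HTMLParser):
    """Collect (signature, payload) pairs for every data-signature block."""

    def __init__(self):
        # Keep entities such as &amp; exactly as written in the source.
        super().__init__(convert_charrefs=False)
        self.open_blocks = []  # stack of [depth, signature, parts]
        self.depth = 0
        self.results = []

    def _emit(self, text):
        # Every open signed block (outer and nested) receives the text.
        for block in self.open_blocks:
            block[2].append(text)

    def handle_starttag(self, tag, attrs):
        self._emit(self.get_starttag_text())
        if tag in VOID_TAGS:
            return
        self.depth += 1
        signature = dict(attrs).get("data-signature")
        if signature is not None:
            self.open_blocks.append([self.depth, signature, []])

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax like <br/>: part of the payload, no nesting.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_blocks and self.open_blocks[-1][0] == self.depth:
            # This end tag closes a signed block: its payload is complete.
            _, signature, parts = self.open_blocks.pop()
            # Assumption: leading/trailing whitespace is not signed.
            self.results.append((signature, "".join(parts).strip()))
        self._emit(f"</{tag}>")
        self.depth -= 1

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        pass  # comments are deliberately left out of the signed payload


def signed_payloads(html):
    """Return [(signature, payload), ...]; inner blocks come first."""
    parser = SignedBlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.results
```

Fed the div above, signed_payloads returns a single pair whose payload is
the string <p>Some paragraph here</p>. A real scheme would have to pin down
the canonical form much more strictly (attribute order, whitespace, entity
spelling), because signer and verifier must agree on it byte for byte.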

Of course, nested blocks can also be signed.

With this approach, we can be sure that the content of the div block is
genuine, because we assume that the developer has no access to the master
keys used to sign critical data. In our store example, the critical data is
just the form block, which needs to be hard coded, but this is usually a
fixed string in a template anyway.

Finally we need to talk a little bit about the signing algorithm. We can use
any public-key based algorithm, and the only problem is how to check that
the signature is right. Well, there are a lot of solutions to that problem.

One solution could be that the browser (or a browser extension ;))
validates the signature by looking up the public key associated with the
domain in a public CA (or a web of trust model).
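
The check such a browser or extension would run can be sketched as follows.
This is only an outline under stated assumptions: block_is_genuine,
lookup_key and verify are hypothetical names, the key lookup and the
public-key verification are passed in as callables rather than tied to any
particular CA or algorithm, and the attribute value is assumed to be
base64, as the sample signature suggests.

```python
import base64
from typing import Callable, Optional

# Hypothetical callable shapes, not taken from any real browser API.
KeyLookup = Callable[[str], Optional[bytes]]      # domain -> public key
Verifier = Callable[[bytes, bytes, bytes], bool]  # (key, payload, sig) -> ok


def block_is_genuine(domain: str, payload: str, signature_b64: str,
                     lookup_key: KeyLookup, verify: Verifier) -> bool:
    """Return True only if the block's signature checks out for the domain."""
    public_key = lookup_key(domain)
    if public_key is None:
        # No key published for this domain: treat the block as unsigned.
        return False
    try:
        # Assumption: data-signature carries a base64-encoded signature.
        signature = base64.b64decode(signature_b64, validate=True)
    except ValueError:
        return False  # malformed attribute value
    return verify(public_key, payload.encode("utf-8"), signature)
```

Here payload is the serialized children of the block and signature_b64 is
the value of its data-signature attribute. In a real deployment lookup_key
would query the CA or web of trust, and verify would wrap the verification
routine of whichever public-key algorithm the site uses.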

So, this is a simple way to validate HTML blocks and add more security to
web sites, but is this method convenient?