Our current Internet doesn't have much in terms of security when you look at how our computers communicate with each other. The most widespread technology for securing communication is called Transport Layer Security (TLS), previously known as Secure Sockets Layer (SSL). This technology is not as widely deployed as it should be, since it can be complicated to set up correctly. But even when set up correctly there are many issues with TLS that make it unsuitable for the long run. Another aspect of transport security is how to look up names in a safe way. DNS is the protocol used for name lookups, but DNS has a lot of issues as well.
At the end of the day we need a new protocol for transport security that really provides security, authentication and integrity for everyone. But in order to get there we first need to talk about the issues with TLS and DNS and make sure what comes next actually fixes these problems.
Problems with TLS
TLS stands for Transport Layer Security and it is the protocol that tries to ensure confidentiality and authenticity for many other protocols. The best known of these is HTTPS, which is used when connecting to secure web sites such as those for banking and e-commerce. TLS does several different things - it tries to make sure no one can read the traffic that is being sent between clients and servers. It also tries to ensure that you are talking to the server you think you are talking to. This is important since without it, it would be possible to do man-in-the-middle attacks against connections.
TLS is a complicated protocol that really consists of several protocols and choices of algorithms. When TLS was first designed, the consensus in the security community was that something called algorithm agility is a good idea. Algorithm agility is the idea that a protocol should be generic and contain many different types of cryptographic algorithms so that it's easy to switch to other algorithms. TLS also implements this in a way so that servers and clients can negotiate a combination of algorithms that both sides are comfortable speaking. However, the result of this is that TLS now contains a large number of algorithms and ciphers that vary widely in how secure they are. Also, many of the ciphers have been completely broken and have to be turned off. Thus, a typical TLS implementation contains huge swathes of code that are unused and in many cases actively dangerous. TLS has also often been subjected to so-called downgrade attacks - where an attacker forces the connection to use the weakest algorithm possible.
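To make the negotiation and downgrade problem concrete, here is a minimal sketch in Python. It is a toy model, not real TLS: the suite names, the strength scores and both helper functions are assumptions made up for illustration. The only point is that if the list of offered algorithms can be tampered with before the negotiation is authenticated, both sides end up at the weakest option they still share.

```python
# Toy model of algorithm negotiation. Suite names and strength scores are
# illustrative assumptions, not values taken from any TLS specification.
STRENGTH = {"AES256-GCM": 3, "AES128-CBC": 2, "RC4-MD5": 1, "EXPORT-DES40": 0}


def negotiate(client_offer, server_supported):
    """Pick the strongest suite that both sides support."""
    common = [suite for suite in client_offer if suite in server_supported]
    if not common:
        raise ValueError("no common cipher suite")
    return max(common, key=STRENGTH.__getitem__)


def strip_strong_suites(client_offer, floor=1):
    """Model an active attacker who rewrites the offer to keep only weak suites."""
    return [suite for suite in client_offer if STRENGTH[suite] <= floor]


client = ["AES256-GCM", "AES128-CBC", "RC4-MD5"]
server = {"AES256-GCM", "AES128-CBC", "RC4-MD5", "EXPORT-DES40"}

honest = negotiate(client, server)
downgraded = negotiate(strip_strong_suites(client), server)
print("without attacker:", honest)
print("with attacker:   ", downgraded)
```

As long as the weak suite is still compiled in and enabled on both ends, the attacker gets to choose it. Removing the weak entries from the table is the only way to make the second call fail instead of succeed.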
Finally, there are other choices that were made that we now know are bad ideas in cryptographic protocols. The two largest problems are likely compression and MAC-then-Encrypt. If TLS were designed today, both of those things would work differently. But since these things are integral to the design of the protocol, they can't be changed without making significant changes to all TLS implementations - and TLS also has to retain backwards-compatibility for a long time, which means that none of these issues can be completely resolved.
X.509 and Certificate Authorities
The way the TLS implementation [1] verifies the identity of the server you are talking to is by using a standard called X.509. This standard is based on public key cryptography and certificates that are issued by entities called certificate authorities (CAs). Basically, if you want to have a website that uses HTTPS you go to a CA, pay them some money and prove that you own the domain you want a certificate for. Then the CA will digitally sign your certificate, which endorses that your certificate is really the right certificate for the web site. Then, when a TLS connection arrives, the server will send the certificate with the signature to the client. The client will look at the signature and verify it, and then it will make sure that the signature was made by a trusted party. But how does a client find out whether a signature is by a trusted party? Well, several hundred trusted CA certificates are shipped with your browser and operating system, and these are all automatically trusted. They range from companies like Verisign to the Chinese government.
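You can get a feeling for how much is trusted by default by asking Python's standard ssl module what it loaded. This is only a small sketch: the numbers depend entirely on your operating system, and on some platforms the certificates are read lazily from a directory, so the count can show up as zero even though many CAs are trusted.

```python
import ssl

# Default client-side context: loads the platform's trusted CA certificates.
ctx = ssl.create_default_context()

# cert_store_stats() reports how many certificates are currently loaded.
stats = ctx.cert_store_stats()
print("trusted CA certificates loaded:", stats["x509_ca"])

# The default context requires a valid chain and a matching host name.
print("chain verification required:", ctx.verify_mode == ssl.CERT_REQUIRED)
print("host name checked:", ctx.check_hostname)
```

Every entry counted here is an organization that your machine will believe about any name, without ever asking you.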
There are several issues with this system - the first one is that you have to trust the whole hierarchy, so if someone can take over any entity at the top of the hierarchy, they can impersonate anyone using TLS. The second problem is that there are no restrictions on which sites or servers can be endorsed or certified by which CAs - any CA on the planet can issue a certificate for thoughtworks.com or olabini.se or nsa.gov. That becomes a problem if different CAs have different goals or needs.
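The lack of restrictions is easy to see in a toy model of the client's trust decision. The sketch below is an assumption-laden simplification: the CA names are hypothetical, and signature, expiry and revocation checks are left out entirely so that only the "who is allowed to vouch for whom" logic remains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # the name the certificate claims to speak for
    issuer: str   # the CA that signed it


# Hypothetical trust store; real ones hold several hundred entries.
TRUSTED_CAS = {"Honest CA", "Other CA", "Compromised CA"}


def client_accepts(cert, hostname):
    """Simplified check: name matches and the issuer is any trusted CA.

    Real clients also verify the signature, validity period and revocation
    status. Those steps are omitted here on purpose.
    """
    return cert.subject == hostname and cert.issuer in TRUSTED_CAS


legit = Cert("thoughtworks.com", "Honest CA")
rogue = Cert("thoughtworks.com", "Compromised CA")
unknown = Cert("thoughtworks.com", "Unknown CA")

for cert in (legit, rogue, unknown):
    print(cert.issuer, "->", client_accepts(cert, "thoughtworks.com"))
```

Nothing in the check ties thoughtworks.com to the CA that its owner actually chose, so the certificate from the compromised CA is accepted just like the legitimate one.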
Being a CA is also big business. Getting a certificate costs money, which means that most people don't bother. However, that makes the security of the whole Internet worse - thus, getting certificates has to be so cheap that anyone can do it.
There are other issues with this system as well. For a service provider it is quite complicated to set up. It is also a big hassle to replace and renew certificates when they expire or are compromised. After the Heartbleed vulnerability, it took a long time for services to replace their certificates, and many still haven't done it. The reason is a combination of how hard it is and how costly it can be. If you have to balance effort and money against security, security will often lose. This is bad for everyone.
Ideally, you would have a system where you can switch out certificates every few days - it should be semi-automatic and so easy to do that everyone will do it.
X.509 is defined using a notation called ASN.1. This is a data description language that is well known for being large, complicated and tricky to get right. Many TLS implementations have had serious bugs in their ASN.1 parsers over the years.
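To give a taste of why this is tricky, here is a minimal sketch of reading a single element in DER, the binary encoding of ASN.1 that certificates use. It is deliberately incomplete (single-byte tags only, no nested structures) and the function name is my own. Even so, most of the code is bounds and sanity checks, and forgetting any one of them is the kind of mistake that turns into a security bug.

```python
def read_tlv(data, offset=0):
    """Read one DER tag-length-value element.

    Returns (tag, value, next_offset). Only single-byte tags are handled.
    """
    if offset + 2 > len(data):
        raise ValueError("truncated header")
    tag = data[offset]
    first = data[offset + 1]
    pos = offset + 2

    if first < 0x80:
        # Short form: the byte itself is the length.
        length = first
    elif first == 0x80:
        raise ValueError("indefinite length is not allowed in DER")
    else:
        # Long form: the low 7 bits say how many length bytes follow.
        n = first & 0x7F
        length_bytes = data[pos:pos + n]
        if len(length_bytes) != n:
            raise ValueError("truncated length field")
        length = int.from_bytes(length_bytes, "big")
        # DER demands the shortest possible length encoding.
        if length < 0x80 or length_bytes[0] == 0:
            raise ValueError("non-minimal length encoding")
        pos += n

    end = pos + length
    if end > len(data):
        raise ValueError("declared length exceeds available data")
    return tag, data[pos:end], end


# An INTEGER (tag 0x02) of length 1 with value 5.
print(read_tlv(bytes([0x02, 0x01, 0x05])))
```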
Finally, the CA system is based around the idea of centralization: centralizing trust and centralizing functionality. These things are fundamentally the opposite of what we need for a safe and secure Internet. Sadly, centralization has a tendency to foster monopolies as well, which makes the incumbents hard to replace. One of the reasons we are still in our current bad situation is that there is so much money invested in keeping the status quo.
Because of the situation outlined above, TLS implementations have grown extremely complicated. Most of them contain several code paths that do the same work - or at times that should do the same work but don't. Since there are so many features and extensions to support, the code quality of these implementations is not fantastic. 2014 was a year where we saw extremely serious bugs and attacks against every single major TLS implementation out there. Everything from Heartbleed to "goto fail" happened. And many pundits talked about the reasons for all these issues. But I think the answer is easy - TLS is too big and complicated. Software engineers have known for a long time that bugs grow with the size and complexity of a code base. Security engineers have known for a long time that the only way of having a reasonable chance of making something secure is to make it small, minimal and easily understood. TLS and TLS implementations are none of those things.
Problems with DNS
When connecting to another machine, the first thing your computer usually does is make a DNS request - the purpose of which is to translate a human-readable name like thoughtworks.com to an IP address (like 220.127.116.11) that can be used to actually route information on the Internet. However, there are several issues with how this protocol works.
The protocol will leak information about what sites you are visiting - so someone who is listening to traffic between you and your DNS server can put together an equivalent of your browser history very easily. In some cases this is not a problem, but in other cases it can be a significant issue - especially if you are in a country that practices censorship or prosecution for accessing material that should be public and available.
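How readable is a DNS request? The sketch below builds a standard query for an A record by hand, following the classic DNS message layout (a 12-byte header followed by the question). The function name and the query ID are arbitrary choices of mine. Nothing is sent over the network; the point is simply to look at the bytes that would go on the wire.

```python
import struct


def build_dns_query(name, query_id=0x1234):
    """Build the bytes of a plain DNS query for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)

    # The name is sent as length-prefixed labels, terminated by a zero byte.
    qname = b""
    for label in name.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"

    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


packet = build_dns_query("thoughtworks.com")
print(packet)
```

The name you are looking up sits in the packet as plain ASCII, so anyone on the path between you and the resolver can read it without any effort at all.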
The standard DNS protocol also doesn't contain any provisions to prove that the information is correct or up-to-date. In fact, a DNS server can return whatever it feels like and your computer will treat it as truth. This is one of the major ways Internet censorship works. It can also be used to inject malware. Simply make a DNS request return an IP address you control instead of the real one - and then you can proxy the requests to the real IP address while changing or adding information in between. This is one of the main reasons that we need something like TLS. Fundamentally the TLS certificate authority system exists so that certificates can make pronouncements about the names that DNS deals with, and the reason we need it this way is that DNS is so easily manipulated and hijacked.
The DNS protocol is completely hierarchical. This means that thoughtworks.com is owned by Thoughtworks, and we can change anything under this domain - for example studios.thoughtworks.com. However, that also swings the other way - the .com domain is owned by Verisign, and in theory they can change the information for thoughtworks.com however they want. All top level domains (like .com, .org, .se) are actually organized under one single top level domain which is the empty string. The National Telecommunications and Information Administration (NTIA) owns and operates this, and in theory could modify the information for any DNS name in the world.
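The chain of control can be listed mechanically. This small sketch (the function name is mine) walks a name from the root down and prints every level that sits above it. It is a simplification, since in practice not every label is delegated to a separate organization, but every level that is delegated can override everything beneath it.

```python
def controlling_zones(name):
    """List the levels of the DNS hierarchy above and including `name`.

    The root is shown as "." (the empty label). Simplification: every label
    is treated as a possible point of control.
    """
    labels = name.rstrip(".").split(".")
    zones = ["."]
    for i in range(len(labels) - 1, -1, -1):
        zones.append(".".join(labels[i:]))
    return zones


for zone in controlling_zones("studios.thoughtworks.com"):
    print(zone)
```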
This hierarchical vulnerability is not just theoretical. In fact, American law enforcement agencies have taken down several hundred domains without due legal process using this technique.
So let's recap - DNS is readable by anyone and easy to modify for anyone and it is also hierarchical, reinforcing the existing Internet power structures.
DANE and DNSSec
Someone might read this and ask the question - doesn't Domain Name System Security Extensions (DNSSec) solve all this? And when it comes to the above TLS problems, isn't that something DNS-Based Authentication of Named Entities (DANE) can solve? DANE is a protocol that allows you to put cryptographic key information into the DNS system - so instead of trusting a CA, you can just go and look up the key for a server in the DNS system. This is great, and could replace the CA system completely. However, that just means you move the trust somewhere else. Specifically, DANE requires you to use DNSSec.
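The core check DANE asks a client to do is small. DANE publishes its key information in TLSA records, and the sketch below compares a presented certificate against the data from such a record. It covers only the case where the record refers to the full certificate, and it uses the standard matching types (0 for an exact match, 1 for SHA-256, 2 for SHA-512). The function name and the placeholder certificate bytes are assumptions for illustration, and fetching and validating the record itself is left out, which is exactly the part that depends on DNSSec.

```python
import hashlib
import hmac


def tlsa_matches(cert_der, tlsa_data, matching_type=1):
    """Compare a certificate (DER bytes) with the data from a TLSA record."""
    if matching_type == 0:
        candidate = cert_der                           # exact match
    elif matching_type == 1:
        candidate = hashlib.sha256(cert_der).digest()  # SHA-256 digest
    elif matching_type == 2:
        candidate = hashlib.sha512(cert_der).digest()  # SHA-512 digest
    else:
        raise ValueError("unknown TLSA matching type")
    return hmac.compare_digest(candidate, tlsa_data)


# Placeholder bytes standing in for a real DER-encoded certificate.
presented = b"placeholder bytes standing in for a DER certificate"
published = hashlib.sha256(presented).digest()  # what the domain owner would publish

print(tlsa_matches(presented, published))
print(tlsa_matches(b"a different certificate", published))
```

No CA appears anywhere in this check. The whole question of trust has moved to whether the published record can be believed.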
So is DNSSec a good idea? DNSSec is a protocol that allows you to sign your DNS records. In theory this solves some of the problems enumerated above - specifically that it is easy to forge DNS records. It doesn't solve the problem of information leakage. And more crucially, DNSSec is hierarchical in the same way as the DNS system itself - thus it actually strengthens the problem with hierarchy. There are also some other issues with DNSSec that are stopping adoption. One of them is that DNSSec actually leaks more information than DNS about what DNS names a server has. Another is that the protocol is complex and not very many providers support it yet.
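One way the extra leakage happens is through the NSEC records that DNSSec uses to prove that a name does not exist: each record names the next existing name in the zone, so following them lists the whole zone. The sketch below is a toy model of that walk. The zone contents are invented, plain string sorting stands in for the real canonical ordering, and the later NSEC3 variant, which hashes the names to make this harder, is not modeled.

```python
def build_nsec_chain(names):
    """Map each name to the next name in sorted order, wrapping at the end."""
    ordered = sorted(names)
    return {
        name: ordered[(i + 1) % len(ordered)]
        for i, name in enumerate(ordered)
    }


def walk_zone(nsec, start):
    """Follow the 'next name' pointers until we are back at the start."""
    seen = [start]
    current = nsec[start]
    while current != start:
        seen.append(current)
        current = nsec[current]
    return seen


# Invented zone contents, including a name the owner never advertised.
zone = {
    "example.org",
    "mail.example.org",
    "secret-project.example.org",
    "www.example.org",
}
chain = build_nsec_chain(zone)
print(walk_zone(chain, "example.org"))
```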
Something like DANE would be great, but the dependency on DNSSec makes it a non-starter. And DANE with only basic DNS is also a non-starter, since there is no security in a system like that.
We need a cryptographically secure, decentralized naming system that can provide the basis for routing and secure transport of most traffic on the Internet. The current solutions (primarily DNS and TLS) are hierarchical and centralized. They are complicated. And since TLS asserts things about DNS, there is a fundamental divide between the data from the two systems. The TLS identity assurance is fragile and hierarchical, and the current system also makes changing and renewing certificates a very hard task.
So it is time to work on new protocols that solve these problems. In my mind, it is crucial to not separate TLS from DNS - they are intrinsically linked and any protocol trying to solve this problem should solve the problems with both protocols in one stroke.
In the next article I will talk about some of the existing solutions to these problems and what their benefits and drawbacks are.
[1] Compression and Integrity Checking
As mentioned above in this article, there are several severe problems with the complete TLS protocol. In my mind the two main ones are compression and how integrity checking happens. The problem with compression is that if the plain text of something being transferred is compressed before encryption, it is possible to find out information about the plain text by looking at the size of the encrypted content. The only requirement is that it is possible for you to put in text that will be returned in a response, which can easily be done using cookies for example. This is sometimes called a compression oracle, and was the main break in the CRIME attack in 2012 and in the BREACH attack in 2013. The only real solution for now is to turn off compression - so this is another part of TLS that exists but can’t safely be used anymore.
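The size leak is easy to reproduce with nothing but zlib. In this sketch the secret, the message layout and the function name are all assumptions. Encryption is left out completely, on the assumption that the cipher does not hide the length of what it encrypts, so the compressed length is what an eavesdropper gets to see.

```python
import zlib

# Hypothetical secret that travels in the same message as attacker-chosen text.
SECRET = "sessionid=7f3a9c2e1b4d"


def observed_length(attacker_text):
    """Length of the compressed message, which is what leaks to an observer."""
    plaintext = (attacker_text + "\n" + "Cookie: " + SECRET).encode("ascii")
    return len(zlib.compress(plaintext))


right = observed_length("Cookie: sessionid=7f3a9c2e1b4d")
wrong = observed_length("Cookie: sessionid=QWXZKJVMPLYT")

print("correct guess:", right)
print("wrong guess:  ", wrong)
print("correct guess compresses better:", right < wrong)
```

A correct guess repeats the secret, the compressor replaces the repeat with a short back-reference, and the message shrinks. Real attacks refine the guess piece by piece and need more care, because the difference for a single character can be very small, but the principle is the same.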