A week ago, at work, I needed to work with some self-signed SSL certificates, and that’s when I realized that a lot of us didn’t have a good understanding of what certificates are and what they are used for. So I decided to take some time out to write about this topic. But before we jump into complex topics like what SSL certificates are and what exactly they are used for, it’d be great to start with some prerequisite knowledge. Let’s start with understanding encryption.
Encryption is a technique for encoding data (known as the plaintext message) in such a way that only the intended recipient can read it. How is that possible, you ask? The answer is a ‘key’. Consider the lock you put on your door when you leave your house. The lock keeps trespassers out. Only the holder of the key to that lock (you, in this case) can get into your house. Similarly, an encryption scheme has a key, and only the holder of that key can decode the message back to its original form. Without the key, the encoded message is as useless to an outsider as a locked house.
When dealing with computers, we come across two kinds of encryption: ‘symmetric encryption’ and ‘asymmetric encryption’. Let’s try to understand the difference between the two, starting with symmetric encryption.
Consider the following message:
I love cats.
And suppose I have an encryption scheme which changes every letter of the message to its next letter. So the ‘I’ becomes ‘J’, the ‘l’ becomes ‘m’, the ‘o’ becomes ‘p’ and so on. After encryption, the message looks like this:
J mpwf dbut.
Now, you’d be able to decrypt the message only if you knew that the encryption changed every letter to its next letter. If you look carefully, all the encryption has done is “add” 1 to every letter of the message, and “subtracting” 1 from every letter of the encrypted message changes it back to the original plaintext. Here, 1 is the key used for encryption. Had we used 2 as the key, the message would have looked like:
K nqxg ecvu.
If you do not have the key, you cannot decrypt the message. Because the same key is used both to encrypt and to decrypt, this is a symmetric encryption scheme. The scheme in this example is easy to figure out; the ones used in the Secure Sockets Layer (SSL) protocol are far more complex, but they work in a similar way.
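The letter-shifting scheme above can be sketched in a few lines of Python. This is only a minimal illustration of the example in this post; the function names `shift_encrypt` and `shift_decrypt` are made up here, and a scheme like this offers no real security.

```python
import string

def shift_encrypt(message: str, key: int) -> str:
    """Shift every ASCII letter forward by `key` positions, wrapping around after Z/z."""
    out = []
    for ch in message:
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        else:
            out.append(ch)  # spaces and punctuation are left untouched
    return "".join(out)

def shift_decrypt(ciphertext: str, key: int) -> str:
    """Decrypting is just shifting backwards by the same key."""
    return shift_encrypt(ciphertext, -key)

print(shift_encrypt("I love cats.", 1))  # J mpwf dbut.
print(shift_encrypt("I love cats.", 2))  # K nqxg ecvu.
print(shift_decrypt("J mpwf dbut.", 1))  # I love cats.
```

Notice that the same number is passed to both functions. That shared key is what makes the scheme symmetric.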
Now let’s see what asymmetric encryption is.
This kind of encryption requires two keys to work. One of the keys is used for encryption and the other, its counterpart, is used for decryption. These keys come in pairs and are known as the “Public” and the “Private” keys. If you had a message to send to me, you’d ask for my public key. Once you have it, you can encrypt the message with my public key and send the encrypted message to me. On my end, I can decrypt the message using my private key.
Since everyone who wants to communicate with me needs my public key, I have to distribute it to the “public”, hence the name “Public Key”. Its counterpart must be kept to myself, hence the name “Private Key”.
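To make the two-key idea concrete, here is a toy sketch using textbook RSA with tiny primes. The numbers and the helper names `encrypt` and `decrypt` are chosen purely for illustration; real keys are thousands of bits long and real systems add padding, so please don’t treat this as usable cryptography.

```python
# Toy RSA key pair built from tiny primes (illustration only).
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)

public_key = (e, n)        # handed out to anyone who asks
private_key = (d, n)       # kept secret by the owner

def encrypt(m: int, key: tuple) -> int:
    """Encrypt a number m (0 <= m < n) with the recipient's public key."""
    exponent, modulus = key
    return pow(m, exponent, modulus)

def decrypt(c: int, key: tuple) -> int:
    """Decrypt with the matching private key."""
    exponent, modulus = key
    return pow(c, exponent, modulus)

message = 65                           # a message encoded as a small number
ciphertext = encrypt(message, public_key)
print(ciphertext != message)           # the ciphertext no longer looks like the message
print(decrypt(ciphertext, private_key) == message)
```

Anyone holding `public_key` can produce the ciphertext, but only the holder of `private_key` can turn it back into the message.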
You can click the links Symmetric Encryption and Asymmetric Encryption to learn more about them. I’ve kept the discussion just enough to understand how SSL certificates work. Now let’s understand the topic of “Hashing algorithm”.
Hashing functions are mathematical operations that let us represent data of any length by a value of fixed length. For example, the message “I love cats” can be converted to a 256-bit value using a hashing function called SHA256. In case you’re wondering what SHA means, it stands for ‘Secure Hash Algorithm’, and the 256 is the size of the output in bits. Similarly, SHA512 would convert the provided data into a value of 512 bits. The point to note here is that the length of a hash is independent of the length of the given data. So the hash of a string as small as your name has the same length as the hash of a 20-terabyte file, provided that the same hashing algorithm is used in both cases. Let’s see if it’s really true.
This is what a hash of the string “7_R3X” looks like. Now let’s look at a hash of a file of size 35 MBs.
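If you’d like to check this yourself, here is a small sketch using Python’s standard `hashlib` module. The 35 MB “file” is simulated with zero bytes fed in 1 MB chunks, which is an assumption made so the example runs without any file on disk.

```python
import hashlib

# Hash of a very short string.
short_digest = hashlib.sha256("7_R3X".encode("utf-8")).hexdigest()

# Hash of roughly 35 MB of data, fed to the hasher one chunk at a time.
chunk = b"\x00" * (1024 * 1024)   # 1 MB of zero bytes (stand-in for file contents)
hasher = hashlib.sha256()
for _ in range(35):
    hasher.update(chunk)
big_digest = hasher.hexdigest()

# Both digests are 64 hexadecimal characters, i.e. 256 bits.
print(len(short_digest), len(big_digest))
```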
Another point to note is that a good hashing algorithm produces, for all practical purposes, a different hash for every different input, and even a tiny change in the input changes the hash completely. Let’s try it for ourselves.
Changing the small ‘c’ in the word ‘cats’ to a capital ‘C’ results in a completely different hash.
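The same experiment can be reproduced with `hashlib`. This is a minimal sketch; the helper name `sha256_hex` is just a convenience defined here.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA256 hash of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

lower = sha256_hex("I love cats")
upper = sha256_hex("I love Cats")

print(lower)
print(upper)

# Count how many of the 64 hex characters differ between the two hashes.
differing = sum(a != b for a, b in zip(lower, upper))
print(differing, "of", len(lower), "characters differ")
```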
I believe this should be enough for us to get started.
What is a certificate?
Let’s take a real-life scenario and try to understand the importance of certificates and their uses. Here’s the scenario:
Imagine you are at home in a warm and cozy bed, enjoying the winter with a cup of hot coffee and trying to connect to this website. As we all know, the internet is not as simple as it seems. You never connect to this website directly. Instead, your traffic hops through many nodes on the internet before reaching this server.
The problem here is simple. It’s practically impossible for you to see which server you’re connected to. So how would you know whether the server you’re talking to is really Crackerscreed.org’s server or someone disguised as Crackerscreed.org?
It’s possible that one of the intermediary nodes is a hacker trying to sniff your traffic. This is where SSL certificates come to the rescue. The SSL certificate acts as a container for the public key of this server. Besides the public key, the certificate also contains the date from which it is valid, the date up to which it is valid, the domain name for which it is valid (crackerscreed.org in this case), information about the certification authority that issued it (we’ll discuss certification authorities later in this post), and so on.
Upon receiving the certificate, your client (the web browser) extracts the public key from it and uses it to encrypt the data. Well, that sounds like a solution, doesn’t it? But wait! There’s still a problem! What if the bad guy sitting between you and this server sends you his own certificate with his own public key? In that case, you’d end up encrypting the traffic with his key, and he’d be able to decrypt the data using his private key.
So, what do we do now? Is there a way for you to make sure that the certificate you receive actually belongs to whoever it claims to belong to? Well, the answer is yes, but not directly. It is sometimes called ‘the concept of mutual trust’. Let’s try to figure out what it means.
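One way to picture the certificate’s contents is as a small record with the fields listed above. The sketch below is a simplified model, not the real X.509 format; the class name, field names, dates, issuer name and key value are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Certificate:
    subject: str           # domain name the certificate is valid for
    issuer: str            # certification authority that issued it
    not_before: datetime   # date from which the certificate is valid
    not_after: datetime    # date up to which the certificate is valid
    public_key: tuple      # the server's public key (toy placeholder value)

def is_usable_for(cert: Certificate, hostname: str, now: datetime) -> bool:
    """Check the domain name and the validity window (signature checks not shown)."""
    return cert.subject == hostname and cert.not_before <= now <= cert.not_after

# Hypothetical values, chosen only so the example runs.
cert = Certificate(
    subject="crackerscreed.org",
    issuer="Example CA",
    not_before=datetime(2020, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2021, 1, 1, tzinfo=timezone.utc),
    public_key=(17, 3233),
)

print(is_usable_for(cert, "crackerscreed.org", datetime(2020, 6, 1, tzinfo=timezone.utc)))
print(is_usable_for(cert, "crackerscreed.org", datetime(2022, 6, 1, tzinfo=timezone.utc)))
```

The first check falls inside the validity window and the second one falls after the expiry date.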
Imagine you want to meet a person by the name Peter but you’ve never seen or met Peter before. Now someone comes to you and claims that he is Peter. What would you do? Is there a way for you to figure if the person is really Peter or a phony? Does the situation sound familiar? Yes, it is exactly the problem we just faced when trusting the certificate of crackerscreed.org!
Now imagine that a close friend of yours, say Tiffany, comes to introduce you to Peter. Since you trust Tiffany, there’s a good chance that you’d trust Peter too. That’s exactly the solution to our certificate problem. We need to find that ‘someone’ we can trust who will vouch for crackerscreed.org’s certificate, and that ‘someone’ happens to be a Certification Authority. A certification authority is an organization that digitally ‘signs’ the certificate of a web server (like crackerscreed.org’s certificate) only after verifying that the certificate really belongs to the claimed domain. The CA’s private key is used for these signatures. These digital signatures are a computer’s way of showing that it trusts an entity! The methods used for this verification could be physical (such as meeting the owner of a domain in real life) or logical (such as DNS TEXT challenges). The verification is designed so that only someone who actually controls the domain can get through it. Once a certificate is signed, the signature can be verified using the CA’s public key.
But how do you trust the certification authority’s key? Nice question! The web browser that you’re using comes with a set of certification authorities that it trusts. Mozilla for Firefox, Google for Chrome and Microsoft for Internet Explorer (now Edge) decide which certification authorities these browsers trust. If need be, you can also add your own certificates to the browser’s list of trusted certificates.
In fact, this chain of trust is not limited to just 2 entities. It could go even longer. In the case of our friends, let’s say you trust Tiffany, who asked you to trust Jack, who asked you to trust Rosy, who in turn would introduce you to Peter. Similarly, the chain of trust in SSL certificates can also be extended.
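Here is a rough sketch of what “signing with the CA’s private key” and “verifying with the CA’s public key” mean, using a toy RSA key pair and SHA256. The key size, the certificate contents and the function names are assumptions for illustration; real CAs use much larger keys and standardized padding schemes.

```python
import hashlib

# Toy CA key pair (tiny primes, illustration only).
p, q = 61, 53
n = p * q
e = 17                               # CA's public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # CA's private exponent (Python 3.8+)

def digest_as_int(data: bytes) -> int:
    """Hash the data and reduce it modulo n so that it fits the toy key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def ca_sign(cert_bytes: bytes) -> int:
    """Only the CA can do this: it requires the private exponent d."""
    return pow(digest_as_int(cert_bytes), d, n)

def verify(cert_bytes: bytes, signature: int) -> bool:
    """Anyone can do this: it only needs the CA's public values e and n."""
    return pow(signature, e, n) == digest_as_int(cert_bytes)

# Hypothetical certificate contents.
cert_bytes = b"subject=crackerscreed.org;public_key=(toy value)"
signature = ca_sign(cert_bytes)

print(verify(cert_bytes, signature))             # genuine signature is accepted
print(verify(cert_bytes, (signature + 1) % n))   # a forged signature is rejected
```

The browser never needs the CA’s private key. It only needs the CA’s public key, which it already has in its list of trusted authorities.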
Now, take a look at the pic below:
Using Firefox, I visited Duckduckgo.com, clicked the little green lock on the left side of the address bar, clicked the “Show connection Detail” button (the little arrow symbol), and clicked “More Information”. This opens a little security window with Duckduckgo.com’s information. Click the “View Certificate” button and go to the ‘Detail’ tab. Under ‘Certificate Hierarchy’, you should see this chain of trust. Here, Duckduckgo’s certificate is signed by ‘DigiCert SHA-2 Secure Server CA’, whose certificate is in turn signed by ‘DigiCert Global Root CA’. The one at the top, i.e. ‘DigiCert Global Root CA’, is the one that Firefox trusts.
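The walk up the certificate hierarchy can be sketched as a simple loop. This is a simplified model that only follows issuer names taken from the example above; the signature check at each step is left out, and the function name `chain_to_trusted_root` is hypothetical.

```python
# Who signed whose certificate, as seen in the certificate hierarchy above.
issuer_of = {
    "duckduckgo.com": "DigiCert SHA-2 Secure Server CA",
    "DigiCert SHA-2 Secure Server CA": "DigiCert Global Root CA",
    "DigiCert Global Root CA": "DigiCert Global Root CA",  # a root signs itself
}

# The set of authorities the browser ships with (just one here).
trusted_roots = {"DigiCert Global Root CA"}

def chain_to_trusted_root(subject, issuer_of, trusted_roots, max_depth=10):
    """Follow issuers upwards; return the chain if it ends at a trusted root, else None."""
    chain = [subject]
    for _ in range(max_depth):
        current = chain[-1]
        if current in trusted_roots:
            return chain
        issuer = issuer_of.get(current)
        if issuer is None or issuer == current:
            return None  # unknown issuer, or self-signed and not trusted
        chain.append(issuer)
    return None

print(chain_to_trusted_root("duckduckgo.com", issuer_of, trusted_roots))
print(chain_to_trusted_root("unknown.example", issuer_of, trusted_roots))
```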
If you want to see the complete set of Certification Authorities, or CAs for short, that Firefox trusts, here’s how you do it. Go to Preferences, go to ‘Privacy and Security’, go to the ‘Certificates’ section and click ‘View Certificates’. This should open a Certificate Manager window. Now, select any name and click ‘View’.
This should give you details regarding that certificate. The ‘Fingerprints’ that you see are actually hashes of that certificate.
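A fingerprint in that colon-separated style can be computed with `hashlib`. In this sketch the certificate bytes are a placeholder; with a real certificate you would hash its binary (DER) encoding instead. The function name `fingerprint` is made up here.

```python
import hashlib

def fingerprint(cert_bytes: bytes, algorithm: str = "sha256") -> str:
    """Hash the certificate bytes and format the result as colon-separated hex pairs."""
    digest = hashlib.new(algorithm, cert_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))

# Placeholder bytes standing in for a certificate's binary encoding.
cert_bytes = b"placeholder certificate bytes"

print(fingerprint(cert_bytes, "sha256"))  # 32 hex pairs
print(fingerprint(cert_bytes, "sha1"))    # 20 hex pairs
```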
Self Signed Certificates
We have seen earlier that we needed that ‘someone’ in order to trust a stranger. Now, consider a situation where you go to meet Peter, a stranger. You ask Peter if there is someone in your circle of friends who could introduce him to you, someone who would vouch for him. To this, Peter replies that he will introduce himself to you! That’s fishy, isn’t it? Basically, Peter, the stranger, told you to trust him because he is asking you to.
This is exactly what happens with computers. A self-signed certificate is when an entity signs its own certificate. Since your web browser doesn’t trust this entity, it pops error messages saying that “The connection is insecure” or that “The connection is untrusted”.
It is not advisable but if you really trust that computer, you can add the computer’s certificate to the client’s list of trusted certificates. Upon doing so, you wouldn’t need someone to vouch for Peter because you have declared Peter as your friend.
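The effect of adding a self-signed certificate to the trusted list can be sketched like this. It is a deliberately simplified model of the browser’s decision, with a hypothetical host name and function name, and it ignores signature and expiry checks.

```python
def browser_accepts(subject: str, issuer: str, trust_store: set) -> bool:
    """Simplified trust decision based only on names (no signature checks)."""
    if subject in trust_store:
        return True            # you explicitly declared this certificate trusted
    if issuer == subject:
        return False           # self-signed and nobody you trust vouches for it
    return issuer in trust_store

trust_store = {"DigiCert Global Root CA"}

# A self-signed certificate: the subject and the issuer are the same entity.
print(browser_accepts("intranet.example", "intranet.example", trust_store))

# After you add it to the trusted list yourself, the browser stops complaining.
trust_store.add("intranet.example")
print(browser_accepts("intranet.example", "intranet.example", trust_store))
```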
Now let’s try to understand this entire process in detail, with some technical terminology.
- In the SSL protocol, which is now known as Transport Layer Security (TLS) (“SSL” was the protocol name when it was a property of Netscape Corporation), the client wishes to talk to the server. It sends a message (“ClientHello”) which contains a bunch of administrative data, such as the highest protocol version it supports, the list of cipher suites (combinations of encryption algorithms) it supports, the compression methods, etc.
- The server checks what the highest SSL/TLS version is that is supported by both of them, picks a cipher suite from the client’s options (if it supports one), and optionally picks a compression method.
- The server responds (“ServerHello”) by telling the client which version and algorithms will be used.
- Then the server sends its certificate (“Certificate”), possibly with a few CA certificates in case the client needs them (not root certificates, but intermediate, subordinate-CA certificates).
- The client, upon receiving the server’s certificate, verifies it and extracts the server’s public key from it. The client generates a random value (known as the “pre-master secret”), encrypts it with the server’s public key, and sends that to the server (“ClientKeyExchange”).
- The server decrypts the message, obtains the pre-master secret, and derives from it the secret key for encryption and the key for the MAC.
Oh! Now, what is a MAC? MAC stands for Message Authentication Code. It is similar to a hash, but with one difference: a MAC is generated and verified using the same single secret key, so only someone holding that key can produce or check it. Here, that key is derived from the pre-master secret.
- On its end, the client performs the same computation and arrives at the same keys. It then sends a verification message (“Finished”), which is encrypted and MACed with the derived keys.
- The server verifies that the Finished message is proper, and sends its own “Finished” message in response.
- At this point, both client and server have the “Secret Key” and know that the “handshake” has succeeded. Application data (e.g. an HTTP request) is then exchanged, using the symmetric encryption and MAC. The key used for symmetric encryption is the “Secret Key”.
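The key-derivation part of the steps above can be sketched as follows. This is a heavily simplified stand-in and not the real TLS key derivation: the labels, the `derive_keys` helper and the transcript string are all made up, and the public-key encryption of the pre-master secret is only indicated by a comment.

```python
import hashlib
import hmac
import secrets

def derive_keys(pre_master: bytes) -> tuple:
    """Simplified derivation: one key for encryption, one key for the MAC."""
    enc_key = hmac.new(pre_master, b"encryption key", hashlib.sha256).digest()
    mac_key = hmac.new(pre_master, b"mac key", hashlib.sha256).digest()
    return enc_key, mac_key

# Client side: pick a random pre-master secret.
pre_master = secrets.token_bytes(48)
# (In the handshake, it travels to the server encrypted with the server's public key.)

client_enc, client_mac = derive_keys(pre_master)   # computed by the client
server_enc, server_mac = derive_keys(pre_master)   # computed by the server after decrypting

# "Finished": each side MACs a summary of the handshake so the other can check it.
transcript = b"ClientHello|ServerHello|Certificate|ClientKeyExchange"
client_finished = hmac.new(client_mac, transcript, hashlib.sha256).digest()
server_expected = hmac.new(server_mac, transcript, hashlib.sha256).digest()

print(client_enc == server_enc)                                 # both sides hold the same key
print(hmac.compare_digest(client_finished, server_expected))    # the Finished check passes
```

Since both sides start from the same pre-master secret and run the same computation, they end up with identical keys without ever sending those keys over the network.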
Beyond the handshake, no public key or certificate is involved in the process. There is just symmetric encryption (e.g. 3DES, AES or RC4) and a MAC (normally HMAC with SHA-1 or SHA-256). Why, you ask? Because symmetric algorithms are way faster than asymmetric algorithms. After all, who doesn’t like surfing a fast and secure internet?
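To see what the MAC buys us once application data starts flowing, here is a small sketch using HMAC with SHA-256 from Python’s standard library. The key and messages are hypothetical, and the encryption step is left out to keep the focus on the MAC.

```python
import hashlib
import hmac
import secrets

# Hypothetical MAC key that both sides derived during the handshake.
mac_key = secrets.token_bytes(32)

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def check_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

request = b"GET /index.html HTTP/1.1"
tag = make_tag(mac_key, request)

print(check_tag(mac_key, request, tag))                       # untouched message passes
print(check_tag(mac_key, b"GET /admin.html HTTP/1.1", tag))   # altered message fails
```

Someone in the middle who changes the message cannot produce a matching tag, because they don’t have the key.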
As the famous saying goes,
A picture is worth a thousand words.
To sum it all up, here’s a pic for you to enjoy.
There are certain cases where the server also needs to verify that the client is who it claims to be. In such cases, the client also sends its certificate to prove its identity. The process is very similar to the one we’ve seen. Here’s how it goes.