Good afternoon, my name is Eirik and I will be talking to you about encryption on the internet.
In particular, I will talk about what encryption is, what it is used for, and how and when you can trust it.
So to start, what is encryption?
Encryption is the process of transforming information, [SLIDE] (using some specified method), to make it unreadable to anyone except those possessing a key (typically a password, or a keypad from your bank, serves as the key). This key allows an intended user to reverse the encryption, or as we say, to decrypt the unreadable information.
Today it is used everywhere on the internet.
In particular, sites that process your credit card numbers or other sensitive data need to be able to provide a secure channel through which an indecipherable exchange can be made. But on the internet, as with traditional telephone connections, no channel is secure by itself. We know there might be people eavesdropping on our exchange, so we want to encrypt the information to make it hard for them.
By contrast, when you give out your credit or debit card numbers over the phone to pay for something, that channel is almost never secure. Wiretapping would reveal your every word. But most crucially, the storage of your data is horrific: if the person who authorizes the transaction is some dodgy employee, there’s nothing stopping him from using your card. This does not in the least compare to what the internet can offer you in terms of security. For instance, there would be no real need for anyone to see your credit card details. Although the fact that the internet can provide security doesn’t mean that any given site actually offers it.
This leads me to the topic of how and when you can trust encryption on the internet.
To discuss this, we need to go into a little bit of background. Computers and all electronic devices listen and respond in a programmed fashion. They will send data to anyone satisfying their predetermined requirements. People, on the other hand, could not rely on complex mathematical systems and password boxes in the past; messengers, for instance, would have to deliver their secret messages to anyone who could cough up a password.
This essentially meant that utmost secrecy was the very definition of security, and we will show why this is flawed reasoning. However, I do not believe there were good alternatives at the time.
For example, Julius Caesar used a method of encryption that shifts letters in the alphabet to the right by three letters. [SLIDE] (Show picture, or say, A becomes D, B becomes E and so on).
Describing it mathematically is quite simple: it uses an analogue of the arithmetic a clock uses.
The part of your clock that counts the hours operates in what we call modulo 12 arithmetic. For example, 15 is 3, and 18 is 6. Mathematically we say that 18 = 6 mod 12, which specifically means that 18 and 6 are a multiple of 12 apart. More generally we might say that 26 o’clock is also 2, and we would in some strange way be right. Specifically, we would be right in the mathematical sense that 26 and 2 are a multiple of 12 apart. This is the mathematical definition: a = b mod N if they are a multiple of N apart. [SLIDE] If we give each letter in the alphabet a number, say A=0, B=1, ..., Z=25, then Caesar’s cipher simply encrypts a letter X by doing X+3 mod 26.
2 represents C, and the encrypted value of 2 would be 2+3 = 5, represented by F.
24 represents Y, whose encrypted value would be 24+3 mod 26 = 27 mod 26 = 1 (for the same reason that 13 is 1 o’clock), and 1 is represented by B.
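For anyone who wants to try this at home, here is a minimal Python sketch of the X+3 mod 26 rule. The function name caesar_encrypt is my own choice for illustration, and I am assuming we only shift the letters A to Z and leave everything else alone.

```python
import string

ALPHABET = string.ascii_uppercase  # A=0, B=1, ..., Z=25


def caesar_encrypt(text: str, shift: int = 3) -> str:
    """Shift each letter A-Z by `shift` places, wrapping around modulo 26."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            x = ALPHABET.index(ch)              # letter -> number
            result.append(ALPHABET[(x + shift) % 26])
        else:
            result.append(ch)                   # spaces etc. are left alone
    return "".join(result)


print(18 % 12)              # clock arithmetic: prints 6
print(caesar_encrypt("C"))  # 2 + 3 = 5, prints F
print(caesar_encrypt("Y"))  # 24 + 3 = 27 = 1 mod 26, prints B
```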
So this was Caesar’s approach. If you encrypt a series of letters, [SLIDE] then it would definitely be unreadable, but certainly not indecipherable, as anyone who knows you use this method can instantly decrypt by shifting back to the left, or mathematically taking X-3 mod 26. [SLIDE]
You could of course shift by an arbitrary number rather than 3, but there would still only be a total of 25 different shifts to try.
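To see how little work 25 possibilities really is, here is a small Python sketch that simply tries them all. The helper names and the sample message are made up for illustration; it assumes the message uses only the letters A to Z plus spaces.

```python
def shift_letters(text: str, shift: int) -> str:
    """Caesar-shift the letters A-Z by `shift` (negative shifts go left)."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def all_caesar_decryptions(ciphertext: str) -> dict:
    """Undo every possible shift from 1 to 25 and return all candidates."""
    return {s: shift_letters(ciphertext, -s) for s in range(1, 26)}


ciphertext = shift_letters("MEET ME AT NOON", 3)   # PHHW PH DW QRRQ
for s, candidate in all_caesar_decryptions(ciphertext).items():
    print(s, candidate)   # the line for s = 3 is the readable one
```

A human only has to glance down the 25 lines and pick the one that reads as English.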
And even if it’s not known that this method of encryption is used, then the frequency of letters comes to our rescue [SLIDE]. This is what we call the probability distribution of letters in the English language. You can think of it as your mathematician’s guide to Hangman.
This allows you to determine that Caesar’s cipher is in use, because if the text you wish to encrypt follows this distribution, then the encrypted text would follow an identical distribution, just with the entire graph moved somewhere to the right, and the bars pushed off the right-hand edge wrapping around to the left.
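As a rough Python sketch of that idea: count the letters in the encrypted text, assume the tallest bar is the letter E, and read off how far the graph has moved. The function names and the sample sentence are my own, and the guess is only reliable when the text is long enough for E really to be the most common letter.

```python
from collections import Counter


def shift_letters(text: str, shift: int) -> str:
    """Caesar-shift the letters A-Z by `shift` (negative shifts go left)."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)


def guess_caesar_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most common ciphertext letter is E."""
    counts = Counter(ch for ch in ciphertext.upper() if "A" <= ch <= "Z")
    if not counts:
        raise ValueError("ciphertext contains no letters")
    most_common_letter, _ = counts.most_common(1)[0]
    return (ord(most_common_letter) - ord("E")) % 26


ciphertext = shift_letters("THE SECRET MEETING BEGINS AT SEVEN", 3)
shift = guess_caesar_shift(ciphertext)
print(shift)                              # 3, since H is the most common letter
print(shift_letters(ciphertext, -shift))  # THE SECRET MEETING BEGINS AT SEVEN
```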
The bottom line: if all our encryption methods were as transparent, we would be in trouble.
But here’s where mathematical analysis starts coming in. This method was improved greatly by mathematicians at least as early as the 16th century. The Vigenere cipher, as it is called, features Caesar’s cipher, but with a different shift for each letter as determined by a keyword. Say your keyword is ‘ADHD’ [SLIDE]; then convert this to numbers to get ‘0 3 7 3’, by the same numbering convention as in Caesar’s cipher. We then encrypt each letter separately depending on its position in the text. I have constructed an example here: BOB BOB (where we don’t encrypt spaces).
To encrypt with this keyword we encrypt the first letter by not shifting at all, the second letter by shifting 3, the third letter by shifting 7, the fourth by 3, the fifth letter by 0 again, and so on.
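Here is a minimal Python sketch of that procedure. The function name vigenere_encrypt is just for illustration, and I am assuming the keyword is a non-empty string of the letters A to Z and that spaces do not use up a keyword letter.

```python
def vigenere_encrypt(text: str, keyword: str) -> str:
    """Encrypt letters A-Z with a repeating keyword; other characters pass through."""
    shifts = [ord(k) - ord("A") for k in keyword.upper()]  # ADHD -> [0, 3, 7, 3]
    out = []
    position = 0  # counts letters only, so spaces do not advance the keyword
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shift = shifts[position % len(shifts)]
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
            position += 1
        else:
            out.append(ch)
    return "".join(out)


# The six letters of BOB BOB get the shifts 0, 3, 7, 3, 0, 3 in turn.
print(vigenere_encrypt("BOB BOB", "ADHD"))
```

With a one-letter keyword such as D this collapses back to Caesar’s cipher with a shift of 3.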
Clearly, this creates a whole different picture for people trying to crack it. Instead of just 25 possible shifts, we now have virtually no limit on the number of keys, simply by letting our keywords become long.
And additionally, this method of encryption now scrambles the frequency of letters somewhat (BOB BOB encrypts to BRI EOE, so the letter counts on the two graphs no longer match), so we have succeeded in making it stronger on at least two accounts.
This cipher was in fact thought to be virtually unbreakable for a long time. You would have to be very skilled to solve it in the 16th century, although it was done. But it was only in the 19th century that a reliable method to find the length of the keyword was published. There are actually many ways of doing this now; one is to look at repeating pairs and triples of letters in the encrypted text. [SLIDE] Here’s an example of text encrypted by the Vigenere cipher with a keyword of length 5. We see certain patterns repeating, and most of these repeats are in fact a multiple of the keyword length apart. This typically stems from common words in the English language, like AND and THE, occasionally occurring at the same point of a keyword cycle. So if your keyword is not too long compared to your text, then you could quite easily determine the length of the keyword by just looking at such combinations. This isn’t the best method for a computer, but it will often work.
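A rough Python sketch of looking for repeating triples follows. It records how far apart each repeated triple is, then counts which keyword lengths divide those distances. The function names are my own, and the sample text is a made-up one: THE RAIN AND THE SNOW AND THE WIND encrypted with the hypothetical 5-letter keyword CRYPT.

```python
from collections import Counter, defaultdict


def repeated_sequence_distances(ciphertext: str, length: int = 3) -> list:
    """Distances between consecutive occurrences of each repeated letter sequence."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    positions = defaultdict(list)
    for i in range(len(letters) - length + 1):
        positions["".join(letters[i:i + length])].append(i)
    distances = []
    for where in positions.values():
        distances.extend(b - a for a, b in zip(where, where[1:]))
    return distances


def candidate_key_lengths(distances: list, max_length: int = 20) -> Counter:
    """Count, for each possible keyword length, how many distances it divides."""
    votes = Counter()
    for d in distances:
        for n in range(2, max_length + 1):
            if d % n == 0:
                votes[n] += 1
    return votes


ciphertext = "VYCGT KEYCW VYCHG QNYCW VYCLB PU"
distances = repeated_sequence_distances(ciphertext)
print(distances)                         # every repeat here is 10 letters apart
print(candidate_key_lengths(distances))  # 2, 5 and 10 all divide 10
```

On a text this short the true length 5 only shows up as one of several candidates; with a longer text, distances such as 15 or 35 would also appear and narrow it down.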
Why is it bad that you can learn the length of the keyword? Well, if we discovered, as in this text, that the keyword was 5 letters long, then all letters that are a multiple of 5 apart share a simple Caesar shift, so cracking it reduces to finding 5 Caesar shifts. Essentially, all you really have to do is look for the most common letter in each group, which will give you the shift value (or equivalently, one letter of the keyword). Basically, and again this is the Hangman strategy: find the letter E.
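As a sketch of that reduction in Python: split the encrypted letters into groups that share a shift, and in each group assume the most common letter stands for E. The function name recover_keyword is made up, and the input below is a deliberately artificial one (a plaintext consisting only of the letter E, under the hypothetical keyword CRYPT). On real text you would need a good number of letters in each group before the guess becomes reliable.

```python
from collections import Counter


def recover_keyword(ciphertext: str, key_length: int) -> str:
    """Guess each keyword letter by treating every key_length-th letter as one Caesar cipher."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    keyword = []
    for start in range(key_length):
        column = letters[start::key_length]   # letters that share one shift
        if not column:
            raise ValueError("ciphertext is shorter than the keyword length")
        most_common_letter, _ = Counter(column).most_common(1)[0]
        shift = (ord(most_common_letter) - ord("E")) % 26   # assume it stands for E
        keyword.append(chr(shift + ord("A")))
    return "".join(keyword)


print(recover_keyword("GVCTX GVCTX GVCTX GVCTX", 5))  # CRYPT
```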
Many people have still tried to use this cipher. Not many recently, I hope, but in the 19th century, during the American Civil War, the Confederate states used it. Although, like the cipher itself, their messages were often cracked.
Also, this cipher is quite cumbersome to use compared to Caesar’s, and would in practice be operated using certain wheels to aid the process. This means that the process of encryption would be known to many people, but the keywords would ideally be kept secret. This is a step in the right direction, as we will explain.
In the end, this illustrates some of the effort that goes into both creating a cipher and breaking one, and why there is a need for people to do this. Also, mathematicians generally like to challenge each other, to find a stronger method, and to break others’. This gives us a stronger definition of security than secrecy (what Caesar relied on): [SLIDE]
Secure cipher = Published and attacked, but not broken. The more attacks the better. By attacks I mean attempts to find general methods to attack a cipher. I mean, this is not a physical object you can just charge at with a lance as in the Middle Ages. You would have to actually think of something clever. Something that mathematically is faster than guessing.
If we had relied on keeping methods secret, then we would at worst have computer programmers dangerously conspiring to keep their methods secret, or at best, a wide selection of different encryption methods, only a few of which would withstand today’s attacks.
Cryptographic methods used today go through a series of screening tests and trials before they are accepted and standardized. The fact that nobody has managed to successfully attack a 30-year-old, mathematically simple cipher is a good indicator of its strength.
So when can you trust the internet to provide a secure channel for you?
Simple: our mathematical history is summed up in your web browser [SLIDE]. When the link you are on turns yellow and a padlock appears (as shown here in Google Chrome), this shows that your connection uses Transport Layer Security, as it’s called, and this implements the currently best standards of cryptography behind the curtains of your browser. In practice this means that by going for sites with this you’ll end up favouring big and tested payment systems like Google Checkout, PayPal etc., rather than what some underpaid guy made whilst lurking in his parents’ basement.
The current best standards are really strong. To break a lot of them would require massive breakthroughs in deep areas of mathematics and computer science. On the other hand, the security of current ciphers relies on unproven assumptions, whose validity is based on how difficult we think it is to attack them. Prove results along those lines and you’d definitely become famous. In fact, the Clay Mathematics Institute offers 1 million dollars for a correct solution to the P=NP problem, which is related but also terribly complex, so we won’t go into that here.
So to all prospective students, you could be studying some really cool stuff here. Well, at least if you are the kind of person who considers spending a Saturday night cracking codes to be cool. I guess maybe most people here follow less nocturnal pursuits. At least, officially, while parents are in the room.
Anyway, to sum up, we have talked about what encryption is. We saw variations of Caesar’s cipher,
and why publishing an encryption method is the best course of action.
And I have the following practical advice: favour sites that give you a yellow, padlocked URL.
That’s it. Thank you.
And, if there are any questions you would like to ask, we have been given some time, and I would be happy to answer what I can.