Cryptography: The Science of Making and Breaking Codes


This probably looks like gibberish to you, and it should, because it’s a cryptogram: a message in code. But if I told you that all I did was shift every letter in the sentence to the next one in the alphabet, then you’d know that it translates to this.

To encrypt a message, you need two main parts: the cipher and the key. The cipher is the set of rules that you’re using to encode the information, for example, shifting the alphabet by a certain number of letters. The key tells you how to arrange those rules; otherwise they’d be the same every time, and it would be easy to decode the message. In this case, the key would be one, because we shifted the alphabet by one letter.

To decrypt the information, you need to know what kind of cipher was used and also have the key. Or you can just crack the code, either by trying all the possible combinations you can think of or by analyzing the code and working backward from it, known as deciphering.

But is it possible to come
up with a combination of a cipher and key that could never be determined? Is there such a thing as an unbreakable code? Well, people keep coming up with new and better ciphers, but it’s hard to make them unbreakable because, no matter what, you’re using a set of rules to encrypt the information, and with enough time and enough data, someone can usually uncover those rules.

That puzzle I just gave you is one of the oldest and simplest ways to encrypt
a message. It’s usually called a Caesar cipher, and in this case the key was just a number representing how many letters of the alphabet I shifted it by. But it’s also very easy to crack. Even if you didn’t know the key, it would take you at most 25 tries to decode the message, because you know the whole alphabet has to be shifted by a certain amount. Since there are 26 letters in the alphabet, there are only 25 other options.

A Caesar cipher is one simple type of
monoalphabetic cipher, a class of ciphers where the whole code is based on one letter of the alphabet standing in for another letter consistently throughout the whole message. You basically just scramble the alphabet. In that case, the key would just be a list of which letters correspond to which, like this one. There are over 400 septillion possible ways to encrypt this kind of message, so you’d think it would be hard to crack. And it is, but only a little bit, because there are lots of ways to decode messages.

Just trying all of the possible keys to a code is probably the most obvious and least subtle, and it has an equally unsubtle name: brute force. But you can try a more
sophisticated technique, something called frequency analysis, which is based on the idea that every language has its own specific patterns. In English, for example, the letter E shows up a lot. I used it seven times in just the last sentence. And some words, like “the”, are so common that it’s hard to even use a sentence without them. Cryptographers call these words cribs.

So frequency analysis looks for common words and also common letters or sets of letters, like “ed” or “ing” at the end of words. If you find that the letter X is showing up a lot in a message, and so is the three-letter word IRX, you might guess that in the key, X corresponds to the letter E and IRX spells THE. And once you’ve figured out those letters, you can figure out the rest by recognizing other words and using the process of elimination. And since longer cryptograms contain more clues, they’re easier to crack.

So
monoalphabetic ciphers are fun, but they’re not hard to break. If you want to get a little fancier with your encryption, you can use polyalphabetic ciphers instead. They’re much more effective. In a polyalphabetic cipher, the way you scramble the alphabet actually changes throughout the message. In the first word, S might translate to W, but in the last word, S might translate to H. It all depends on the particular encryption method you’re using and on your key.

One of the earliest polyalphabetic
ciphers was the Vigenère cipher. Developed in the 16th century, it was pretty simple because the key was just a word. So let’s say you want to encrypt “SciShow is the greatest” using a Vigenère cipher. Well, the first thing you need to do is write out a Vigenère square. The alphabet goes across the top and along the left side, and each row contains the letters A to Z, shifted over by one more letter than the row above. So the first line starts with A and ends with Z, but the second starts with B, goes all through the letters until Z, and then ends with A, and so on. You end up with 26 differently scrambled alphabets, and now you’re ready to encode the message.

You just have to pick a key. Let’s just say your key is Michael. You write out your key multiple times until it fills the same number of letters as your message, so “SciShow is the greatest” would correspond to this. Then, to encrypt it, you take each letter of the message and move along its row in the Vigenère square until you get to the column of the corresponding letter in Michael. So “SciShow is the greatest” turns into this.

That’s much tougher to decode unless you
have the key, because those letter frequencies are all different now. Since the keyword Michael is seven letters long, each letter of your message is encrypted using one of seven different scrambled alphabets. But if your text is long enough, it’s still crackable using a type of frequency analysis developed in the 19th century by cryptographer Charles Babbage.

Babbage realized
that in a long enough message, some patterns in the coded message will still show up. Like, if your key only has seven letters, that means that there are only seven ways to encrypt the word “the”. But if your message uses the word “the” eight times, there are definitely going to be repeats. So he just counted how many letters separated those repeated patterns. If they were separated by 7, 14, or 21 letters, he knew that the key was probably seven letters long, and from there he would just use frequency analysis to figure out the seven scrambled alphabets.

Babbage’s method
is just one example of why it’s so hard to create an unbreakable cipher. Your key creates a pattern within the encrypted message, and with enough work, a spy can uncover that pattern.

It turns out that the only way to have a really unbreakable cipher is to use what’s known as one-time pad encryption, which uses a key that is as long as the message itself. That way, there aren’t any patterns in the encrypted text. There’s nothing to analyze, so there’s no way to work backwards. The sender and the recipient both have the same pad, and each sheet contains a long set of random letters, which is used as the key. Once a sheet is used to decode a message, you destroy it. Then you use the next sheet for the next message, so you never repeat a key. As long as you keep the pad safe, the message can’t be decrypted by anyone else.

But you can’t always use one-time
pad encryption. Let’s say you needed to get a message to someone halfway across the world whom you’d never met. You wouldn’t have a chance to give them a matching pad. In warfare, that sort of situation comes up a lot, which is why in the early 20th century there was suddenly plenty of incentive to come up with better ciphers. Remote communication using technology like the telegraph was incredibly valuable during wartime, but it was essential that only your allies understood what you were saying.
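
To make the one-time pad idea from a moment ago concrete, here’s a minimal Python sketch. It’s an illustration under some assumptions, not a historical procedure: the function names (make_pad_sheet, otp_encrypt, otp_decrypt) are made up for this example, and it assumes each pad letter shifts the matching message letter through the alphabet (A shifts by 0, B by 1, and so on), which is one common way to combine the two.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def make_pad_sheet(length):
    """Return one hypothetical pad sheet: `length` random capital letters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def otp_encrypt(plaintext, sheet):
    """Shift each message letter by the matching pad letter (assumed scheme)."""
    # Keep letters only, so spaces and punctuation are dropped.
    letters = [c for c in plaintext.upper() if c in ALPHABET]
    if len(sheet) < len(letters):
        raise ValueError("pad sheet must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(letters, sheet)
    )


def otp_decrypt(ciphertext, sheet):
    """Undo the shift using the same pad sheet."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, sheet)
    )


message = "SciShow is the greatest"
sheet = make_pad_sheet(20)  # the message contains 20 letters
ciphertext = otp_encrypt(message, sheet)
print(ciphertext)                      # looks random and differs on every run
print(otp_decrypt(ciphertext, sheet))  # SCISHOWISTHEGREATEST
```

Each sheet is meant to be used for one message and then thrown away. Reusing a sheet would bring back the kind of repeating pattern that made the earlier ciphers crackable.
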
The Germans experimented with a new, more complicated monoalphabetic cipher during World War I, but eventually the French managed to crack it.

Then, during World War II, the Germans again came up with a new cipher. And this time its security seemed perfect. Maybe you’ve heard of it: the Enigma machine. The machine used a polyalphabetic cipher that scrambled the alphabet in a different way each time you typed a new letter. As far as the Germans knew, the only way to decipher the message was to have your own Enigma machine and set it up using a secret key that changed every day.

The machine was meant to work like a one-time pad in the sense that the alphabet was re-scrambled for every letter of the message. But instead of having to distribute a set of sheets to everyone, you could just use a key that told the users how to set up their Enigma machines. And you could change that key as often as you wanted. But it had a few flaws. For example, no letter could be encoded as itself. That might not sound like a big deal, but it ended up being a fatal weakness.

British mathematician Alan Turing, along with the rest of his team, designed a machine of their own that could crack the Enigma code as long as they knew around 20 of the characters contained in the message, which they often did, because some phrases tended to show up a lot in Nazi communications, especially nice things about Hitler. So part of the strategy of Turing’s team was to look for cribs, those common words and phrases, and see where they might fit.
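
The “no letter could be encoded as itself” flaw makes that search a lot easier: a crib can’t sit anywhere that one of its letters lines up with the same letter in the coded message. Here’s a minimal Python sketch of just that elimination step. It is not a model of the machine Turing’s team built. The function name, the made-up ciphertext, and spelling the crib as FUEHRER (without the umlaut) are all assumptions for illustration.

```python
def possible_crib_positions(ciphertext, crib):
    """Return the offsets where `crib` could line up with `ciphertext`.

    An offset is ruled out if any crib letter sits opposite the same
    letter in the ciphertext, because a letter never encrypts to itself.
    """
    ciphertext = ciphertext.upper()
    crib = crib.upper()
    positions = []
    for offset in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[offset:offset + len(crib)]
        # Keep this offset only if no position has matching letters.
        if all(c != p for c, p in zip(window, crib)):
            positions.append(offset)
    return positions


# Hypothetical intercepted text, invented for this example.
intercepted = "XFUQHRKEPRWZ"
print(possible_crib_positions(intercepted, "FUEHRER"))  # [0, 4]
```

In this made-up example, only two of the six possible starting positions survive, so there are far fewer alignments left to test.
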
For instance, if they knew a message contained the word Führer, they could look for places in the text that didn’t have the letter F, since they knew that the F in Führer couldn’t be encoded as itself. Those clues helped them figure out how the Enigma rotors were set up to encrypt the message.

Cracking the Enigma code was a huge advantage for the Allies. Many historians attribute some of the most important victories during the war to information the Allies got from Enigma-encrypted messages.

These days, encryption is mostly important in digital computing, and that isn’t perfect either. When websites announce that hackers
now know everything about you, that’s because their encryption methods were breakable. Companies that store your data have to take into account a whole new set of considerations, like how, when you can complete billions of operations per second, brute force suddenly becomes a lot more practical. So the principles that Vigenère and Turing used are the same ones that allow you to pay your bills online and keep North Korea out of your email, most of the time. But how is a story for another episode.

Thank you for watching this episode of SciShow, which was brought to you by our patrons on Patreon. If you want to help us keep making videos like this, check out patreon.com/scishow. And don’t forget to go to youtube.com/scishow and subscribe.