Wednesday, November 2, 2011

Encryption/Ciphers Part 3 - Knapsack Cryptosystem (a)

Encryption/Ciphers Part 3 - Knapsack Cryptosystem (a)

Hi everyone. As always, if you haven't done so, I highly recommend atleast skimming through the last two articles, as I use terminology from them in this article. And, I'll be adding more posts about the encryption series, so, if you would like to keep updated, I would suggest you follow this blog.

Also, the Knapsack Cryptosystem will be broken into two parts, one that describes the Knapsack problem and the next that shows Knapsack cryptography.

Also, I made a mistake last time (which, unfortunately, no one noticed early enough), I forgot to do a push on the github repo, so the Vigenere code was not up, well, now it is.

So, lets get started.

The Knapsack Cryptosystem was invented in the late 70's, and now has been broken, but, it still is an awesome introduction to public key cryptography, and is quite a bit more elegant than RSA, which involves a lot more complicated math.

So far, we've talked about a monoalphabetic cipher (the Caesar Cipher), and a polyalphabetic cipher, which both require one piece of information to encrypt and decrypt. Now, we will consider a new type of cryptography, known as a public key cryptosystem. The most simple (in my opinion) example of this type of system is known as the Knapsack Cryptosystem, or Backpack Cryptosystem, or Volumes Cryptosystem, you'll see why these why all these names mean the same thing in just a minute.

Firstly, what's a public key cryptosystem? Well, in such a system, there is not just one piece of information needed to encrypt and decrypt (the shift index in the Caesar Cipher or the passphrase in the Vigenere cihper), there are two things needed, one key to encrypt the plaintext, and another to decrypt the ciphertext. Usually, the two keys are called public keys and private keys. Every recipient in the system has a public key that everyone knows about (including possible attackers), but, messages encrypted with that public key can only be decrypted with the recipient's personal private key, which is kept secret from everyone.

That concept is sometimes hard to bend your mind around, and there's no better way to understand it than an example given by Wikipedia (http://en.wikipedia.org/wiki/Public-key_cryptography), so, I'll just quote it:

"An analogy to public-key encryption is that of a locked mailbox with a mail slot. The mail slot is exposed and accessible to the public; its location (the street address) is in essence the public key. Anyone knowing the street address can go to the door and drop a written message through the slot; however, only the person who possesses the key can open the mailbox and read the message."

Notice I said usually a lot for the definition of public key cryptosystems, because there are other uses known as digital signatures, which aren't the focus of this article (but, maybe later on).

So, getting more particular, what's a Knapsack cryposystem? Well, firstly, we'll look at the Knapsack problem.

Say you have Knapsack (or backpack, or bag, or whatever the heck you call it in your country) of volume $V$, and you have $n$ items of various volumes $a_1, a_2, a_3 ... a_n$, using none or multiple of each of these items, is it possible to completely fill the Knapsack?

Lets make an equation out of that:

$V = a_1x_1 + a_2x_2 + a_3x_3 ... a_nx_n$

This problem, when given $V$ and $\{a_1, a_2, ... , a_n\}$, may have none, one or many solutions. Finding a solution to an arbitrary Knapsack problem is extremely difficult, in fact, all the approaches currently available (more on this later) are not very far from just using bruteforce to try out all $2^n$ combinations. That makes it an ideal candidate to base an encryption system on.

But, due to reasons that you'll find out about, if the equation has multiple solutions, it doesn't work for public key cryptography. So, when does it have a unique solution (if any at all)?

Lets call a sequence of $a_i$ "superincreasing" when each $a_i$ is larger than the sum of all the preceding ones; that is,

$a_i > a_1 + a_2 + a_3 + ... a_{i-1}$

If the $a_i$ are superincreasing, the Knapsack problem has a unique solution (again, if any). Let's look at an example to see why this is true:

Say we have the equation

$11 = 2x_1 + 3x_2+6x_3+13x_4$

Since $13>11$, $x_4 = 0$. So,

$11 = 2x_1 + 3x_2 + 6x_3$

$2+3<6$, so, we must include atleast one item of volume $6$, so $x_3 = 1$.

$5 = 2x_1 + 3x_2$

Again, we operate the same way, noting $2<3$, and then getting $x_1 = 1, x_2 = 1$.

And, we've solved it! A proof of why all Knapsack problems containing $a_i$ that are superincreasing is not difficult to formulate, and I encourage you to try it yourself (consider it an excercise, if you'd like). Also, the Knapsack problem is an excellent introduction to dynamic programming (that's how I learned DP), and, if you care about such things, I would check this out: http://www.es.ele.tue.nl/education/5MC10/Solutions/knapsack.pdf

I'll be posting the next part of the Knapsack cryposystem blog in a day or two. I split it into two parts because the Knapsack problem and the cryptosystem both together are at on of material to have to digest so quickly.

As always, feedback extremely welcome, and thanks for the upvotes, Redditors. If you would like to keep updated with this series, I highly encourage you to follow this blog.

No comments:

Post a Comment