Thursday, August 6, 2009

Three Applications of High, Abstract Math

As a math teacher, I'm always wishing I had more free time with my students, where I could show them some of the advanced real-world applications of higher math. I think students get bored with a lot of the "standard" application examples-- measuring bridges, designing skyscrapers; even sending a spaceship to the moon, which we pulled off in the 60's/70's, is old news to us younger generations. My intent is to write about a couple of applications which are really modern and surprising. Since math has been rightly called "the one reality escape which actually works", its examples should include something a little more "sci-fi" than calculating where mortar shells will land :)


Making CDs Scratch-Resistant

Ever notice how you can scratch the heck out of a CD or DVD and it still plays perfectly? Isn't that a little weird? If you tore up a book or scratched an old vinyl record, the contents would be damaged. Modern discs employ a branch of high math called Coding Theory. This discipline is often described as the study of "how to communicate clearly over a noisy channel". In the CD/DVD example, "noise" means "scratches on the disc". The same techniques are also applied to cell phones, wi-fi, satellite communications, and so on-- where "noise" might literally mean noise.

To illustrate how one would use a "code" to make information safer, I'll use one of the easiest, most intuitive codes to show how the concept works. In actual practice, much more sophisticated codes are used, because the one I'm exhibiting isn't very inefficient. OK, so you're inventing CDs and you want to make them scratch-resistant. If you just record data onto the disc directly, one tiny scratch will equal one error, and that's no good. So instead, you record every bit of data three times in a row. Thus, if the CD is to contain the data "110", it actually becomes "111111000"-- each digit has been tripled. A perfect disc should thus be made up of lots of blocks of 3 identical digits. But suppose the disc is scratched, and one digit is switched from a 0 to a 1. Then its block becomes, say, "010" instead of "000". The CD-player can use the "best two out of three" and determine that "010" means "0". Congratulations, your CD is a little more scratch-resistant!

It's still not perfect. If two digits are messed up in the same block, the "best two out of three" scheme will give the wrong result. You can make the code more robust by repeating everything more-- for example, repeat every digit four times, or five times, or a hundred times. Forget "scratch resistant", now we're talking more like "shotgun resistant"! The problem is, repeating every digit a bunch of times is gonna take lots of space, reducing the amount of megabytes that can be stored. This is where advanced math comes in: highly sophisticated coding methods are carefully devised, to achieve incredible scratch-resistance with very little extra storage space consumed.


Calculating Search-Engine Results

In the early days of the internet, search engine results were clogged stiff with spam and junk. They used naive methods like: find which webpage contains a word the most times, and put it as search result #1. Obviously this was wide open to abuse, and it wasn't long before webmasters were competing to see how many times they could repeat their target keyword in one page. Then a simple idea revolutionized it all: use hyperlinks as "votes", so when one webpage links to another, it counts as a "vote" for the latter, boosting the target webpage in search-engine results. Furthermore, the votes of more important webpages should count more: a link from the White House means more than a link from some random chatroom, for example.

This seems easy enough, but the problem is determining which webpages are "more important", for the sake of counting their votes. After all, going through all the websites in the world and classifying their importance is just as difficult as going through them and sorting them for a search index-- in a certain sense, the two problems are the same. The answer is to automate that by-- you guessed it-- using hyperlinks as votes. The problem seems circular. We need to rank content for importance in order to count its votes, but we need to count its votes in order to rank it for importance!

Enter the branch of math called "Linear Algebra", which deals with two things: the matrix and the vector. A vector is simply a list of numbers, and a matrix is a rectangular block of numbers. What does this have to do with combing the information superhighway?? First, a "list of numbers" is the perfect container for storing the ranks of URLs-- one number per URL. Second, a "block of numbers" is perfect for storing data about who links to whom-- if site X links to site Y, record it in the number where column X meets row Y. Finally, there's a natural way to multiply a matrix and a vector, getting a vector output which perfectly captures what would happen if already-ranked websites held an election to re-rank each other. Now the solution to the original circular problem goes like this: "How could you pre-rank the websites so that when they hold their election, their ranks stay the same?" In algebra notation, the equation looks like Ax=x, where A is the matrix of who links to who, and x is the rank-vector. Solve for x, and you're halfway to being a $6 billion mega-corporation! (Unfortunately, x usually can't be solved for, so the equation must be modified a little, but that's just technicalities and the basic idea is the same)


Modern Encryption

Old-school encryption basically relied on the sender and receiver sharing some sort of secret information, referred to by cryptologists as a "key". When kids make a simple letter-replacement chart, the chart is the key; when Nazi Germany used their Enigma machines, the initial machine settings were the key. The commonality between the kids' chart and the advanced German war-typewriters was that both the sender and the receiver used the same secret knowledge. Thus, to decrypt messages using this system, you need only as much information as it takes to encrypt them. And that's not hard when you command a vast army and there are hundreds of capturable people on the front lines with the knowledge.

It took some higher mathematics to evolve ciphers to the next level. The conceptual breakthrough was to use two different keys, a public key and a private key. The public key is needed to encrypt messages, but can't be used to decrypt them; thus you don't need to protect it at all. You can publish it in the newspapers-- it won't do enemy intelligence a bit of good. To actually decrypt a message, you need an additional secret key. Since this secret knowledge isn't necessary for the original encrypting, there's no need to teach it to grunts on the front lines (if they need to recieve secret messages, you can give them a separate key which only that unit uses, so no compromise if the enemy captures it). That's the concept, at least; but how to actually implement it?

That's where advanced math comes in. In a public key-private key encryption system, there must necessarily be some intertwining relationship between the two keys: otherwise, if they were totally unrelated, we couldn't hope to use our private keys to decipher messages encrypted with the public keys. It would be like jamming your carkey into someone else's car. Now, one of the most elementary applications of mathematics, is to solving riddles of the form, "A and B are related, now given A, solve for B." If this task isn't extremely hard, then the enemy will just hire some mathematicians and they'll use the public key to figure out the private key, and all is lost. Thus, math serves as a kind of measuring stick for good public-private-key systems. One quantity should easily garble a message, another quantity should easily restore it, but the problem of solving for the latter in terms of the former should be as hard as you can possibly make it. Fortunately for cryptologists, number theorists have been scouring the earth for really freakin' hard math problems for centuries. One particular problem is perfectly suited for the task: factoring huge numbers into primes. Take two huge primes (hundreds of digits long) and multiply them together-- maybe hard to do by hand, but easy with a computer. Now you can announce the product of those two primes, but keep the original factors secret. An enemy will have a far more difficult time undoing your arithmetic: a hundred computers might struggle for a decade and still not figure out what those two original primes were! But with a computer no stronger than a cellphone, your correspondents can use your published multiplication result (along with some other technical stuff) to scramble their messages, and cracking the code basically amounts to factoring the huge monster you published.

An advantage to mathematical encryption methods is, you can be at ease knowing that breaking your security is "hard" not just for some naive intelligence agency, but indeed for the mathematical community at large. You don't need to worry about some pesky child genius and his adorable pet dog figuring out your code, unless that child is such a genius that he can singlehandedly revolutionize the collective knowledge of millenia of devoted mathematicians. In that case, he could probably just invent a time machine and go kill you when you were a baby, so it's not even worth considering such outlying possibilities.


For More Detailed Information...

If you want more detailed information about the three examples, here's what to google. The codes used in CD's are called "Reed-Solomon Codes" and there's a ton of details all over the web about them. If you're fascinated with the mathematics behind search engines, do a search for the "Hyperlink Matrix", just remember, there is no spoon. And if you'd like to use the techniques from the latter part of the article to hide your world domination plans from prying eyes, look up "RSA" in Wikipedia, this is the most standard system in use as of 2009, and your browser probably used it if you signed into any "https" sites recently.


FURTHER READING

The Higher Infinite
Five Ways To Be Better At Math
Rote Memorization in Mathematics
How I Taught Myself Calculus

1 comments:

Emilio Wuerges said...

You forgot the most important!
God of War!
Check this blog: http://realtimecollisiondetection.net/blog/

 
Privacy Policy