The Boolean Life
Things are either True or False. Or not.
Much ado about torrents
By Olibe.nu

It is just a protocol used to share files, so what is all the furor about?

Of course, it is more than that. HTTP and FTP are primitive but straightforward. We know who is distributing the file (the server) and who is receiving it (the client). But straightforward has its price: a larger overhead, centralisation and dependency.

Enter BitTorrent, the new kid on the block. Of course, he is going to get bullied. He will have to convince others by making available some porn. He is a bit complicated, but he has less overhead and is decentralised, and is thus a pain to some.

How does the BitTorrent protocol work? The specifics can be gleaned here and here. I will attempt a simpler version.

Say you have a video you want the world to see: perhaps your version of Gangnam Style, 124MB in size. You can FTP it to your server and let all interested parties know the URL of the file.

Now everyone knows your awesome dance skills can be gotten from yourserver.com/videos/gangnam-my-style.avi. Let us assume you are really good at dancing Gangnam Style, even better than PSY and some ex-presidents too. What happens next? 1 billion people want to watch your video. They all go to your link in their browsers and the next thing you know, yourserver.com has to give the same file to 1 billion browsers (several million of them at the same time!). Your bandwidth balloons because what you are sending out is 124MB x 1 billion, which is 124 billion MB. Not to mention the strain on yourserver.com's processors to handle all those simultaneous downloads! It is a nightmare, just ask JAMB or WAEC.
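To put a number on that nightmare, here is a back-of-the-envelope sketch of the arithmetic. The function name is made up for illustration, and the petabyte figure assumes decimal units (1PB = 1 billion MB).

```python
FILE_SIZE_MB = 124            # size of the video in the example
DOWNLOADS = 1_000_000_000     # one billion eager viewers


def total_transfer_mb(file_size_mb: int, downloads: int) -> int:
    """Data a lone server sends out if it serves every single download itself."""
    return file_size_mb * downloads


total_mb = total_transfer_mb(FILE_SIZE_MB, DOWNLOADS)
total_pb = total_mb / 1_000_000_000   # assumption: decimal units, 1PB = 10^9 MB

print(f"{total_mb:,} MB, about {total_pb:.0f} PB")
```

Every one of those megabytes leaves the same server, which is the whole problem.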

Version 2. Let us assume you are distributing the video via BitTorrent. How do you go about that? First of all, you fire up your favourite torrent client (say, utorrent), follow the instructions (RTFM!) and create a new torrent file to distribute your video. You then start seeding from yourserver.com. Let me explain what this process means. In very simple terms, what your torrent client does is this:


  • Go through your 124MB video, dividing it into little pieces of, say, 1MB each and numbering them.
  • Do some housekeeping on each piece, including hashing (which is a way of telling whether a piece arrived intact when someone else gets it).
  • Try to download the video for you. Odd? Don't you already have the file? Yes, you do, so it adds yourserver.com as the first peer, one which has all the pieces of the precious, hilarious video.

It then puts all this information and a score of other details in a torrent file, which you then distribute to mom, dad and your boss at the office (as part of your effort to get fired) and on sites such as kat.ph or The Pirate Bay.
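For the curious, the splitting and hashing can be sketched roughly as below. This is only an illustration: the function names and the dictionary layout are my own, and a real torrent file is a bencoded structure with more fields than this. The one thing taken from the actual protocol is that classic BitTorrent stores a SHA-1 hash for each piece.

```python
import hashlib

PIECE_SIZE = 1024 * 1024  # 1MB pieces, as in the example


def split_into_pieces(data: bytes, piece_size: int = PIECE_SIZE) -> list:
    """Cut the file into fixed-size pieces; the last one may be shorter."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def make_metadata(name: str, data: bytes, piece_size: int = PIECE_SIZE) -> dict:
    """Build a simplified stand-in for a torrent file (hypothetical layout)."""
    pieces = split_into_pieces(data, piece_size)
    return {
        "name": name,
        "length": len(data),
        "piece_length": piece_size,
        # One SHA-1 digest per piece, indexed by piece number.
        "piece_hashes": [hashlib.sha1(p).digest() for p in pieces],
    }


def verify_piece(meta: dict, index: int, piece: bytes) -> bool:
    """Check that a received piece matches the hash recorded for its number."""
    return hashlib.sha1(piece).digest() == meta["piece_hashes"][index]


# Tiny stand-in "video" with 1KB pieces so the demo runs instantly.
video = bytes(range(256)) * 10          # 2560 bytes
meta = make_metadata("gangnam-my-style.avi", video, piece_size=1024)
pieces = split_into_pieces(video, 1024)

print(len(meta["piece_hashes"]))            # number of pieces
print(verify_piece(meta, 0, pieces[0]))     # an intact piece passes
print(verify_piece(meta, 0, b"corrupted"))  # a damaged piece fails
```

With 1MB pieces, the 124MB video would come out as 124 numbered pieces, each with its own hash.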

Mom sees your torrent file and, though clueless about your-son-does-the-gangnam-style.torrent, she still clicks on it. Being the good son that you are, you have already installed utorrent for her and associated it with the .torrent extension, so it pops up and starts downloading.

How does it do that? The client opens up the torrent file and tries to get the file, piece by little piece, from yourserver.com. It does not get these pieces sequentially as FTP and HTTP downloads do; it tries to download the most difficult to get pieces first (the rarest ones), so that the download does not stall when NEPA decides to let everyone know that yourserver.com is 'Powered by PHCN'. Mom is the second peer of your torrent.

While she is at it, your boss decides to look for another reason to fire that goof-off (that being you, of course!). He starts with the torrent file you uploaded to kat.ph. When his torrent client tries to get the pieces of the video, it discovers that mom has already got piece 23 (which is the part where you run in slow motion to meet the girl with the orange hair) and decides to get that piece from mom instead, thereby saving yourserver.com the pain of delivering 1MB of data.
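The "most difficult to get first" idea can be sketched like this. It is a toy model, not what any particular client actually ships: the function name, the peer names and the tie-breaking rule are assumptions made for illustration, and it imagines a swarm that already has a few peers in it.

```python
from collections import Counter
from typing import Optional


def rarest_first(needed: set, peer_pieces: dict) -> Optional[int]:
    """Pick the needed piece that the fewest peers currently hold.

    peer_pieces maps a peer name to the set of piece numbers it has.
    Ties go to the lowest piece number (an arbitrary choice for this sketch).
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces & needed)   # count only pieces we still need
    if not availability:
        return None                            # nobody has anything we need
    return min(availability, key=lambda p: (availability[p], p))


# Hypothetical swarm: only yourserver.com has piece 2.
swarm = {
    "yourserver.com": {0, 1, 2, 3},
    "mom": {0, 1, 3},
    "boss": {0, 3},
}

print(rarest_first({0, 1, 2, 3}, swarm))
```

If yourserver.com loses power right after piece 2 has been fetched, the remaining pieces are still available from mom and the boss, which is the point of going for the rare ones first.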

Still supposing your dancing is not awful (if you've read this far, I seriously doubt it, nerd!) and your boss has not fired you for dancing on company time (you had said you were sick and had to go home!), your video then goes viral, with thousands of peers downloading it at the same time. A tracker keeps track of which peers are in the swarm, and the peers tell one another which pieces they hold (who has all the pieces, who has only piece 23 and so on), so that peers can get piece x from whoever in the swarm has it. You can now look at your bandwidth meter, NOT scream about how near you are to your monthly limit, and then pull the plug.
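Here is a rough sketch of that bookkeeping, under simplifying assumptions. The class and function names are invented for illustration. In this toy model the tracker only hands out the list of other peers in the swarm, and each peer reports its own pieces, which is roughly how real swarms split the work, though the actual messages are far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    pieces: set = field(default_factory=set)   # piece numbers this peer holds


class Tracker:
    """Toy tracker: remembers who is in the swarm and introduces peers."""

    def __init__(self):
        self._swarm = {}

    def announce(self, peer: Peer) -> list:
        """Register a peer and return everyone else currently in the swarm."""
        self._swarm[peer.name] = peer
        return [p for p in self._swarm.values() if p.name != peer.name]


def sources_for(piece: int, peers: list) -> list:
    """Names of the peers that can supply the given piece."""
    return sorted(p.name for p in peers if piece in p.pieces)


tracker = Tracker()
tracker.announce(Peer("yourserver.com", set(range(124))))  # the seeder has it all
tracker.announce(Peer("mom", {23}))                         # mom has piece 23 so far
others = tracker.announce(Peer("boss"))                     # the boss has nothing yet

print(sources_for(23, others))   # piece 23 can come from mom or the server
print(sources_for(24, others))   # piece 24 only from the server, for now
```

Once enough peers hold every piece, yourserver.com can drop out of the swarm and `sources_for` would still return plenty of names.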

The downloads still go on - a lot of peers already have 100% of your video (all pieces) and will continue to distribute it to the peers that need it. Is that not what you wanted? World fame?

This then got me thinking. How do I get the famous video almost instantly, without waiting for the pieces to be downloaded until I assemble all 100% of the file? That was when it hit me. Each torrent client was downloading piece by piece (not sequentially though) until it had all of the pieces. The client was calculating pieces downloaded by checking only what it (the client) had downloaded. What if 10 clients downloaded the same torrent content but calculated pieces downloaded by checking across what all 10 of them had downloaded? Then there would be no duplication of labour and, all things being equal (Nigerian sucky bandwidth included), the video would be 100% complete in a tenth of the time (ignoring a lot of other factors, true).
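A minimal sketch of the idea, assuming the 10 clients can agree on a plan up front: hand the pieces out round-robin so no piece is fetched twice. The function and client names are hypothetical, and this ignores the lot of other factors (uneven speeds, clients dropping off, sharing the pieces among the 10 afterwards).

```python
def assign_pieces(num_pieces: int, clients: list) -> dict:
    """Deal piece numbers out to the clients round-robin, one owner per piece."""
    plan = {client: [] for client in clients}
    for piece in range(num_pieces):
        owner = clients[piece % len(clients)]
        plan[owner].append(piece)
    return plan


clients = [f"client-{n}" for n in range(10)]
plan = assign_pieces(124, clients)          # 124 pieces of 1MB each

heaviest_load = max(len(p) for p in plan.values())
print(heaviest_load)                        # pieces the busiest client fetches
print(124 / heaviest_load)                  # rough speed-up over going it alone
```

Since 124 does not divide evenly by 10, some clients fetch 13 pieces and the rest 12, so the saving is close to, but not exactly, a tenth of the time.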

Just pause a little bit and think about it. Just imagine the number of Nigerians downloading the movie Django Unchained over, say, Airtel's network. I do not know the caching solution Airtel runs or if they really run one. But imagine all those Airtel subscribers getting Django Unchained from all over the world, the same y number of pieces downloaded over and over again (face it, Nigerians mostly leech, they hardly seed).

How much do you think it would save Airtel to implement some form of torrent caching solution, one that allows only unique pieces of Django Unchained to come into their network?
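To make the saving concrete, here is a toy piece cache. Everything in it is an assumption for illustration: the class, the fake upstream function, the 5 subscribers and the 124-piece file (borrowed from the video example). It says nothing about how a real network operator would build such a thing, and it ignores staleness completely.

```python
class PieceCache:
    """Toy network-edge cache keyed by (torrent id, piece number)."""

    def __init__(self, fetch_upstream):
        self._fetch_upstream = fetch_upstream   # called only on a cache miss
        self._store = {}
        self.requests = 0
        self.upstream_fetches = 0

    def get(self, torrent_id: str, index: int) -> bytes:
        self.requests += 1
        key = (torrent_id, index)
        if key not in self._store:
            # First subscriber to ask pays the international bandwidth.
            self._store[key] = self._fetch_upstream(torrent_id, index)
            self.upstream_fetches += 1
        return self._store[key]


def fake_upstream(torrent_id: str, index: int) -> bytes:
    """Stand-in for fetching a piece from the rest of the world."""
    return f"{torrent_id}:{index}".encode()


cache = PieceCache(fake_upstream)
for subscriber in range(5):                 # 5 subscribers, same file
    for index in range(124):                # 124 pieces each
        cache.get("some-big-file", index)

print(cache.requests)                           # pieces asked for in total
print(cache.upstream_fetches)                   # pieces that crossed the border
print(cache.requests - cache.upstream_fetches)  # pieces served locally
```

Even with just 5 subscribers, four out of every five pieces never leave the local network in this model; with thousands, the ratio only gets better.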

You might be thinking, a corporate effort to encourage piracy? Then replace Django with an Ubuntu LTS ISO. Or better yet, think beyond such huge files. Think of small ones like facebook.com's or nairaland.com's favicons. How many times do you think Google's logo (not the doodles) is downloaded by Nigerian web browsers?

I know two issues stick out - that of the legitimacy of such a solution and of the management of staleness. Could it be worth the trouble? How much speed or bandwidth for the Nigerian populace would be recovered by such a move? Like Linkin Park sings, it is the little things that give you away.

NB: Of course, utorrent is not bittorrent. I could not find a 'bittorrent icon'. I even tried asking Bram Cohen on Twitter!