Archive for the month: May 2012

How companies learn which sites you’re visiting and how to prevent it

Today I want to talk about a general issue that affects us all every time we are using the web. That is: the protection of our personal data.

In the olden days, when you visited a website, your browser would connect to the server, receive an HTML document and render it for you. Today, with Web 2.0 and whatnot, your browser still receives an HTML document, but it looks more like a shopping list with instructions on where to get the different parts of the site. So when you connect to a certain website, you are in most cases actually connecting to multiple web servers.

This image, for example, is actually located on a Wikimedia server, not on wordpress.com.

Sometimes this is very obvious: if the page contains a Google Maps frame, for example, it is clear that it must be connecting to the Google Maps server. In other cases it is not so clear, because these third-party elements don’t even have to be visible. A lot of websites contain hidden elements that send information about their visitors to Google Analytics or other statistics services. If you want to know more about that, there is a neat little Chrome extension called Collusion that visualizes which websites send information to third parties.

Sites with a blue circle are the ones you actually visited; the others are third-party sites.

Collecting anonymous user data to improve websites or services is one thing. It gets really creepy once the third party your data is sent to knows your actual name (and address, and favorite animals...). Why would they know that? Well, this is where “social widgets” come in, these small website elements that are currently spreading all over the web:

This thing tells Facebook, Google and Twitter that you are visiting the page it’s on.

What happens is this: When you are logged into a service like Facebook or Google (chances are high that you are right now), you have a valid cookie of that service in your browser, so the site will still know who you are if you leave the page and come back later.
Now when you visit a page with a social widget on it, your browser sends a request to Facebook, Google, etc. to get the widget, and it sends your cookie along with it. This basically means that they now know that you, John Doe, are currently looking at website XY. Since these widgets are on practically every website nowadays, the social network services can create detailed personal web usage profiles. What you make of this depends on your level of paranoia, but it’s certainly not the most comforting of thoughts.
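
To make this a bit more concrete, here is a small Python sketch of the kind of request a browser builds when it fetches a widget. It is only an illustration: the URLs and the cookie value are made up, and it assumes the browser sends a Referer header naming the page you are on, which is the usual default. Nothing is actually sent over the network.

```python
from urllib.request import Request


def widget_request(widget_url, visited_page, cookie):
    """Build (but do not send) a request like the one a browser makes for a widget."""
    return Request(
        widget_url,
        headers={
            "Referer": visited_page,  # tells the widget server which page embeds it
            "Cookie": cookie,         # identifies the logged-in user to that server
        },
    )


# All three values are hypothetical placeholders.
req = widget_request(
    "https://social.example/widgets/like-button",
    "https://news.example/some-article",
    "session=abc123",
)

for name, value in req.header_items():
    print(f"{name}: {value}")
```

The widget server receives both headers in the same request, which is all it needs to link the account behind the cookie to the page named in the Referer.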

But wait! There is hope! Thankfully, the internet is not (yet) completely controlled by the data collecting juggernauts. There are a lot of plugins, tools and browser extensions that help you protect your privacy in many ways.
In the remainder of this article, I would like to present one example that I find very inspiring: Disconnect.me.
The reason it’s so inspiring is that one of the founders of this start-up company, Brian Kennish, actually quit his job as a developer at Google so he could make the “Disconnect” Chrome plugin, which keeps your browser from sending your data to all the major data collectors, including Google.

The extension takes only two clicks to install and immediately starts preventing sites from sending your data to Digg, Facebook, Twitter, Google and Yahoo (unless you are on one of their pages). You can easily deactivate and reactivate the blocking of any of the five services. (There are versions for Firefox and Safari as well.)

Now go ahead, try it and enjoy a slightly more private web experience.

PS: If you would like to learn more about the topic, here is a talk by the above-mentioned Brian Kennish:

ThinkPads throttle the CPU when the battery is removed

I was pretty sure that the proper way to treat your laptop battery, if you want it to have a long and fulfilling life, is to remove it from the laptop whenever you are connected to an AC adapter. After all, the lifespan of a laptop battery (or any rechargeable battery) depends largely on the number of charging cycles. If you leave the battery in the computer while it’s connected to the AC adapter, the laptop will drain the battery just a little bit and then recharge it again, over and over. This is true at least for the default setting of Lenovo ThinkPads (as you can read here).

Now let me back up a little: Today I noticed that the performance of one of my current programming projects had dropped significantly overnight. I ran a benchmark last night and the same one this morning without any major changes in between, but got much worse results. So I looked for the cause in the code, reverting my recent changes (which should not affect performance at all, but just to be sure). Just before I started to believe in black magic and question my sanity, I thought maybe the cause isn’t in the code after all and my computer is just somehow slower. So I did a little research and finally noticed something odd in the output of cpufreq-info, identical for all 4 cores:

  analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0 1 2 3
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 800 MHz - 2.30 GHz
  available frequency steps: 2.30 GHz, 2.30 GHz, 2.00 GHz, 1.80 GHz, 1.60 GHz, 1.40 GHz, 1.20 GHz, 1000 MHz, 800 MHz
  available cpufreq governors: powersave, ondemand, performance
  current policy: frequency should be within 800 MHz and 800 MHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz.

What? “Frequency should be within 800 MHz and 800 MHz”? That’s a rather tight margin! Of course this would explain the performance drop, but where does it come from? Changing the governor or setting the frequency manually did not change anything. Some web research showed that I was by far not the only one experiencing the problem and that it can have multiple causes. Explanations include kernel bugs and overheating CPUs. I could rule out overheating and did not want to believe in kernel bugs yet. So I kept looking and finally found an issue affecting Lenovo ThinkPads exclusively:

According to this article from ThinkWiki, the included 65W AC adapter cannot supply enough power when the system is under full load, so if no battery is found, the BIOS sets the CPU frequency to the lowest value and keeps it there. The article also offers a quick fix: add processor.ignore_ppc=1 to your kernel parameters in /boot/grub/menu.lst and everything will be fine. I tried it and yes, it really fixes the problem, but it is not considered safe, as it doesn’t change the fact that the adapter doesn’t supply enough power. So under full load the system could crash, I guess. So far it hasn’t for me.
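
The tell-tale sign in the cpufreq-info output above is a policy whose lower and upper limits coincide. If you want to check for that automatically, a rough Python sketch could look like the following. The function names are my own invention, and the regular expression assumes the policy line is worded exactly as in the output shown above; other versions of the tool may phrase it differently.

```python
import re

# Matches e.g. "frequency should be within 800 MHz and 2.30 GHz"
POLICY_RE = re.compile(
    r"frequency should be within ([\d.]+) ([MG]Hz) and ([\d.]+) ([MG]Hz)"
)


def to_mhz(value, unit):
    """Convert a value printed in MHz or GHz to MHz."""
    return float(value) * (1000 if unit == "GHz" else 1)


def policy_range_mhz(cpufreq_output):
    """Return (lower, upper) policy limits in MHz, or None if no policy line is found."""
    match = POLICY_RE.search(cpufreq_output)
    if match is None:
        return None
    lo_val, lo_unit, hi_val, hi_unit = match.groups()
    return to_mhz(lo_val, lo_unit), to_mhz(hi_val, hi_unit)


def is_pinned(cpufreq_output):
    """True if the policy leaves the governor no room (lower limit == upper limit)."""
    limits = policy_range_mhz(cpufreq_output)
    return limits is not None and limits[0] == limits[1]


sample = "current policy: frequency should be within 800 MHz and 800 MHz."
print(is_pinned(sample))  # prints True
```

In practice you would feed it the real output of cpufreq-info instead of the sample string, for instance from a script that runs before a benchmark.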

So this leaves me with three options:

  1. Use the fix described above and risk the possibility of crashes.
  2. Leave the battery inserted at all times and risk decreasing its lifetime.
  3. Buy a stronger AC adapter. There are 90W and even more powerful ones available (but the 90W is enough for the T420; see this thread)

I have not decided yet, but am probably going with option three, since the extra adapter costs less than 20€.

Anyway, I’m a little angry at Lenovo now. I mean, if they ship an inferior adapter with my laptop, they could at least tell me. (People in this thread argue that it is not inferior, because it is smaller and lighter and thus easier to carry.)

Merging multiple images to one document file

Recently I read an interesting article in a newspaper (you know, printed on paper and such... old school). I wanted to show it to some friends, so I scanned it, which gave me five JPEG files.

Remember those?

To make the reading more convenient for my friends I wondered if there was a way to merge them into one document file, preferably PDF.

The solution I found is no dirty hack and is probably widely known. I, however, was unaware of it until today, and it is so beautifully easy that I wanted to share it with you.

The savior once again is ImageMagick. If you don’t have it installed, it is available via the extra repo. Then simply execute the command:

convert page1.jpg page2.jpg […] pageN.jpg article.pdf

And that’s it. Isn’t it beautiful?
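
If your scans follow a common naming pattern, the same call can be scripted. The following is a minimal Python sketch and not part of the original recipe: the function names, the page*.jpg pattern and the default output name are assumptions for illustration. It only relies on convert taking the input files followed by one output file, as in the command above.

```python
import subprocess
from pathlib import Path


def build_convert_command(pages, output="article.pdf"):
    """Return the argument list: convert <page files...> <output file>."""
    return ["convert", *[str(page) for page in pages], output]


def merge_scans(directory=".", pattern="page*.jpg", output="article.pdf"):
    """Merge all scans matching the pattern, in alphabetical order, into one PDF."""
    pages = sorted(Path(directory).glob(pattern))
    if not pages:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory!r}")
    # Requires ImageMagick's convert on the PATH; raises if the command fails.
    subprocess.run(build_convert_command(pages, output), check=True)


print(build_convert_command(["page1.jpg", "page2.jpg"]))
```

Calling merge_scans() in the directory with the scans then does the actual work. Note that plain alphabetical sorting puts page10.jpg before page2.jpg, so zero-padded names (page01.jpg, page02.jpg, and so on) are safer once you go beyond nine pages.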

Btw: This is only one of a million applications for the ImageMagick package, which is definitely worth having a look at.