Not Saussure

May 28, 2007

Google, privacy and a techie note

Filed under: civil liberties, Internet — notsaussure @ 11:25 am

Via Archrights and Longrider, a Financial Times story from last week about

Google’s ambition to maximise the personal information it holds on users is so great that the search engine envisages a day when it can tell people what jobs to take and how they might spend their days off.

Longrider is quite sanguine about this development, as is the FT editorial; the service is an optional one, they both argue, and if you don’t feel the need to ask Google for suggestions about what to do next weekend or what job you should take, then there’s no need to sign up for it.

Fair enough, though the FT does add the — to my mind, necessary — caveat that

The underlying principle must be informed consent. This means that information should be used only for the purpose for which it was gathered. In general, this should mean that it is not handed over to another organisation without the user’s express say-so. Even if this stance cannot always be maintained – for example, if a government demands information at the time of a security crackdown – then the risk that the data may be passed on in certain circumstances must be made explicit.

I’m always slightly suspicious of data mining, for pretty much the same reasons as the FT. There’s a surprising amount that can be deduced about you from things like supermarket loyalty cards. I’ve worked for a company (albeit on another project) that helps build the software that analyses such data for the supermarkets’ marketing departments, so I know a bit about this. You’d be astonished at how accurate the predictions turn out to be, at least judging by the take-up of personalised offers and vouchers issued on the basis of your purchasing habits, and equally astonished at the third parties to whom this information is sometimes sold, and at what they can do with it.

I mean, if you were a life insurance company wanting to offer the most competitive rates, wouldn’t you be interested in finding out as much as you can about the eating, smoking and drinking habits of a potential customer? I’m not sure if anyone does that — I’d be a bit surprised, though, if an individual supermarket’s marketing department and its financial services department (if it has one) don’t share information. And certainly HM Revenue & Customs have the power — which they use — to check on people’s spending; if you’re a self-employed painter and decorator who’s being a bit remiss with your VAT, I really wouldn’t advise using a loyalty card at one of the big DIY stores, for example.

Anyway, as I say, I’m a bit distrustful of data retention on principle, and this extends to data retained by search engines. People will recall, no doubt, the embarrassment caused to both AOL and its customers — rather more embarrassing for the customers, I think — when, last August,

AOL’s publication of the search histories of more than 650,000 of its users […] yielded more than just one of the year’s bigger privacy scandals. The 21 million search queries also have exposed an innumerable number of life stories ranging from the mundane to the illicit and bizarre.

While users weren’t identified by name, they were given unique user numbers, so, for example, you can find out that

Based on the number of local searches, AOL user 1515830 appears to be a resident of Ohio’s Mahoning County.

and I’m willing to bet that, when she was conducting various searches on March 9 of last year, AOL user 1515830 didn’t expect them to be made public and was justifiably furious when they were (read the CNET story and see if you don’t agree).

For a bit of light relief, turn to the explanations of some of his Google searches the inestimable Jon Swift thought it necessary to provide when he discovered in January 2006 that Google was

fighting a subpoena from the Bush Administration to turn over its data on searches in order to defend the Internet Child Protection Act. Of course, I support whatever the Bush Administration thinks it needs to do to protect children from the Internet and think Google should surrender this data immediately. However, I was looking at the record of Google searches I have done and am worried that there might be some misunderstandings when these searches are seen out of context. So in case Google does lose its case, I would like to take this opportunity to explain some of the searches I did so that no one in the Justice Department gets the wrong idea. As you can see there are innocent explanations for all of them

My worries about this sort of thing were hardly assuaged when I read in today’s Register that

Google has faced down one European probe into what it does with people’s personal information, only to be challenged with another. Last October, privacy watchdogs in Norway, which is not part of the European Union but has identical data protection laws, asked Google to justify why it retains people’s search histories for up to two years. Google refused to co-operate.

Now the Article 29 Working Party, which advises the Justice Directorate of the EC, has asked Google to bring its business practices into line with European data protection law so that it gives due respect to people’s privacy.

The article continues,

The Register understands that Google has been the cause of anxiety among members of the A29 Working Party for some years. Their members, who include representatives of national EU privacy watchdogs, are not pleased about how long it keeps information. The Norwegians were also concerned that Google might be using its data stores to create profiles of people’s lives. This was one question Google refused to answer. Leif Aanensen, deputy director general of the Norwegian Office of the Data Inspectorate, told The Register that it had effectively put its Google probe on ice after the data giant refused to accept that it came under Norwegian jurisdiction.

“We are not satisfied,” he said. “We didn’t get the proper answers.”

“Our main issue was their data retention policy and the use of the data they stored. We asked them what they were doing with the personal data – are you creating profiles – they didn’t answer,” he said.

Anyway, if, like me, you are a bit concerned about this sort of thing, you might like to know there’s a Firefox add-on called TrackMeNot. As the project’s home page explains, along with a lot of rather disturbing background about what the US government is doing, or trying to do, with search engine queries,

TrackMeNot runs in Firefox as a low-priority background process that periodically issues randomized search-queries to popular search engines, e.g., AOL, Yahoo!, Google, and MSN. It hides users’ actual search trails in a cloud of ‘ghost’ queries, significantly increasing the difficulty of aggregating such data into accurate or identifying user profiles.
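For the techies, here is a rough Python sketch of the general idea described above: make up plausible-looking queries from a word list and space them out at random intervals. It is emphatically not TrackMeNot's actual code. The word list, the `ghost_query` and `ghost_schedule` helpers and the `.example` search URLs are all my own assumptions for illustration, and the sketch only builds the URLs rather than sending anything.

```python
import random
from urllib.parse import quote_plus

# Hypothetical seed vocabulary; a real tool would want a much larger and
# regularly refreshed list so the decoys are harder to filter out.
SEED_WORDS = ["weather", "recipe", "football", "trains",
              "garden", "cinema", "holiday", "news"]

# Illustrative URL templates only (reserved .example domains),
# not the endpoints of any real search engine.
ENGINES = {
    "example-a": "https://search-a.example/?q={query}",
    "example-b": "https://search-b.example/?q={query}",
}


def ghost_query(rng, max_words=3):
    """Build one decoy query from one to max_words distinct seed words."""
    count = rng.randint(1, max_words)
    return " ".join(rng.sample(SEED_WORDS, count))


def ghost_schedule(rng, n, mean_gap_seconds=60.0):
    """Return n (delay_seconds, engine_name, url) tuples.

    Delays are drawn from an exponential distribution so the decoys do
    not arrive at suspiciously regular intervals.
    """
    schedule = []
    for _ in range(n):
        delay = rng.expovariate(1.0 / mean_gap_seconds)
        engine = rng.choice(sorted(ENGINES))
        url = ENGINES[engine].format(query=quote_plus(ghost_query(rng)))
        schedule.append((delay, engine, url))
    return schedule


if __name__ == "__main__":
    # Print a short plan; nothing is actually fetched.
    for delay, engine, url in ghost_schedule(random.Random(), 5):
        print(f"wait {delay:6.1f}s, then query {engine}: {url}")
```

The point of the random spacing and random choice of engine is simply that real searches end up buried among decoys that are not trivially distinguishable by timing alone; how well that works against a determined profiler is another question.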

6 Comments »

  1. Actually, I share your concerns about data mining in general. However, I make a point of being circumspect about what I share and with whom. For that reason, I don’t have a loyalty card, regularly clear out my cookies, have TrackMeNot installed and either don’t do surveys or, if I’m feeling mischievous, lie to them.

    The difference being – I can do this with impunity. The government enforcing such data gathering is another matter entirely…

    Comment by Longrider — May 28, 2007 @ 12:05 pm

  2. Sure; I rather assumed that was the case. A lot of people, though, just don’t realise the implications of letting their data be stored, and I must admit even I was a bit surprised that ‘don’t be evil’ Google are so unwilling to comment to the Norwegians and the EU on whether or not they’re using searches (as opposed to that opt-in service you and the FT were talking about) to profile people and are, apparently, so unwilling to comply with local data protection laws.

    ‘We don’t accept Norwegian jurisdiction’ indeed! They accept Chinese jurisdiction readily enough, though China is, of course, a far larger market than is Norway.

    That’s why I thought giving TrackMeNot a plug was worthwhile.

    Comment by notsaussure — May 28, 2007 @ 12:22 pm

  3. Perhaps the Article 29 Working Party needs to also turn its attention to Tesco’s, who currently run the world’s largest people profiling database.

    Tesco have refused to comply with the UK Data Protection Act, and also refuse to respond to personal DPA requests for disclosure on what information they hold.

    Tesco DOES share its information with other commercial organisations such as Orange and BSkyB as well as HMRC.

    Comment by IanP — May 28, 2007 @ 12:59 pm

  4. See information re Tesco’s database here.

    http://business.guardian.co.uk/story/0,3604,1573821,00.html

    Comment by IanP — May 28, 2007 @ 1:01 pm

  5. I’m surprised that the UK’s biggest supermarket doesn’t just ask me what I would like. That seems simpler.

    For a start, I like straight money back. The mucking about with trying to translate tokens into other deals worked for a trip to Chessington theme park, but getting it together was a fag and I wouldn’t want to repeat it with all the exemptions and exclusions which are in the small print.

    Secondly, I don’t like these obscure products I keep being offered fairy points for. If I had wanted the product, I would have picked it up. All the points do is to make me feel vaguely annoyed and guilty that I can’t match them to the product. It is a puzzle I don’t understand. They do not delight me. They irritate me.

    Thirdly, a proper analysis of my shopping would show that I do what every other indecisive person does. We eat whatever is on yellow sticker. Don’t care what that is so long as I do not have to think about it.

    The only time I came seriously unstuck on this procedure was with some stuff called dulse, which is French seaweed and seemed to be a healthy snack-treat for 10p. It stank and was as chewy as fisherman’s socks. Even I couldn’t eat that, although on reflection I should perhaps have cooked it or used it in a footbath.

    Just put whatever you want to shift on yellow sticker or bogof, and I’ll probably have a punt on it. I don’t even read half of what the computer reels out; I’m too busy trying to remember to recycle carrier bags.

    Oh, wait. I’ve just had a thought. Is the random logic of the yellow sticker why I keep being offered fairy points for buying tinned yak hooves and anchovy-lined flip-flops?

    But the seriousness of your piece was about the dissemination of personal data. At present, anybody who has a child already has their data threaded through umpteen government systems because sending child data automatically encompasses data about the parent.

    The NHS central files (if it ever comes off) will increase this so the really difficult things – the abortions, the breakdowns, the life expectancy – will be more or less public in perpetuity. Add that to transport data, employment data and the three or four registers for people convicted of certain crimes, or suspected of certain crimes, or just PNG, and the land registry, electoral register and council information, the census, plus anybody who knows me such as my hairdresser, and I’m asking: ‘What privacy would that be?’ It has already gone.

    That Google knows my searches is almost irrelevant; any agent of the state who wishes to cook up a case against me can probably do it from the state data alone. Any less privacy and I’d actually be famous.

    What does bother me is asymmetry of data. In the recent MTAS debacle, junior doctors were annoyed at their personal data being open to every prying busybody.

    I do sympathize, but welcome to my world.

    Comment by V Samuel — May 28, 2007 @ 3:03 pm

  6. Hmmm… I’m not fond of the idea of cluttering the internets with more worthless data pinging back and forth. Why not use a proxy such as http://www.blackboxsearch.com – they proxy your searches anonymously and don’t collect any tracking data whatsoever.

    That, however, only takes care of data mining of your searches for financial gain. Matt Cutts (search guru, Google employee) makes a convincing case(*) that your ISP will have a more detailed picture of your online activities than the online search queries you have made, plus a far greater possibility of positive identification / address etc. And so it is that your ISP will be the first port of call prior to law enforcement battering down your door.

    * http://www.mattcutts.com/blog/google-and-privacy/

    Comment by countd — May 29, 2007 @ 8:59 am

