Leaving Google Behind: Progress Report

Google, I’m Leaving You.

Somewhere over five years ago, I gratefully accepted an invite to Gmail and rejoiced: it was a wonderful new paradigm in web-based email, and a huge improvement over Yahoo Mail. It’s still one of the best email services online, and still miles ahead of the nearest competition by number of users.

At the time, it was a straightforward social contract; Google would host and provide a great email service, and in exchange, non-human agents (robots!) would scan email in real-time for keywords, and provide ads in real time based on their inferences. This, I thought, and still feel, is pretty fair for such a great free service.

Somewhere along the line, the contract was compromised in innumerable ways. Firstly (but not by importance to me) it seems the “in real time” part is gone. That is, the comfort of knowing (or thinking) that results of algorithmic scanning were not stored or logged is now gone. It’s generally accepted that Gmail is part of a greater profile-building apparatus built into the Google account suite, and as such some content of my private life is entering the public sphere and being sold or revealed to people I don’t know or trust.

More importantly perhaps than Google’s slow abandonment of its “don’t be evil” mantra is the increasing invasiveness of the American Government’s “Be as Evil as Possible” policy. Google provides largely unfettered access to user data and accounts to the various gestapo agencies of the US intelligence and law enforcement apparatus, who form their own profiles on people. There is a mountain of evidence that due process is often ignored and there is more often than not no legally relevant reason behind invasions of this sort; anything from casual curiosity to “watch this dissident” reasoning can be applied under the PATRIOT Act and its cousins, when the law is invoked at all. Worst of all, Google don’t notify account holders of these invasions even when they are legally capable of doing so.

I don’t know about you, but I am not too happy about having faceless agents from the world’s biggest kidnapping agency reading through my email. It’s not a matter of “I’ve got something to hide”, the most tired straw-man in the privacy-hostile person’s arsenal. If you ask someone whether they’d happily omit the envelope on their snail-mail, even if there’s nothing illegal inside, most people might balk; why let all the guys in between read my soppy I-love-you-mum letter? And yet that’s what we routinely do these days with email and social networks.

Count me out. There’s no reason why I can’t enjoy all the fruits of modern internetting without sacrificing a bit of myself to the police state.

So, I’m making a transition away from Google and toward personal email hosting. It’s going to be an interesting experiment, and I’m not going to dive into the deep end immediately with something so important. The first step is getting all my data from Google so I can safely archive it; that’s several gigabytes of email and attachments, so it’s taking a while. Here’s how it’s going so far.

Leaving Gmail with Archives Intact

So far, getting my email has been the hard part. After the continuation of the infamous “nymwars” debacle on Google+, I decided to ditch that service; at least with “Google Takeout” it was easy to back up all the content I’d put up on that service before hollowing out the profile.

However, it’s hard to be sure that suspending Google+ won’t cripple or ruin the rest of the account; after all, “name violations” on Google+ have led to people losing access to their entire Gmail account, and the “Delete my profile” apparatus doesn’t make it clear or certain that my general account will be spared.

Unfortunately my Email is sort of a personal archive or cloud-storage thing for me, so backing it up is important but also awkward. I decided to go down a trustworthy hacker-friendly command-line route, because I’m a nerd like that, but I’m starting out with the easiest solution: Thunderbird. Using the Mozilla Foundation’s Open-Source email client, I’m downloading all of my email and using the filtering system in Thunderbird to apply yearly archiving tags to my email. Oddly enough, I’m doing this because the built-in search engine in Gmail seems to be broken (of all companies to botch a search feature..) and won’t let me search/label by date no matter which format I use.

Once I have all of my email reliably labelled by year, I’ll be using “Getmail” to download the email year-by-year. Getmail allows you to save email either as a “maildir” (a set of folders full of individual files for each email) or as a giant file containing everything. I’ll be going with the former. There’s a great writeup on how to use getmail: be sure to read the whole article and the comments if you’re patient enough, because there’s lots of pro-tips and debugging stuff there.
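Once a maildir has landed on disk, it’s worth a quick sanity check that each year’s worth of email actually arrived. Here’s a minimal sketch using the “mailbox” module from Python’s standard library; the function name and the idea of counting by the “Date” header are my own assumptions for illustration, not something getmail provides.

```python
import mailbox
from collections import Counter
from email.utils import parsedate_to_datetime


def count_messages_by_year(maildir_path):
    """Return a Counter mapping year -> number of messages in a maildir.

    Messages with a missing or unparseable Date header are counted
    under the key None.
    """
    counts = Counter()
    # create=False: fail loudly on a mistyped path instead of silently
    # creating an empty maildir there.
    box = mailbox.Maildir(maildir_path, create=False)
    for message in box:
        raw_date = message.get("Date")
        try:
            year = parsedate_to_datetime(str(raw_date)).year
        except (TypeError, ValueError):
            # Different Python versions raise different errors on bad dates.
            year = None
        counts[year] += 1
    return counts
```

Calling count_messages_by_year on the downloaded folder and comparing the totals against what Gmail reports per label should show up any year that came down short.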

One odd pitfall I hit was in Contact Export/Import: Gmail can export all contacts as a “Comma Separated Values” file, which is great. However, three things happen when you try to import to Thunderbird:

  1. Not all of Thunderbird’s potential fields match the output (Thunderbird has no “Middle Name” field, for example, while Gmail uses it liberally), leaving you with a soup of potential assignments of key data, few of which are perfect.
  2. The interface to actually match value-to-value is awful; one list can have items shuffled, but because items shove each other down the list as they are moved you can only reasonably do this from the top-down of the other column. As mentioned above, not all potential fields match, and there are oodles of redundant fields, forcing you to “plug” gaps (that is, stupid fields in between fields you actually want to import) with matching fields that you’re not going to use.
  3. When you actually import contacts, all name information is (if matched correctly) neatly stored in each contact, and then ignored when it comes to providing an actual name in the contacts list. Instead, the contacts window just axes off everything after the “@” symbol in the email provided, and uses that as a name. Mind-numbing stupidity.

To remedy this stupidity, I opened the .csv file in LibreOffice and moved around data that couldn’t import correctly (I merged “middle name” into either First or Last name as appropriate, which was labour intensive), deleted all empty columns, moved miscellaneous data into “notes” column, and finally I copied the “First Name” column twice; the two copies were named “Nickname” and “Display Name”, and were imported to Thunderbird as same. Since Thunderbird allows you to display “nickname” and sort by that, I was able to display at least the first names of everyone in the Contacts list. Victory! Remember to save that hacked .csv file so you can import it into other instances of Thunderbird or similar at a later date.
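For anyone who’d rather script the spreadsheet surgery above, here’s a rough sketch of the same idea using Python’s built-in “csv” module. The column names (“First Name”, “Middle Name”, “Last Name”) and the UTF-8 encoding are assumptions, so check the first line of your own export before trusting it. It also takes a cruder approach than I did by hand: the middle name is always appended to the first name, and nothing is moved into a “notes” column.

```python
import csv

# Assumed header names; adjust to match your own export file.
FIRST, MIDDLE = "First Name", "Middle Name"


def prepare_contacts(rows):
    """Yield contact dicts with the middle name merged into the first name
    and the first name copied into Nickname and Display Name."""
    for row in rows:
        row = dict(row)
        first = (row.get(FIRST) or "").strip()
        middle = (row.pop(MIDDLE, "") or "").strip()
        if middle:
            # Crude rule: always append the middle name to the first name.
            first = f"{first} {middle}".strip()
        row[FIRST] = first
        row["Nickname"] = first
        row["Display Name"] = first
        yield row


def convert(src_path, dst_path):
    """Read a contacts CSV, clean it up, and write the result."""
    with open(src_path, newline="", encoding="utf-8") as src:
        rows = list(prepare_contacts(csv.DictReader(src)))
    if not rows:
        return
    # Keep only named columns that hold a value in at least one row.
    fieldnames = [
        name for name in rows[0]
        if name is not None and any(r.get(name) for r in rows)
    ]
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
```

Running convert("contacts.csv", "contacts-fixed.csv") (both filenames hypothetical) leaves the original export untouched, which is handy if the merge rule mangles a few names and you need to fix those by hand.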

Once I’ve got all my email and all my attachments safely downloaded, I’ll be purging my entire account up until the last few months, and that’ll be “stage 1” complete in my mind. I’m planning to archive all of the past email data in a Truecrypt file which I can keep safe by redundancy (i.e. copying to CDs etc) without worrying about it falling into snooping hands.

When I get my next Email set up and running, I’ll set up a Gmail redirect and autoreply to inform people of the switch, and begin the migration. People imagine email migration to be extremely difficult, but I’ve done it a few times; in reality, most of the people who actually matter will email you at least once a season, and they’ll quickly change the email they use when they get autoreply’d a few times.

What Then?

Leaving Google might seem a drastic move.. indeed, I’m not actually planning to delete the entire account. After all, the Android Marketplace regrettably requires a Google account, and Google Wallet is pretty handy too. For viewing shared documents on Google Docs I’ll need an account too. However, Google will no longer be a central part of my internet experience.

Indeed, I’m generally going to be trying to keep my online behaviour from now on as close to the chest (i.e. Not In America) as I can without making compromises on my mobility and user power. With the amazing software that’s available in the Open Source sphere, I can start hosting a lot of the sort of services I used to rely on Google or similar for, using my own hardware.

To avoid search bubbling and search tracking, I’ll be switching to the far richer and more user-friendly DuckDuckGo.com. More broadly, I’ve lately been thinking that the death of links-pages and webrings was a dangerous dependence-inducing mistake for online culture, but that discussion is for another blog post. There are hints of crowdsourced-webcrawling search engines in the works here and there, which would be very interesting if true; yet another potential application of idle processor time for net users worldwide would be to help aggregate a map of the internet. More interesting still, perhaps, would be to crowdsource surfing data, anonymised and aggregated from thousands to millions of users, to form a map of the web with keywords and surfing associations intact for indexing. But, that’s not my job or immediate concern as long as I can find stuff with good accuracy, minimal algorithmic interference (“Hey, you’re from Cork and you like Open Source Stuff, why don’t I just omit key results to make you happier?”) and in good time.

Hosting

As much as I love my current web-host (ixwebhosting.com - you’ll like them if you’re in the market for a personal website, I promise!), I am soon going to investigate local alternatives for Domain Name hosting and online storage space for my sites. This isn’t simply because Ix are an American company (although that figures in), it’s also because I want to upgrade to a service that gives me command-line access to a virtual machine, to host services like OwnCloud or Diaspora that need more intensive attention on the setup side of things.

Storage/Documents

I’m planning to get OwnCloud running on my own personal server and host it online through a dedicated domain name or alias of cunningprojects. OwnCloud is slated to include a document editor which might nicely replace Google Docs, already has a built-in music player, and can be synchronised with folders on my computers or Android devices to perfectly mimic the functionality of Dropbox. It’s also got a really pretty web interface, and I’ll be able to give friends and family their own accounts if they want, too. If it’s not enough for Document management, I’ll be waiting eagerly for a plugin to fill that gap.

Social

I’ve already moved to Diaspora*, and I invite anyone who’d like to connect with me there to do so. I can’t guarantee a follow-back, but that doesn’t mean we’re not friends; just that we don’t necessarily share online interests! When Diaspora provide functionality for account-migration, I may decide to join a local pod, perhaps one hosted at the local Hackerspace in Cork. Also, I’m staying with Twitter for now. For one thing, their Corporate Culture hasn’t soured yet, and they seem to do the right thing generally; they alert users to government prying (or did once, at any rate), they tread carefully around marketing by labelling it opaquely, etc. Main reason I’m staying with Twitter for now is simply that Twitter is for things I don’t mind shouting aloud for all to hear, so USA prying into my account is unlikely to yield anything that would bother me if revealed. I will be recommending that friends/contacts stop PMing me, however, and use email instead.

Email

The main event, as it were, is Email. Initially I will probably not be switching entirely to local email hosting on my own computer; there’s a minefield that I must become acquainted with when it comes to single-user email hosting because of the complex web of anti-spam out there. Essentially, I’m concerned that without the vouchsafing of Google or a similarly huge organisation, my email may end up filtered by default by most recipients. However, if that could be easily avoided, then I’d love to try hosting my own email server and expanding it into a rich personal service using Open Source webware. With RoundCube, I could have a pretty and reliable webmail interface, and with IMAP support I can continue to use email on my phone with trivial ease. For built-in-chat functionality, you can actually continue to use Google Chat using any chat client, and I’m certain there’s a pretty Open Source webchat client I can use, too.