29 January

The Key to Snapchat’s Profitability: It’s Dirt Cheap to Run

Note: this article was published in Wired. Check out my handy service storage and bandwidth calculator.

Ever since Snapchat turned down a $3 billion all-cash offer from Facebook this past November, there’s been no shortage of discussion about it and the rest of its photo-sharing-and-messaging service cohort, including WhatsApp, Kik, Japan-based LINE, China-based WeChat, and Korea-based Kakao Talk. Explanations for this phenomenon have ranged from the need to redefine identity for the social-mobile era to the rise of ephemeral, disposable media.

Regardless of why this trend is taking off, however, it’s clear that the so-called messaging “wars” are heating up. As always, the euphoria over “hockey-stick” user growth numbers is beginning to give way to the sobriety of analysis, yielding the inevitable question: can they monetize? Snapchat, with its massive (paper) valuation, is at the vanguard of such criticism, especially given the irony that the service is essentially deleting its biggest asset.

So, how can Snapchat effectively monetize without its user data? By operating its service an order of magnitude cheaper than its competitors.

Surprisingly little time has been spent examining how one can rethink a storage-centric infrastructure model for disappearing data. This isn’t just relevant to engineers; it has important implications for helping services like Snapchat save — and therefore make — money. (By the way, that would need to be about $500 million in revenue and $200 million in profit to justify its $3 billion valuation in November.)
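Backing out the multiples implied by those numbers (the 15x earnings multiple and 40% margin below are derived from the figures in the parenthetical, not stated anywhere in the article):

```python
# Numbers from the paragraph above.
valuation = 3e9        # Facebook's November all-cash offer
profit = 2e8           # profit needed to justify it
revenue = 5e8          # revenue needed to justify it

earnings_multiple = valuation / profit   # implied price-to-earnings multiple
profit_margin = profit / revenue         # implied profit margin
```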

It’s very simple: If the appeal of services like Snapchat is in the photos (“the fuel that social networks run on”), then the costs are in operating that photo sharing-and-serving service, as well as in running any monetization — such as ads — built on top of that infrastructure. But I’d even go so far as to argue that making use of advanced infrastructure protocols could let Snapchat get away with paying almost no bandwidth costs for a large subset of media.

How? Well, let’s begin by comparing Snapchat’s infrastructure to that of a more traditional social network: its erstwhile suitor, Facebook.

According to publicly available data, Facebook users upload 350 million images a day. Back when users were adding 220 million photos weekly in 2009, the company was serving upwards of 550,000 images per second at peak — and they did it by storing five copies of each image, downsampled to various levels, in a photo storing-and-serving infrastructure called Haystack. (For obvious reasons, the exact architecture of these systems is not known.)

That gives you a sense of the scope of the infrastructure. But the salient detail here is the total cost of this serving-and-storage — including all-in per-byte cost of bandwidth — which I estimate to be more than $400 million a year.

If you want the details or to play around on your own, here’s a handy service storage and bandwidth calculator.  As a quick summary, here’s what went into my calculation, which also includes ancillary costs such as power, capital for servers, human maintenance, and redundancy. The most important variables in my cost calculation are:

  • the number of images/videos uploaded each month (estimated at ~ 400M photos daily)

  • the size of each image/video (estimated at 3MB)

  • the average number of images/videos served each month (estimated at 9.5% of all images)

  • all-in per-byte bandwidth/serving cost (estimated at $5 × 10⁻¹¹)

  • all-in per-byte storage cost (estimated at $5 × 10⁻¹¹)

  • exponential growth rate coefficient (r, estimated at ~ 0.076, using P(t) = P0·e^(rt)).

To compare Facebook’s costs to Snapchat’s, however, we also have to include these variables: the mean number of recipients of each Snapchat message (estimated very conservatively at 2.5); and the fraction of total messages that are undelivered (estimated at 10%).
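As a sketch, the cost model described above can be written down directly. The constants are the estimates just listed; note that this toy model starts from an empty archive, so it understates the absolute numbers for a service like Facebook, which carries years of backlog:

```python
import math

# Estimates from the list above (assumptions, not measured values).
DAILY_UPLOADS = 400e6        # photos uploaded per day
ITEM_BYTES = 3e6             # ~3 MB per photo/video
SERVED_FRACTION = 0.095      # fraction of the stored corpus served per month
BW_COST = 5e-11              # all-in $/byte served
STORE_COST = 5e-11           # all-in $/byte stored, per month
R = 0.076                    # monthly exponential growth coefficient

def history_cost(months=12):
    """Service-with-history (Facebook-style): everything stays stored."""
    total = stored = 0.0
    for t in range(months):
        stored += DAILY_UPLOADS * 30 * ITEM_BYTES * math.exp(R * t)
        total += stored * STORE_COST                  # keep the whole corpus
        total += stored * SERVED_FRACTION * BW_COST   # serve a slice of it
    return total

def ephemeral_cost(months=12, fanout=2.5, undelivered=0.10):
    """Ephemeral (Snapchat-style): deliver once, keep only stragglers."""
    total = stored = 0.0
    for t in range(months):
        uploaded = DAILY_UPLOADS * 30 * ITEM_BYTES * math.exp(R * t)
        total += uploaded * fanout * BW_COST          # one delivery per recipient
        stored += uploaded * undelivered              # undelivered messages linger
        total += stored * STORE_COST
    return total
```

The gap the model exposes is structural: the history-based service pays to keep (and re-serve) an ever-growing corpus, while the ephemeral one mostly pays for a single fan-out delivery of each message.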

Obviously, we are comparing a much larger service that has advertising — Facebook — to one that is much smaller in scope and doesn’t have any advertising (yet). But I’d argue that this doesn’t really matter, in principle. Because even though Facebook has to make sure its infrastructure can store and serve the data needed to sell ads, the reality is that much of the information that helps advertisers target users is the metadata of user interactions — with whom, where, how, and when (as well as what they ‘like’) — as opposed to the content of what those users are actually saying.

This means that despite their differences, storing and analyzing only the metadata would still allow Snapchat to build similar profiles of its users. This would allow Snapchat to sell ads that target users just as Facebook does (assuming of course that their product can attract a consistent customer base) — and with one huge advantage: lower costs, since Snapchat doesn’t need to store or serve any messages after they’ve been delivered.

This kind of approach to user targeting, with its metadata-centric infrastructure and associated cost savings, is by no means unique to Snapchat. The public revelations about the NSA’s surveillance operations point to a similar architecture: storing the entire content of all intercepted communication would be prohibitive in terms of cost and space, but not so for metadata. In fact, the way the metadata is (theoretically) used to target whatever individuals and groups NSA agents deem to be a threat is not dissimilar to how advertising targeting works. But that’s a separate concern.

What makes Facebook’s (and any other traditional social network’s) photo-serving costs so expensive is having to keep data in a high-availability, low-latency, redundant, multi-master data store that can withstand temporary spikes in traffic load. But much of this expense is unnecessary for storing and processing metadata. Based on some additional assumptions (such as the number of recipients of each message), we can estimate that, even if their per-byte storage costs were 5x higher, Snapchat would only need to pay $35 million a year (under 9% of Facebook’s total estimated infrastructure costs) to handle a similar load — all while accruing a trove of data with similar targeting value.

It’s like getting a mile when you’re only giving an inch.

How could Snapchat reduce their bandwidth and storage costs even further? The key, again, is in the seemingly mundane: infrastructure. There are a number of more complicated optimizations that could make the system even cheaper to operate. For example, Snapchats between parties that are concurrently online could be delivered via peer-to-peer messaging (think Skype). Because these messages would never even flow over Snapchat’s network, delivery costs would drop to nearly nothing. Firewalls are an impediment, of course, but a number of solutions, including proxy servers at the edge of the network or ICE (RFC 5245), could make this doable relatively soon. Snapchat could even store encrypted, undelivered messages on other users’ phones, ensuring availability by using erasure coding with sufficient redundancy. (This means that they could split your media up into many overlapping pieces, only a few of which are needed to reconstitute the entire picture/movie. Each piece would be given to a different user, encrypted so that no one other than the recipient could glean any information about the data, and replicated so that, with high probability, enough users will be online at any time to reconstruct it.) While it’s hard to guess what fraction of messages are exchanged between parties that are both online, the impact of such a design could be substantial.
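The erasure-coding idea can be made concrete with a toy scheme: split the media into two halves plus one XOR parity piece, so that any two of the three pieces reconstruct the whole. (A real deployment would use a proper erasure code such as Reed–Solomon, with many pieces and per-recipient encryption; both are omitted here.)

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes):
    """Split data into two halves plus an XOR parity piece.

    Any 2 of the 3 pieces suffice to rebuild the original.
    """
    assert len(data) % 2 == 0, "pad to an even length first"
    half = len(data) // 2
    d1, d2 = data[:half], data[half:]
    return [d1, d2, xor(d1, d2)]

def decode(pieces):
    """Reconstruct the data from pieces, where at most one entry is None."""
    d1, d2, parity = pieces
    if d1 is None:
        d1 = xor(d2, parity)   # recover the first half from the parity
    if d2 is None:
        d2 = xor(d1, parity)   # recover the second half from the parity
    return d1 + d2
```

With n pieces and k-of-n recovery, the probability that enough holders are online can be tuned by choosing the redundancy level, which is exactly the availability argument made above.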

Because they don’t have to store and serve large amounts of content, a new generation of messaging services can use cost-effective infrastructure to operate an order of magnitude more cheaply than the Facebooks of the world. By storing only the metadata of interactions, they can still target users effectively and monetize these systems. The only question that remains is whether they can make a compelling enough product to keep users coming back for more.

29 January

Service storage and bandwidth cost calculator

This page allows you to simulate the costs of storage and bandwidth for a simple web service like Facebook or Snapchat. For more info on why this might be interesting, please read my post about infrastructure in ephemeral networks.

The defaults are fit using publicly available data about Facebook from between 2009 and 2012. The system is assumed to consist of a set of people who upload media (photos, for this analysis). To model growth, I assume that the number of items uploaded grows according to an exponential function.

These media are then assumed to be consumed by other users of the system. For services-with-history (e.g. FB), I model the peak and average QPS of data as a fraction of the total amount of data cumulatively stored. For ephemeral networks (e.g. Kik, Snapchat), I model storage and bandwidth by estimating the fanout of each message and the number of messages that are never received (leftover messages are assumed to be stored forever).

Storage and bandwidth costs are estimated based on costs on EC2 and other similar systems.

Feel free to play around with the numbers.

The code is available on github: Web service storage cost simulator.

Questions or comments? Use twitter to reach me: @vijayp (Vijay Pandurangan)

16 January

Update / Solution for broken Android calendar syncing

In my last post, I described how Android’s calendar syncing was broken for me. I noticed that my calendar on my phone was out of date, and when I manually refreshed, I’d get a force-close error.

After downloading the Android source, figuring out how to build it, and playing with it on the emulator and my device for some time, I have figured out what the problem is, and I have a work-around for it. Essentially, some repeated events can have a start date Android is unhappy with (I believe it’s due to a start time of UTC 0). This causes an Android core library to throw a TimeFormatException which is never properly handled, preempting syncing. This is a pretty big bug — that exception should be caught by Google’s common calendar code, but instead it propagates uncaught. (This is because of the misuse of unchecked exceptions — android.util.TimeFormatException inherits from RuntimeException for no good reason at all that I can see. Checked exceptions are one of the best features of Java, and inheriting from RuntimeException for things that should be handled is a really bad idea, IMO.)

Here is the text of the item that was breaking my calendar syncing:


This was in the private url for my feed. You can see yours here:
https://www.google.com/calendar/feeds/USER_NAME%40gmail.com/private/full. I think this event was added by Outlook somehow, but I’m not really sure. The web UI and other clients have no problem dealing with this event, but Android’s date parser is unhappy with it. If you’re seeing repeated calendar syncing crashes, go to the above url, replace USER_NAME with your user id, and see if you have something similar to this string. If so, deleting that event ought to fix syncing.
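If you’d rather not eyeball the feed by hand, here is a rough scanner for the exported iCal text. The epoch-zero cutoff is my guess at the kind of DTSTART that trips Android’s parser; adjust it if your broken event looks different:

```python
import re

def find_suspect_events(ics_text: str):
    """Flag DTSTART values at or before the UTC epoch.

    The cutoff is an assumption about what Android's date parser
    rejects; the web UI and other clients accept these events fine.
    """
    suspects = []
    for line in ics_text.splitlines():
        m = re.match(r"DTSTART[^:]*:(\d{8}T\d{6}Z?)", line.strip())
        if m and m.group(1)[:4] <= "1970":
            suspects.append(line.strip())
    return suspects
```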

How Google should fix this

If someone on Android or Calendar is reading this, there are two ways this should be fixed. Please do both of them!

  1. Fix Android to handle these errors gracefully. I patched the provider code to fix this bug. Someone should fix this, and include it in the next ICS update. Here’s the diff:

    vijayp@thecoon:/mnt/largelinux/bigfiles/as2/frameworks/opt/calendar/src/com/android/calendarcommon$ git diff -w
    diff --git a/src/com/android/calendarcommon/RecurrenceSet.java b/src/com/android/calendarcommon/RecurrenceSet.java
    index 3b91a1d..8e1117e 100644
    --- a/src/com/android/calendarcommon/RecurrenceSet.java
    +++ b/src/com/android/calendarcommon/RecurrenceSet.java
    @@ -178,6 +178,7 @@ public class RecurrenceSet {
    public static boolean populateContentValues(ICalendar.Component component,
    ContentValues values) {
    + try {
    ICalendar.Property dtstartProperty =
    String dtstart = dtstartProperty.getValue();
    @@ -233,6 +234,11 @@ public class RecurrenceSet {
    values.put(CalendarContract.Events.DURATION, duration);
    values.put(CalendarContract.Events.ALL_DAY, allDay ? 1 : 0);
    return true;
    + } catch (TimeFormatException e) {
    + // This happens when the data is out of range.
    + Log.i(TAG, "BAD data: " + component.toString());
    + return false;
    + }

  2. Patch the calendar FE server to remove things that break Android. Fixing Android is the correct solution, because it’s unclear that the data it is passing are actually bad. But since the Calendar Frontend can be fixed in a few days, and it might take months (or years!) to get carriers to agree to roll out an Android update, it’s best to just patch the Calendar FE to filter out data that might cause Android to crash. It can even be enabled based on the user agent.

Anyway, I really hope someone at Google reads and fixes this. I spent a lot of unnecessary time tracking this down!

14 November

Migrated Partychat rooms and Google Apps domains

Due to App Engine cost changes, I’ve been working with the partychat folks to migrate our services to a new domain (new rooms are channel@im.partych.at).

We’re seeing a lot of people who are using accounts on Google Apps domains having difficulty connecting to the new Partychat services.

Simple solutions

If you are using a Google Apps domain, these instructions (from Google) will help you get partychat working again. This will require help from someone with access to your domain settings (probably a system administrator).

If you don’t have access to DNS records, or can’t find someone who does, you will have to use a @gmail.com account instead.

Technical Details

Every domain needs to have an SRV DNS record to tell other XMPP servers where to connect (if the bare domain has no record). The SRV record’s name should be “_xmpp-server._tcp.domain.com.” This doesn’t just affect partychat; it prevents most people on non-Google third-party domains from being able to talk to you.

You can check if your server has one by executing the following (change mydomain.com to the name of your domain):

vijayp@ike:~/src$ nslookup
> set q=SRV
> _xmpp-server._tcp.mydomain.com

** server can't find _xmpp-server._tcp.mydomain.com: NXDOMAIN

As you can see, mydomain.com doesn’t have a record, so our servers don’t know where to send your chat messages. Here is an example of a properly configured domain:

vijayp@ike:~/src$ nslookup
> set q=SRV
> _xmpp-server._tcp.q00p.net

Non-authoritative answer:
_xmpp-server._tcp.q00p.net service = 5 0 5269 xmpp-server.l.google.com.
_xmpp-server._tcp.q00p.net service = 20 0 5269 xmpp-server1.l.google.com.
_xmpp-server._tcp.q00p.net service = 20 0 5269 xmpp-server2.l.google.com.
_xmpp-server._tcp.q00p.net service = 20 0 5269 xmpp-server3.l.google.com.
_xmpp-server._tcp.q00p.net service = 20 0 5269 xmpp-server4.l.google.com.

29 September

Why Eclipse’s “Check for Updates” is horribly slow (and how to fix it)

I recently installed Eclipse Indigo. I wanted to add a few plugins to it, so I tried to use the UI to check for new updates and install some new packages. I let it run for a while, and after about 45 minutes, it looked to be about 20% done. Eventually, it displayed a few errors about timing out.

The issue is that Eclipse appears to be trying to contact mirrors that don’t have a proper copy of all the files it’s expecting. My solution was to invoke eclipse with the following flag. Add it after “eclipse”, or in eclipse.ini
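For reference, the flag I’d expect for this symptom disables p2’s mirror selection, so Eclipse always fetches from the canonical repository. (Whether this is the exact flag originally recommended is my assumption; treat it as a sketch of the fix.)

```ini
# In eclipse.ini, after the -vmargs line (or on the command line
# as: eclipse -vmargs -Declipse.p2.mirrors=false)
-Declipse.p2.mirrors=false
```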

29 September

Attaching a physical (raw) disk to VMWare Fusion 4 without BootCamp

I wanted to boot and run my Linux installation from a physical disk inside Mac OS X. There’s no easy guide for this on the web; most guides want you to use a VMware tool that existed in previous versions in /Library/Application Support/VM*, but that file didn’t exist for me.
I think the new VMWare Fusion can read BootCamp config data automatically, but I didn’t want to use BootCamp (long story). Since I had VirtualBox installed, this wasn’t too difficult.

First off, figure out what the mac thinks your disk(s) are called:

chef:ubuntu_test.vmwarevm vijayp$ diskutil list
0: *64.0 GB disk1
0: GUID_partition_scheme *2.0 TB disk3
1: Linux Swap 16.5 GB disk3s1
2: Microsoft Basic Data 983.5 GB disk3s2
3: Microsoft Basic Data Untitled 899.4 GB disk3s3

My main drive was /dev/disk1 (for some reason, I decided to use the entire disk for the linux partition) and the data partition was /dev/disk3s2.

After installing VMWare fusion 4, I created a new custom VM set up as Ubuntu 64-bit. This turned up in my Documents folder:

chef:~ vijayp$ cd ~/Documents/Virtual\ Machines.localized/
chef:Virtual Machines.localized vijayp$ ls
Ubuntu 64-bit.vmwarevm
chef:Virtual Machines.localized vijayp$ cd Ubuntu\ 64-bit.vmwarevm/
chef:Ubuntu 64-bit.vmwarevm vijayp$ ls
Ubuntu 64-bit-s001.vmdk Ubuntu 64-bit-s007.vmdk Ubuntu 64-bit.vmdk
Ubuntu 64-bit-s002.vmdk Ubuntu 64-bit-s008.vmdk Ubuntu 64-bit.vmsd
Ubuntu 64-bit-s003.vmdk Ubuntu 64-bit-s009.vmdk Ubuntu 64-bit.vmx
Ubuntu 64-bit-s004.vmdk Ubuntu 64-bit-s010.vmdk Ubuntu 64-bit.vmx.lck
Ubuntu 64-bit-s005.vmdk Ubuntu 64-bit-s011.vmdk Ubuntu 64-bit.vmxf
Ubuntu 64-bit-s006.vmdk Ubuntu 64-bit.plist vmware.log

VMWare has created a default disk that’s striped into 11 pieces (see the *.vmdk files). In order to access the physical drives, I used virtualbox’s toolkit:

chef:Ubuntu 64-bit.vmwarevm vijayp$ sudo VBoxManage internalcommands createrawvmdk -filename disk1.vmdk -rawdisk /dev/disk1
chef:Ubuntu 64-bit.vmwarevm vijayp$ sudo VBoxManage internalcommands createrawvmdk -filename disk3s2.vmdk -rawdisk /dev/disk3s2
chef:Ubuntu 64-bit.vmwarevm vijayp$ sudo chown $USER disk*.vmdk

Next you have to edit the VMWare file manually to add the disks, and remove the default one. I’m not sure why the UI won’t let you select these vmdks, but it doesn’t. Make sure the vm is NOT RUNNING, then edit the file. The diffs are pretty trivial:

@@ -2,16 +2,20 @@
config.version = "8"
virtualHW.version = "8"
vcpu.hotadd = "TRUE"
scsi0.present = "TRUE"
scsi0.virtualDev = "lsilogic"
+scsi1.present = "TRUE"
+scsi1.virtualDev = "lsilogic"
memsize = "1024"
mem.hotadd = "TRUE"
scsi0:0.present = "TRUE"
-scsi0:0.fileName = "Ubuntu 64-bit.vmdk"
+scsi0:0.fileName = "disk1.vmdk"
+scsi1:0.present = "TRUE"
+scsi1:0.fileName = "disk3s2.vmdk"
ide1:0.present = "TRUE"
-ide1:0.autodetect = "TRUE"
+ide1:0.fileName = "cdrom0"
ide1:0.deviceType = "cdrom-raw"
ethernet0.present = "TRUE"
ethernet0.connectionType = "nat"
ethernet0.virtualDev = "e1000"
ethernet0.wakeOnPcktRcv = "FALSE"

Now you can delete the Ubuntu 64-bit*.vmdk files.

I still haven’t figured out how to set the UUID on these disks so linux mounts them correctly, but it’s probably one of ddb.uuid.image and ddb.longContentID in the vmdk file. But it boots, so I can get some work done. I’ll revisit the uuid stuff soon.

11 August

JetBlue stores plaintext passwords — and emails them too! Ugh.

I recently had a bad experience flying (or trying to fly, I guess) JetBlue. When I called in to ask for a refund on my ticket, the customer service agent and her supervisor were very helpful and gave me a credit. This resulted in an automated email from JetBlue telling me that a TravelBank account had been created for me. It contained my TravelBank account number, my email, and my plaintext password from my JetBlue account!!

As anyone who knows anything about computer security will tell you, you should never, ever store plaintext passwords in a database, not even to let people recover their passwords when they forget them. And you should never, ever send them via e-mail, an insecure medium.

I’ve always been disappointed with the quality of JetBlue’s website, but the fact that they have not followed even basic security procedures is really scary. This isn’t just academic: Reddit did something similar and then lost a copy of their DB, which gave hackers a long list of (email, password) pairs. Since many people use the same password all over the place, this is especially dangerous — having a very complex password may prevent hackers from recovering it from a hash, but is useless if passwords are stored as plain text.

If any developer at JetBlue is reading this, you really need to do the following:

  1. Stop emailing passwords in the clear
  2. Start storing passwords using something secure, like PBKDF2 (RFC 2898)
  3. Please don’t use a fast hash like MD5 or SHA-1 for hashing passwords. Why? Read this thread.
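As a sketch of what “secure” looks like here, using only Python’s standard library and PBKDF2 (the successor to PBKDF1 in RFC 2898); the iteration count is an arbitrary choice, tune it to your hardware:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    # A fresh random salt per user defeats precomputed (rainbow) tables.
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, digest, iterations=200_000):
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Constant-time comparison avoids leaking where the mismatch occurs.
    return hmac.compare_digest(candidate, digest)
```

Store only the salt and digest, never the password itself; verification recomputes the digest with the stored salt, so there is nothing sensible to email to the user.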

I’ve changed my password and will avoid using JetBlue until they fix this.
This kind of thing really happens too often — in fact, just recently Pingdom was discovered to store passwords similarly, and was widely criticized. So let this be a good lesson — everyone should use different passwords for each different site, and we should just listen to XKCD’s advice about passwords.

Here’s the text of the email I received:

Thank you for choosing JetBlue and welcome to our new credit tool, Travel Bank.
Travel Bank is an online account that allows customers to manage their credits with JetBlue. It will replace the current vouchers and credit shells that may be familiar to you. For our TrueBlue members however, TrueBlue points will still be managed as a part of the TrueBlue account. For more detailed information regarding Travel Bank and your credits, click here.
A Travel Bank account has been created for you and transactions can be viewed online by clicking Here. 

Below you will find your account number and login information. Please keep this email as it is the only password notification you will receive. You will need to enter the following Travel Bank login ID and password when accessing your Travel Bank account online.

Travel Bank Account Number: YYYYYYYYYYYYY


Password: XXXXX

20 June

mounting large (> 2TB) hfsplus (mac) partitions on linux / ubuntu

I recently wanted to read, under Linux, an external drive which I’d formatted under Mac OS. Unfortunately, it was > 2TB, which seems not to be supported under Linux. It appears as if many of the limitations that used to exist (use of 32-bit values instead of 64-bit ones) have been fixed, but the kernel still refuses to allow mounting. I figured out how to patch the kernel module to allow me to mount > 2TB partitions.

Important: I only intended to mount the partitions read-only, so I didn’t go through the code carefully to ensure that it won’t corrupt your data if you try to write to it! Use this at your own risk!!

  1. Using Ubuntu 11.04, upgrade to the latest kernel
  2. root@mysterion:~# apt-get install linux-image-2.6.38-8-generic && apt-get install linux-headers-2.6.38-8-generic &&
    apt-get install linux-source-2.6.38 && reboot

  3. Next, unzip the source

  4. root@mysterion:~# cd /usr/src/linux-source-2.6.38
    root@mysterion:/usr/src/linux-source-2.6.38# tar xfvp linux-source-2.6.38.tar.bz2

  5. Next, patch the errant file; comment out the “goto out” code so it looks like this:

  6. root@mysterion:/usr/src/linux-source-2.6.38# grep -A 3 supported linux-source-2.6.38/fs/hfsplus/wrapper.c
    pr_err("hfs: volumes larger than 2TB are not supported yet\n");
    //goto out;

  7. Next, copy some files needed to build the module, and fix up the makefile:

  8. root@mysterion:/usr/src/linux-source-2.6.38/linux-source-2.6.38# kernver=$(uname -r)
    root@mysterion:/usr/src/linux-source-2.6.38/linux-source-2.6.38# kernextraver=$(echo $kernver | sed "s/$kernbase\(.*\)/\1/")
    root@mysterion:/usr/src/linux-source-2.6.38/linux-source-2.6.38# sed -i "s/EXTRAVERSION = .*/EXTRAVERSION = $kernextraver/" Makefile
    root@mysterion:/usr/src/linux-source-2.6.38/linux-source-2.6.38# cp /usr/src/linux-headers-2.6.38-8-generic/Module.symvers .
    root@mysterion:/usr/src/linux-source-2.6.38/linux-source-2.6.38# cp /boot/config-2.6.38-8-generic .
    root@mysterion:/usr/src/linux-source-2.6.38# make oldconfig && make prepare && make modules_prepare && make SUBDIRS=fs/hfsplus/ modules

  9. at this point, you should be able to run
    insmod fs/hfsplus/hfsplus.ko
    and then mount your hfsplus partition, using -oforce . please also use -oro for now.
31 May

Get control-left and control-right to move between words on Mac OS X

The default key bindings on OS X really annoy me to no end. The strange behaviour of home/end continues to confound me, but I finally figured out how to get Terminal, iTerm and iTerm2 to let me move between words using control-left and control-right. There are ways to do this by mucking around in various menus, but if you use bash, the simplest way is to add this to your ~/.inputrc file. It adds a number of different codes that various Mac terminals might send instead of what bash normally expects for control-left and control-right.

I still haven’t been able to fix this in non-terminal things (e.g. in Chrome control-left sends me to the beginning of the line), but I suppose this is a start!

chef:~ vijayp$ cat ~/.inputrc
"\e[1;5C": forward-word
"\e[1;5D": backward-word
"\e[5C": forward-word
"\e[5D": backward-word
"\e\e[C": forward-word
"\e\e[D": backward-word

29 March

It turns out /etc/profile and bashrc are insufficient for GUI apps. You have to add stuff to a plist for system-wide paths.

Anyway I needed ndk-build in my path for Eclipse to auto-build my JNI/android code, so I created this file:

chef:jni vijayp$ cat ~/.MacOSX/environment.plist

<?xml version="1.0" encoding="UTF-8"?>

<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">

<plist version="1.0">