Sam Gerstenzang

Jul 09 2014

The peace dividend of the self-driving car wars

Making self-driving cars for mass consumption is difficult. Why?

1. The software. Problems involving computer vision and perception, processing low-resolution information, and decision-making.

2. The sensors. Far from solved in ambiguous environments. It seems like the current approach is a cocktail of sensors that mitigate failure or low resolution in any particular type of sensor and environment. How do these collections of sensors handle grime, snow and rain?

3. Cars are dangerous, which creates a whole set of regulatory and ethical questions, as well as requiring extremely high precision for (1) and (2).

4. Cars themselves are difficult and costly to manufacture. They require tens of thousands of parts to be acquired on time, pieced together and tested– this is no trivial feat.

But what if you didn’t have to worry about (3) and (4)?

Well, you wouldn’t have a car. But you might have something very useful that can be built today, on the back of the expertise and technology that has been developed in the labs of Stanford, CMU, Google, BMW, Mercedes, Volkswagen and many more. The expertise developed in these labs manifests itself not just in the final self-driving vehicle, but in its byproducts as well: papers, algorithms, sensors, software libraries and people.

Chris Anderson has a wonderful phrase, “the peace dividends of the smartphone wars.” What he means is this: Apple, Google, Samsung and others have invested billions of dollars in smartphones, and the benefits of their investments have far-reaching effects outside of our pocket supercomputers. Think better, faster and cheaper ARM processors, newly commoditized sensors (cameras, GPS, accelerometers) and SSDs that are cheaper, more reliable and perform better. The Raspberry Pi would not have been possible without the smartphone, nor would the array of smart wristbands or drones that are available today.

In essence, you might be able to harvest the self-driving dividend before the war is over. While the complexity of self-driving cars (regulatory, ethical and otherwise) will delay the arrival of the fully autonomous car by five, ten or fifteen years, the technology in some deployable form will exist before then.

The other advantage you’ll have in applying self-driving technology outside of automotive vehicles is that you’ll be able to deploy it in an environment where you control the surrounding infrastructure– augmenting onboard sensors with data from the environment and creating more predictable surroundings.

Where might this approach work? Toys, factory automation, household robotics and farming are all areas where this has already begun, but I think we are only at the beginning of discovering our great automated world.

Jul 06 2014

The future photograph

[Ed. note: My colleague Benedict’s great post on photo volume growth encouraged me to dig up the following from my drafts folder, although I disagree with his provocative conclusion.]

When an expensive thing becomes free, new behaviors emerge. When a private thing becomes public, new behaviors emerge. Below, some thoughts on the swirling cloud of photographs that we are creating together.

  1. We are in the middle of a redefinition of what constitutes value in a photograph. On one hand, photographs are cheaper, which should lower the received value necessary to justify the taking of the photograph. On the other hand, abundance creates scarcity of attention and we now have always-on instant access to the best photographs, which devalues the amateur. Going forward, personal photographs will have two purposes: 1) memory of personal circumstance and 2) personal expression. What kinds of photographs will be taken when we have perfect information? The future photographer will ask themselves, what are the odds that this picture, but better, doesn’t exist already?
  2. The selfie takes a timeless outlet of self-expression (the self-portrait) and combines it with the fact that our self-representations have become a stream of cheap, shared updates. The selfie is a way to share what you are feeling, doing and seeing at this very moment: a much higher fidelity version of an emoji. I am very long on (and encouraged by) the selfie.
  3. Photos now have four distinct kinds of ownership: photographer, file owner, publisher and subject. While this was technically true before the smartphone camera, the decreased cost of publication puts new power in the hands of the publisher, and the subject loses. This tension between subject and publisher is largely responsible for the rise of Snapchat (where subject and publisher are often the same person.)
  4. The camera is the most creative sensor on the smartphone because it is the easiest to create with. It is easy to create with because you start with the world as your raw material instead of a white screen, and because clever software makers (Instagram) have realized that gentle abstraction (filters) can obscure our own poor taste. It has long been easier to be a photographer than a painter, because the worst case in photography is at least something.

Mar 24 2014

How might a public market investor invest in self-driving cars?

I am not convinced that the winner will be Google, nor can I choose among the existing car manufacturers. If you were to attempt to invest in self-driving cars as a phenomenon, what would you invest in? What is a proxy investment for the public investor?

Below are a few of my investment ideas, although I’m afraid all are subject to many other forces beyond self-driving cars. I’d love to hear your ideas in the comments below or via email– what is the purest proxy for self-driving cars?


Suburban housing… self-driving cars will mean less time spent driving and parking, meaning living further from urban centers will be more tolerable. Expect suburban property values at medium distances from major cities to rise.

Entertainment companies… driving commuters will now have hours of newly free time in their week. This time will need to be filled with images, sounds and text.

Unemployment… professional drivers (taxis, delivery folks, garbage collectors, etc.) will be out of a job. Optimistically, invest in retraining programs. Pessimistically, invest in inferior goods, and more pessimistically, riot gear.

E-commerce companies… as if there weren’t already reason enough to be long on e-commerce, self-driving cars will reduce shipping costs and time qualitatively. A whole new class of e-commerce companies will be built. I think of Russia’s e-commerce companies.

And a few ideas to short:

Oil companies… a combination of more efficient routing and driving, along with mission-specialized vehicles, will reduce the need for fossil fuels. Electric self-driving cars can also smartly recharge during off hours or be swapped.

Motels, rest stops, off-highway restaurants… the car will be its own self-contained entertainment pod and the only stops along the way will be intentional.

Parking lots… increased car utilization and a decreased need for strategically local parking lots (just let the car circle the block if you don’t want to loan it out– or park a few miles away) will reduce the market for paying parking premiums.

Insurance companies… in the short term self-driving cars will be a boon for insurance companies as accidents will decrease along with payouts. In the long term, the consumer and regulation will catch up and car insurance will no longer be necessary or required.

[Footnote: As a former Apple fanboy, I can’t help but note that an iCar would play to all of Apple’s strengths: existing large crowded market, bifurcated luxury sector, rewards deep technology innovation (battery, AI), while ultimately being about the user experience. Not to mention Apple is one of the few companies that will have a great map asset in 5 years. But enough on that.]

Mar 05 2014

A few quick thoughts on speed reading and Spritz

A few friends shared Spritz with me, knowing my interest in augmented cognition. Maximizing information input is something I’ve been researching for some time, including through more efficient written languages. I have long wished it were as easy to read books as to buy them.

And I actually wrote a prototype speed reader a few years ago based on the underlying technique that Spritz uses– RSVP (rapid serial visual presentation), which minimizes saccade movements. What’s new with Spritz is what they call the “Optimal Recognition Point”: they find the optimal word location on the screen, instead of just left-aligning as I did. Their theory sounds plausible, although I’d like to see a comprehension study.
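For what it’s worth, the RSVP mechanic itself fits in a few lines of Python. The pivot heuristic below (roughly a third of the way into longer words) is my own assumption about how a recognition point might be chosen– Spritz’s actual Optimal Recognition Point algorithm isn’t public.

```python
# Minimal RSVP sketch: show one word at a time, with a "pivot" character
# held at a fixed screen column so the eye never has to move.
# The pivot_index heuristic is an assumption, not Spritz's algorithm.

def pivot_index(word: str) -> int:
    """Pick the highlighted recognition-point character for a word."""
    length = len(word)
    if length <= 1:
        return 0
    if length <= 5:
        return 1
    return length // 3  # roughly a third of the way in, for long words

def frame(word: str, column: int = 12) -> str:
    """Render one RSVP frame with the pivot character at a fixed column."""
    p = pivot_index(word)
    return " " * (column - p) + word

def rsvp(text: str, wpm: int = 300):
    """Yield (frame, seconds_to_display) pairs for each word."""
    delay = 60.0 / wpm
    for word in text.split():
        yield frame(word), delay

for f, _ in rsvp("Reading one word at a time minimizes saccades"):
    print(f)
```

A real reader would also pause longer on punctuation and long words; my prototype just used a fixed per-word delay like the one above.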

However, the two biggest problems with my speed reader weren’t related to this point.

1. One of the challenges I ran into was acquiring source material. I used the Instapaper API and a bookmarklet, but we read from many different sources. How do you integrate with your mail client, your Kindle and your web browser? Unless you can integrate at the point of consumption, it’s very hard to get adoption. Moreover, more of the text I read than I realized wasn’t digital (yet.)

2. There are many text sources for which the technique just doesn’t work at all. Text formatting conveys a lot of meaning, and this context is often lost when you speed read using RSVP. For example, bullets, longer passages in parentheses, footnotes and intentional spacing are all extremely difficult to understand with RSVP, if not impossible. And some texts are simply too information-dense to read this way– anything that requires you to read more than linearly becomes extremely difficult.

I’m extremely excited to try out Spritz as soon as it launches.

Dec 25 2013

The next big thing will look like a toy pet

[Originally posted on Medium.]


 (Lucy, my parents’ havapoo.)

A few weeks ago I tweeted asking if there was a modern Tamagotchi-like or Tamagotchi-inspired smartphone app and got a few helpful pointers. But I can’t help but feel like these fall short of their potential. With the modern smartphone, we can do better than updating the original Tamagotchi with better graphics. Smartphones don’t just have nicer displays; they are faster, are network-connected, have more sensors and smarter software. The modern Tamagotchi should take advantage of all these things. But the original Tamagotchi missed something even more important: pets aren’t just something we take care of. They take care of us, too.

And increasingly, so do our smartphones. We use apps and gadgets to track our eating, activities and habits. We create all kinds of other data too: your smartphone should know what kind of mood you’re in based on the language you use in messages and emails, the songs you listen to on Spotify, the number of times you’ve opened Twitter in the last hour. But the exhaust of our digital (and digitally measured) lives increasingly seems like it is going nowhere. Fitness wristbands are like strings tied around our fingers: useful reminders to keep going, but falling short of the grand promises of the data synced to our pocket computers and cloud mothers.

Separately, researchers at MIT and other places are doing the hard work of figuring out how to create socially engaging robots. But I wonder if the physical manifestation of the robot is a red herring, unnecessarily increasing the complexity of making compelling AI-driven friends, mentors and coaches for humans. The late, great Clifford Nass’s seminal research, Computers Are Social Actors, suggests that perhaps we don’t need to mimic humans in physical space to create real social connections between man and machine.

We are attempting to create a health, wellness and happiness platform off the exhaust of data from our smartphones but we are missing the most important part. Not friends, not followers: the social interface between you and your device. Forget dashboards of metrics. Your personal pocket computer should be your friend, pet, coach and mentor using data to make decisions, but using its connection with you to make a difference.

In other words, the modern Tamagotchi would look like a dog if we had evolved dogs from Siri instead of wolves. Maybe the next big thing won’t look like a toy, but a toy pet.

Further reading:

1. Robot & Frank (2012)

2. Her (2013)

3. Jane (Ender’s Game)

Dec 23 2013

Conversational thinking

The idea that you are “the average of your five closest friends” is well-traveled, but I would like to suggest a more practical and micro variant. It is something we all know intuitively, but perhaps should be used more mindfully. It is:

Given that you are who you are and you know what you know, changing the person with whom you discuss an idea is by far the most impactful thing you can do to your thinking.

Whomever you talk with will encourage you, laugh at you, suggest an analogy, push you, or introduce a tangent– and you will find your ideas morphing, growing and perhaps (but hopefully not) shrinking. But every person will do so differently, and differently depending on which particular ideas you place in front of them. Putting your ideas in conversation changes them, and the person on the other side of that conversation is responsible for that change.

Different people will provoke different thought chains, and some topics are best discussed with one friend, while other topics should be saved for someone else. My conversation with David, and the ideas that came out of it, could have only happened with David.

So choose your conversations consciously, and choose well. For your ideas, and for you, it will make all the difference.

Dec 13 2013

Rewritten thinking: writing and rewriting as a tool for thinking

[Originally posted on Medium.]

David and I were talking the other day about a piece of code he was writing. He had rewritten it five or six times, and was probably going to rewrite it once or twice more. Each rewrite was a rethinking; a re-imagination of the architecture needed to support a new feature. The interesting thing about this process was that the code at the end of each rewrite was almost accidental: simply an artifact of the thinking process that happened to be executable.

Writing is like this too: each draft between the first draft and the final product is an artifact of the thinking process that happened to be readable. I find that better writers tend to think this way, for it is both a solution to writer’s block and a reminder that good writing is good editing. Each draft is merely part of the thinking process, and one must perhaps purposely ignore the fact that it could, if one stopped, be read.

You could theoretically just jump to the end, skipping the physical manifestation of each step on the mental journey. But then we would lose the power of the written artifact, which allows us to move up or down between the levels of the work. In both writing and coding, we must understand the whole while mucking about in the details and this becomes only more difficult as the ideas become more complex. The external recording serves as an augmentation to our own working memory, allowing the writer to move up, down and sideways through their own thoughts.

We would think lesser thoughts if we could not pause to write them down, and pause to reflect and rethink. Writing is a thinking tool, and not merely a tool towards a different medium of output.

At the end of the process, the written artifact is what provides value to others: an essay or an app. But the movement of the mind in synchronization with the ink or bytes being written and rewritten is where the important work happens. This tool seems simple, but it is not: it is our civilization.

Nov 08 2013

Voice input isn’t going to save you

Bill Gates, October 1st 1997:

In this 10-year time frame, I believe that we’ll not only be using the keyboard and the mouse to interact, but during that time we will have perfected speech recognition and speech output well enough that those will become a standard part of the interface.

Well, 6 years after Gates’ time frame ended, “perfect” voice input and output is not yet a user interface standard. It appears to be the only hope for mobile purists, who have yet to find a proper input replacement for ye olde keyboard.

I’m here to tell you: you’d better find something new. Voice input isn’t going to save you.

It’s not about latency, and it’s not about the quality of voice recognition (Siri aside… which has both problems.) Have you used Google’s mobile voice recognition? I have a hard time confusing it, and it recognizes my words as soon as I speak them (sometimes before: thanks, autocomplete.) At what tipping point will it be “good enough” for you to realize the problem is the modality itself?

The two major problems with voice commands are:

1. Voice control systems do not have very good affordance. That is to say, it’s hard to know what the system allows you to do and what syntax is required. A graphical user interface makes designing for affordance much easier: if a toggle exists on screen, it is something you can change, and its image can suggest how to manipulate the value along with its possible states. On a voice-controlled system, as on the command line, one must do more exploring.

One solution to this problem is for voice systems to let you do anything: any possible query will give you a directional answer. This is one of the reasons it is nicer to use Google search via voice command than Siri. Google has a relatively unlimited argument space (anything that can be converted into text) while Siri has what feels like an arbitrary and bizarre set of functional restrictions.

Perhaps you will one day be able to say anything in any format and have the system understand you and execute. This is essentially an AI problem. But even if we solve this (I am more optimistic than most people, less optimistic than the Singularity folks), the second problem remains.

2. By far the bigger problem is this: it feels weird to interact with a computer or smartphone with your voice. I don’t want to stand in line at the post office dictating my bizarre Wikipedia queries: I don’t even want to do that while my girlfriend is in the next room. It is so uncomfortable that I won’t even do it sitting home alone or walking down an empty street. It is hard to change this cultural norm, and I don’t think we will. The only time I feel comfortable using voice is in the car, and that is where I expect it will stay for most users.


The interfaces that do allow the next generation of input will be native to their host systems, not buckled on later like voice. On the tablet, this means fingers on glass. What will need to change is the metaphors, and the software that communicates the metaphors to us, and us to them.

Nov 07 2013

Information as compensation

We so-called “knowledge workers” have quickly become accustomed to new criteria for job selection, like the nature of the work and the goals and values of the company. In earlier times and in other fields and places today, the work is what is needed to be done. The goal is to survive. Perhaps the phrase “job selection” itself, suggesting a choice in the matter, reveals our privileged position.

But within this privileged class of “knowledge workers,” a new job criterion (or perhaps a new form of compensation) has arrived.

This is access to information. Scheduled guest talks, access to databases, networks of people with the right information, Kanye West rapping at the Twitter headquarters: these can all be proprietary information sources.

This access can be considered compensation, but it is not purely fungible with money. The value of access to some kinds of information might be simple intellectual joy, particularly appealing to infovores like me. Other kinds of information might provide a proprietary view of the world for later career or financial gain, or a training set for rapid self-improvement. 

"Information as experience" is rapidly becoming a key part of work and startups will have to be creative (as always) in competing against companies with bigger brands, bigger budgets and bigger networks. What kind of informational advantage can you provide your employees?

Nov 03 2013

Announcing the Center for Augmented Cognition

I’d like to share a small project I’ve been working on: the Center for Augmented Cognition.

"The Center" part of the site’s title is more a statement of purpose than a reflection of reality: I hope over time to create the best resource for understanding, researching and furthering augmented cognition. Today it is a small collection of links that hopefully you’ll gain something from.

I have been writing about augmented cognition for five years now, and have been thinking about it since I was eight years old. At eight I learned to read: much later than most children. Perhaps this is why I have such respect for the mind-altering tools we humans can create, starting with language itself and moving to higher abstractions of symbol manipulation.

The site’s introductory paragraph reads:

Augmented cognition is about making tools for thinking. It is not about designing tools that humans can use, but about extending humanity’s abilities through software, hardware or conceptual tools. We focus mainly on software here, curating a collection of links and resources.

I plan on updating AugCog regularly, as I find new readings, projects and resources. I’d love your help. Please send links, ideas or anything else to my first and last name at

Here’s to the toolmakers. 
