Publications by Edwin Chen

Soda vs. Pop with Twitter

06.07.2012

One of the great things about Twitter is that it’s a global conversation anyone can join anytime. Eavesdropping on the world, what what! Of course, it gets even better when you can mine all this chatter to study the way humans live and interact. For example, how do people in New York City differ from those in Silicon Valley? We tend to think th...

3754 sym 22 img

Edge Prediction in a Social Graph: My Solution to Facebook’s User Recommendation Contest on Kaggle

31.07.2012

A couple weeks ago, Facebook launched a link prediction contest on Kaggle, with the goal of recommending missing edges in a social graph. I love investigating social networks, so I dug around a little, and since I did well enough to score one of the coveted prizes, I’ll share my approach here. (For some background, the contest provided a traini...

23172 sym Python (12805 sym/18 pcs) 46 img 9 tbl

Improving Twitter Search with Real-Time Human Computation

07.01.2013

(This is a post from the Twitter Engineering Blog that I wrote with Alpa Jain.) One of the magical things about Twitter is that it opens a window to the world in real-time. An event happens, and just seconds later, it’s shared for people across the planet to see. Consider, for example, what happened when Flight 1549 crashed in the Hudson. http:...

11927 sym

Propensity Modeling, Causal Inference, and Discovering Drivers of Growth

14.08.2014

Imagine you just started a job at a new company. You watched World War Z recently, so you’re in a skeptical mood, and given that your last two startups failed from what you believe to be a lack of data, you’re giving everything an extra critical eye. You start by thinking about the impact of the sales team. How much extra revenue are they gen...

17081 sym 8 img

Moving Beyond CTR: Better Recommendations Through Human Evaluation

06.10.2014

Imagine you’re building a recommendation algorithm for your new online site. How do you measure its quality, to make sure that it’s sending users relevant and personalized content? Click-through rate may be your initial hope…but after a bit of thought, it’s not clear that it’s the best metric after all. Take Google’s search engine. In...

18585 sym 18 img

Product Insights for Airbnb

19.11.2015

I love marketplaces and marketplace data, so a couple months ago I grabbed some Airbnb data and made a slide deck. A few people have asked me about it, so here it is along with a short summary. My goal was to gather data around potential product strategy, focusing on the following questions. You can’t book a great place if you can’t find on...

12374 sym 32 img

Exploring LSTMs

29.05.2017

The first time I learned about LSTMs, my eyes glazed over. Not in a good, jelly donut kind of way. It turns out LSTMs are a fairly simple extension to neural networks, and they’re behind a lot of the amazing achievements deep learning has made in the past few years. So I’ll try to present them as intuitively as possible – in such a way that...

25710 sym R (321 sym/5 pcs) 72 img

Surge: Data Labeling You Can Trust

29.11.2020

tl;dr I started Surge earlier this year to fix the problems I’ve always encountered with getting high-quality, human-labeled data at scale. Think MTurk 2.0, but with an obsessive focus on quality and speed, and an elite workforce you can trust. If you’ve ever had problems getting human-annotated data …

702 sym