Month: June 2016

Influencing a Boston Angel: Dharmesh Shah

I took a class in my last semester of business school in 2013 that was about analyzing networks. For our final project, I worked on a team that analyzed Dharmesh Shah’s Twitter followers. The goal was to identify who might be a good candidate to make an introduction. Below is the complete writeup as we submitted it to our professor, Marissa King.


Problem

Dharmesh Shah is an influential entrepreneur and angel investor. He is the co-founder and CTO of HubSpot, a technology company headquartered in Boston. Dharmesh invests in many early-stage startup companies each year, and entrepreneurs routinely court him as a mentor and investor. As an incredibly busy executive and investor, Dharmesh is not an easy man with whom to get an audience.

Our group set out to analyze Dharmesh’s network to find the most influential people. By identifying the most connected people in his circles and the networks in which they operate, someone could prioritize their efforts in getting introductions.

Hypothesis

A strategy to influence Dharmesh starts with influencing those who can influence him. Therefore, we built our analysis on two hypotheses:

  1. Dharmesh’s network looks similar to our own, in that it has important sub-networks.
  2. Within these communities, there are people who can influence Dharmesh.

If network analysis can identify these influential individuals, one could effectively surround Dharmesh, gaining connections to him from a variety of his networks.

Methodology

Our analysis uses information gathered from Twitter rather than LinkedIn or Facebook. Twitter differs from these two social networks because it is public by default. Twitter has an asymmetric follower pattern where anyone can subscribe to the updates of another person; both parties do not have to choose to connect. Since many in the technology community use Twitter as a news and information service, whom someone follows is a good indication of whom they respect and look to for interesting and influential information.

To analyze who is influential to Dharmesh, the analysis focused on people Dharmesh currently follows. Through the Twitter API, we downloaded:

  1. The Twitter accounts that Dharmesh follows
  2. The Twitter accounts that follow those accounts

We downloaded over 10 million pieces of follower information as pairs of directed edges (the people that influence Dharmesh, and the people that follow those influencers). We put the data into a relational database so that we could model the edges and query it on an ad-hoc basis.
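As a rough illustration of what modeling the edges in a relational database could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample handles are all hypothetical; the writeup does not say which database or schema we used.

```python
import sqlite3

# Hypothetical sample data: (follower, followed) pairs. These handles are
# placeholders, not accounts from the original dataset.
SAMPLE_EDGES = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
    ("carol", "bob"),
    ("dave", "alice"),
]


def build_edge_table(edges):
    """Load directed follow edges into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE follows ("
        " follower TEXT NOT NULL,"
        " followed TEXT NOT NULL,"
        " PRIMARY KEY (follower, followed))"
    )
    conn.executemany("INSERT INTO follows VALUES (?, ?)", edges)
    conn.commit()
    return conn


def follower_counts(conn):
    """Ad-hoc query: how many followers each account has, most followed first."""
    rows = conn.execute(
        "SELECT followed, COUNT(*) AS n FROM follows"
        " GROUP BY followed ORDER BY n DESC, followed"
    )
    return rows.fetchall()


conn = build_edge_table(SAMPLE_EDGES)
counts = follower_counts(conn)
print(counts)
```

Storing one row per directed edge keeps the model simple, and the composite primary key prevents the same follow relationship from being loaded twice.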

In order to determine the influencers within the network of people that Dharmesh follows, we created a graph of the mutual connections. We only graphed a connection between two people if they both followed each other. This removed many edges in our graph because many relationships only had a single directed edge. We felt that this was a better indication of a relationship and would highlight communities of influence more effectively.
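The mutual-connection filter can be sketched in a few lines of Python. This is an illustration of the idea rather than the code we ran; the function name and sample handles are made up.

```python
def mutual_edges(directed_edges):
    """Return undirected edges for pairs of accounts that follow each other.

    `directed_edges` is an iterable of (follower, followed) pairs. Each
    mutual pair is returned once, as an alphabetically sorted tuple.
    """
    edge_set = set(directed_edges)
    mutual = set()
    for follower, followed in edge_set:
        # Keep the pair only if the reverse edge also exists.
        if follower != followed and (followed, follower) in edge_set:
            mutual.add(tuple(sorted((follower, followed))))
    return mutual


# Hypothetical sample: carol follows alice, but alice does not follow back.
directed = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
    ("bob", "carol"),
    ("carol", "bob"),
]
undirected = mutual_edges(directed)
print(sorted(undirected))
```

In this sample the one-way edge from carol to alice is dropped, which is the pruning effect described above.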

Analysis

Looking at Twitter data instead of Facebook or LinkedIn has the advantage of portraying what Dharmesh is currently working on and thinking about, as opposed to his entire personal or professional network. This has the benefit of identifying what will pique his interest, since we assumed that he only follows people who share content that is interesting to him.

In analyzing the network graph, there are clusters that represent the Boston startup community, Silicon Valley, and HubSpot employees and alumni. Contrary to the graphs of our personal network analysis, the groups were highly integrated with one another and were hard to distinguish. We believe this to be the case because anyone can follow anyone else on Twitter; there is no expectation of being friends or having worked together professionally. People are accustomed to following someone they may not have met in real life if that person shares interesting content. We believe that explains why the clear separation of subgroups seen in the Facebook and LinkedIn network graphs is absent here.

Our group expected to see more distinct subgroups in Dharmesh’s network of influencers. In our professional and personal graphs, we each had communities that represented high school, college, professional groups, and graduate school networks. We were only able to identify three separate subgroups in Dharmesh’s graph, with only one company and two regional communities. While surprising at first, we believe it is driven by the interaction that takes place on Twitter. Rather than accumulating contacts, Twitter is about what is interesting to you at the current time, and many users regularly unfollow others based on their tweets. This is very different from Facebook or LinkedIn, where you rarely remove a friend.

It is not surprising that Silicon Valley represented a significant element of Dharmesh’s graph of influencers, since the region is the largest in terms of venture capital and startups. Boston did not represent more of the influencers graph, but again that may have been influenced by the fact that Silicon Valley is responsible for a majority of the innovation in the technology and startup industry. There were two subgroups that represented members of the Boston startup community, which is interesting considering that HubSpot has been one of the fastest growing startups in Boston for many years. For any entrepreneur looking to gain access to Dharmesh, it represents two opportunities for identifying individuals. Additionally, it may indicate that there are few influencers in the HubSpot community that have a strong following in Boston or Silicon Valley. That makes sense given that Dharmesh is one of the most highlighted entrepreneurs in Boston and one of the biggest public faces of his company.

In order to identify the people in each subgroup that would be helpful in influencing Dharmesh and the people he follows, we analyzed the betweenness and closeness centrality for each person in the graph. In this analysis, we sought to identify people who could influence Dharmesh, but would be accessible because they are not as popular and sought after as a mentor and investor. As an entrepreneur would, we inspected the highly ranked individuals to determine who would result in the best outcome. If the person were as popular as Dharmesh, it would not make sense to reach out to him or her.
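For readers unfamiliar with the two measures, here is a minimal pure-Python sketch of both on a small undirected graph. It is not the tooling we used for the project. The graph and node names are hypothetical, closeness is computed as (reachable nodes) / (sum of shortest-path distances), and betweenness uses Brandes' algorithm without normalization.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness per node: reachable nodes divided by total distance to them."""
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search gives shortest hop counts
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def betweenness_centrality(graph):
    """Unnormalized betweenness (Brandes) for an undirected, unweighted graph."""
    scores = {node: 0.0 for node in graph}
    for source in graph:
        order = []                              # nodes in BFS visit order
        preds = {node: [] for node in graph}    # shortest-path predecessors
        sigma = {node: 0 for node in graph}     # number of shortest paths
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    sigma[nb] += sigma[node]
                    preds[nb].append(node)
        delta = {node: 0.0 for node in graph}
        for node in reversed(order):            # accumulate dependencies
            for p in preds[node]:
                delta[p] += sigma[p] / sigma[node] * (1 + delta[node])
            if node != source:
                scores[node] += delta[node]
    # Each unordered pair was counted once from each endpoint.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical graph: a triangle (a, b, c) with a tail c - d - e.
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)
print(sorted(closeness, key=closeness.get, reverse=True))
```

In this toy graph, node c sits between the triangle and the tail, so it ranks highest on both measures. That is the kind of person the ranking is meant to surface.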

Recommendations

To get in front of Dharmesh, it is important to look not only at the closeness centrality of the target contact, but also his or her role in the industry and how likely he or she would be able to connect you to Dharmesh. The person with the highest closeness centrality is actually an industry analyst for Altimeter Group, which is a research and advisory firm. Without additional knowledge, a target such as Jeremy Owyang would likely offer industry insight, but not potential contacts, since many would likely look to him for market research.

Instead, we would recommend trolling through the list ranked by closeness centrality and cross-referencing it with information that can be gathered elsewhere. With this approach, Jeremy Levine, David Hauser, Dan Abdinoor and Eric Paley seem to be the best targets. Jeremy Levine and David Hauser are both young entrepreneurs whom Dharmesh follows and who are deeply connected to the Boston startup community. While they have already established a reputation in the Boston VC community, they are likely more open to and available for meetings. Dan Abdinoor has worked at several startups in Boston, and is at the center of the HubSpot subgroup even though he is no longer employed by HubSpot. This is most likely because he was one of the first ten employees hired and stayed through tremendous growth.

Potential Contacts that influence Dharmesh:

Dharmesh’s Twitter network graph:


Thanks to Jason Koster, Jake Berliner, and Seth Taft for letting me publish our analysis on my account here.

Good vs. Bad Retention — The User and Revenue Impact

I just published this piece on Medium, but am also cross posting to my blog. If you want to make sure you receive all of the content I put out, make sure to subscribe to my email list.


There are many things that set Facebook apart from your (or my) products. That said, there’s really one thing that it all boils down to: retention. Facebook has developed a product that people use indefinitely. The rest of us? We have a long way to go. What should you be doing to close the gap? Keep track of your retention numbers.

Most of the people I speak with have no idea how many people they expect to be using their product in a year, even though they are the ones ultimately responsible for the progress. If you do have some sort of goal, did you just pick a big hairy number? Did someone throw out a goal for you? If I could give you one piece of advice, it would be to build a simple model so you know what to expect.

After watching Phil Libin’s talk on retention and cohorts, I thought it would be interesting to model out what different types of retention look like for a SaaS product. What would it look like if you acquire the same number of users over time, but don’t hang onto them? What would it look like if you had really good retention? What are the tipping points for user growth? I built a couple of simple Excel models, and the graphs were quite shocking to me.

Let’s say you launch a new product, and as a good leader you track the people getting value from your product over time. Imagine it looks like this:

Congrats! You launched a new product to 1,000 users in January of 2016, and have grown it to over 8,000 monthly active users by the end of 2016. Your growth is slowing slightly, but you’re not too worried about it. Why should you be? You grew by 700% in 2016! That’s a cause for celebration.

Let’s look at this graph in a slightly different way, by the cohorts of people who start using your product each month. In the example above, I assumed that 1,000 new people sign up for your service each month, and that some of them stop using it over time. Those people might find a different tool, unplug from the internet, or get a virus and blame your tool for the havoc it caused. Either way, of the 1,000 people who start each month, some of them quit using your product in the months after they sign up. This is what the active users chart looks like breaking down the cohorts over time:

In the chart above, the blue shape on the bottom represents the 1,000 people who signed up in January, and then how many of them are using it throughout the year. By December 2016, only 450 of them are still around. The cohorts “stack” on top of one another to produce your total active users in a given month.
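If you want to see the stacking in code rather than in a chart, here is a minimal sketch. The retention curve is my assumption for illustration: each cohort loses 5 percentage points of its original size per month, which happens to leave 450 of January's 1,000 users in December. The exact curve behind the charts lives in the spreadsheet and may differ.

```python
def linear_decay(age):
    """Assumed retention curve: fraction of a cohort still active after
    `age` months, losing 5 percentage points per month down to zero."""
    return max(0.0, 1.0 - 0.05 * age)


def cohort_table(months, signups_per_month, retention):
    """Return table[m][c]: active users in month m (0-indexed) from the
    cohort that signed up in month c."""
    table = []
    for month in range(months):
        row = [
            round(signups_per_month * retention(month - cohort))
            for cohort in range(month + 1)
        ]
        table.append(row)
    return table


table = cohort_table(12, 1_000, linear_decay)
december = table[11]

# The January cohort is the first column; the monthly total is the row sum.
print("January cohort in December:", december[0])
print("Total active users in December:", sum(december))
```

Each row of the table is one vertical slice of the stacked chart, so summing a row gives you the total active users for that month.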

If you develop a great product like Facebook or Uber, there’s some percentage of cohorts that use your product forever. They’re addicted to it. Even if they stop using it at some point, they come back. Facebook would have a hard time growing to 1.6 billion monthly active users if a lot of people used it once or twice and then never used it again.

Let’s see what happens to your growth if you weren’t like Facebook, and you didn’t hang onto your cohorts for a long time. Let’s say you continue to have 1,000 people sign up every month, but over time those people end up quitting your product. This is what the chart looks like past 2016:

By the end of 2018, you’ll only have 13,000 users of your product. You had 11,700 people at the end of 2017! Even though you grew 700% in 2016, you only grew 11% in 2018. The rate at which you’re growing is slowing significantly, even though you continue to add 1,000 users a month. You can see this visually in the bottom right of the chart, where all of the cohorts seem to stack on top of one another, but don’t add up to anything. You can’t even tell the cohorts apart; they just look like a colorful set of stripes. By the end of 2018, the new people you’re adding every month are barely replacing the people who abandon your product from all of your previous cohorts.
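The growth percentages are simple to check. Here is a quick sketch using the rounded user counts quoted above; the function name is mine, and I use 8,000 for the end of 2016 even though the chart shows slightly more.

```python
def growth_rate(start_users, end_users):
    """Fractional growth over a period: (end - start) / start."""
    return (end_users - start_users) / start_users


growth_2016 = growth_rate(1_000, 8_000)    # launch to end of 2016
growth_2018 = growth_rate(11_700, 13_000)  # end of 2017 to end of 2018

print(f"2016 growth: {growth_2016:.0%}")
print(f"2018 growth: {growth_2018:.0%}")
```

The same 1,000 signups a month produce 700% growth on a small base and about 11% on a large, leaky one.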

What does it look like if you are able to build a product where 50% of your cohorts end up loving your product and sticking around for a long time? Let’s update our graphs:

Wow! Instead of 13,000 users, you will have 20,000 users by the end of 2018. You can see the big difference between the graphs. In the bottom right you have rectangles that build on top of one another. Your overall growth rate is still decreasing (as a percentage of your install base), but your total number of active users continues to increase. In the previous example your growth had basically stalled; in this graph you are growing at a constant rate. The best products in the world retain a large percentage of cohorts over time, and the bars are a large percentage of the initial cohort size.
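Here is a small sketch of why one curve stalls and the other keeps climbing. Both retention curves are assumptions for illustration (5 percentage points lost per month, either down to zero or down to a 50% floor), so the totals will not match the charts exactly. The shape is what matters.

```python
def churn_to_zero(age):
    """Assumed bad-retention curve: every cohort eventually disappears."""
    return max(0.0, 1.0 - 0.05 * age)


def churn_to_floor(age):
    """Assumed good-retention curve: half of each cohort stays for good."""
    return max(0.5, 1.0 - 0.05 * age)


def total_active(months, signups_per_month, retention):
    """Active users in the last of `months` months, summed over all cohorts.
    The cohort that signed up `age` months ago contributes
    signups_per_month * retention(age)."""
    return sum(round(signups_per_month * retention(age)) for age in range(months))


for months in (12, 24, 36):
    bad = total_active(months, 1_000, churn_to_zero)
    good = total_active(months, 1_000, churn_to_floor)
    print(f"after {months} months: bad={bad:,} good={good:,}")
```

With the curve that decays to zero, the total stops growing once the oldest cohorts are empty, because each new cohort only replaces one that has disappeared. With the floor, every month adds another 500 permanent users, which is the constant growth you see in the second graph.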

Up until this point, I’ve only been talking about retention of users. If you’re running a business you ultimately need to charge for your service (for example, a monthly subscription). Assume that a percentage of people will end up paying for your service, and that they slowly upgrade over time. If you can forecast how many people will be using your product, you should also be able to project how much money you’ll be making. Let’s look at what your revenue looks like (again, broken down by cohort) when you have poor retention:

As your cohort sizes go to nothing, those people won’t keep paying for your product. This graph doesn’t look too bad, but what about if you look further into the future?

That doesn’t look good. You’re barely making any more money two years later. What about in the case where you have good retention? Assume that 10% of the long term users end up paying for your product, and they pay $50 / month. They don’t immediately upgrade; it happens slowly over time. What would that graph look like?

Holy crap! I like the slope of that line. In the bad retention example, you are making $15,000 / month in recurring revenue. What about in the good retention example? Over $80,000 / month.
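To make the revenue math concrete, here is a rough sketch. The $50 price and the 10% paying share come from the assumptions above. The retention curves and the 12-month linear upgrade ramp are my own guesses for illustration, so the dollar figures will not match the charts.

```python
PRICE_PER_MONTH = 50   # dollars, from the scenario above
PAYING_SHARE = 0.10    # share of retained users who eventually pay
RAMP_MONTHS = 12       # assumption: upgrades ramp up linearly over a year


def bad_retention(age):
    """Assumed curve: cohorts lose 5 points a month until nobody is left."""
    return max(0.0, 1.0 - 0.05 * age)


def good_retention(age):
    """Assumed curve: same decay, but half of each cohort stays for good."""
    return max(0.5, 1.0 - 0.05 * age)


def paying_fraction(age):
    """Share of a cohort's active users who pay `age` months after signup."""
    return PAYING_SHARE * min(1.0, age / RAMP_MONTHS)


def monthly_revenue(months, signups_per_month, retention):
    """Recurring revenue in the last of `months` months, over all cohorts."""
    revenue = 0.0
    for age in range(months):
        active = signups_per_month * retention(age)
        revenue += active * paying_fraction(age) * PRICE_PER_MONTH
    return revenue


bad_mrr = monthly_revenue(36, 1_000, bad_retention)
good_mrr = monthly_revenue(36, 1_000, good_retention)
print(f"bad retention:  ${bad_mrr:,.0f} / month")
print(f"good retention: ${good_mrr:,.0f} / month")
```

Because upgrades happen slowly, most of the revenue comes from older cohorts. If those cohorts have churned away by the time they would have upgraded, there is nobody left to pay you.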

Interested in playing with the different scenarios yourself? I uploaded my hypothetical data in an Excel file here, or in a Google Spreadsheet here. Google Spreadsheets is crappy for this kind of stuff, so I’d recommend using Excel.

© 2025 Dan Wolchonok
