Fun with Excel #13 – The Laws of Attraction III: Two Inches Taller

September 27, 2016 | October 7, 2016 | Jeff | dating, fun with excel, laws of attraction

An interesting discussion came up at work the other day, when a co-worker mentioned that a friend of hers (a guy) had opted to undergo an expensive and painful surgery to gain a few inches in height. While the consensus seemed to be that such an operation would ultimately not be worth it, I must admit that I entertained the idea briefly…or at least long enough to run a few numbers for my own curiosity.

Most people have probably heard of the phrase “tall, dark, and handsome” being used to describe the physical attributes of an attractive male. While it’s certainly a generalization, there’s little doubt that when it comes to traditional heterosexual relationships, women tend to prefer taller men. Although attraction is often the result of a variety of factors, I wanted to focus on just one attribute (height) and quantify the impact of being two inches taller on a man’s romantic prospects.

Assumptions

I started with a sample population of 200 million people (100 million men and 100 million women). I then assumed that the heights for both men and women were normally distributed: “adult male heights are on average 70 inches (5’10”) with a standard deviation of 4 inches. Adult women are on average a bit shorter and less variable in height with a mean height of 65 inches (5’5”) and standard deviation of 3.5 inches.” Although relationship matching is a two-sided problem, I assumed for the purposes of this exercise that only women’s preferences counted (and that every woman shared the same preferences), while men were assumed to be indifferent.

For example, if a 5’5” woman preferred men who were at least her height but not more than 12 inches taller than her, then her potential prospects would include all men between 5’5” and 6’5” (85,429,107 men, or 85.4% of the male population). Conversely, a 5’10” man’s prospects would include all women between 4’10” and 5’10” (90,068,614 women, or 90.1% of the female population).
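For anyone who wants to check these figures outside of Excel, here is a minimal Python sketch of the same calculation. It assumes only the normal distributions quoted above; the function names are made up for illustration and are not taken from the spreadsheet.

```python
from math import erf, sqrt

# Height assumptions quoted in the post, in inches: (mean, standard deviation).
MEN = (70.0, 4.0)
WOMEN = (65.0, 3.5)

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def share_between(low, high, dist):
    """Share of a population whose height lies between low and high."""
    mean, sd = dist
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)

def woman_prospects(height, lower=0.0, upper=12.0):
    """Share of men inside a woman's window [height + lower, height + upper]."""
    return share_between(height + lower, height + upper, MEN)

def man_prospects(height, lower=0.0, upper=12.0):
    """Share of women whose window contains a man of this height."""
    return share_between(height - upper, height - lower, WOMEN)

print(f"5'5\" woman: {woman_prospects(65.0):.1%} of men")
print(f"5'10\" man: {man_prospects(70.0):.1%} of women")
```

Multiplying either share by the 100 million people on the other side gives the head counts quoted above.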

As always, my data is available here.

From a Woman’s Perspective…

I first examined the impact of women’s preferences on the number of potential men that they would be attracted to. The bold black and red lines represent the number of males and females (left y-axis) distributed by height (x-axis). The dotted blue, green, and purple lines represent the number of potential male prospects (right y-axis) for a given female’s height. The three colors represent three different preference cases:

Case A: Every woman prefers men between her own height (0” lower bound) and 12 inches taller (+12” upper bound). This is our base case.
Case B: Every woman prefers men between 3 inches shorter (-3”) and 12 inches taller (+12”).
Case C: Every woman prefers men between 6 inches shorter (-6”) and 12 inches taller (+12”).

Of course, there are a multitude of different preferences that we could test, but I mainly wanted to see what would happen if women became more receptive to dating shorter men.

The total number of potential men increases from Case A to Case B to Case C. This should surprise no one, as the total preference range is expanding in each case. What should also be intuitive is that while every woman is better off (which we will define as having a greater number of potential male prospects), how much each woman benefits is dependent on her height. Because we are pushing out the lower bound of women’s preferences from 0” to -3” to -6” in the three cases, it makes sense that doing so disproportionately benefits taller women. This is represented graphically by the right tails of the distributions shifting out (from blue to green to purple) while the left tails remain anchored.

It’s worth noting that even a woman of average height (5’5”) would experience notable benefits from the expansion of lower bound preferences. In Case A, her potential prospects would include 85.4% of the male population, but that figure jumps to 93.7% in Case B and 95.7% in Case C. However, the same shifts in lower bound preferences would be a complete game-changer for taller women. A 5’8.5” woman (1 SD above the mean) would see her potential prospects increase from 64.2% in Case A to 86.5% in Case B and 96.5% in Case C. Lastly, the “ideal height” (i.e. the height at which a woman would have the most potential prospects) in each of the 3 scenarios would be 5’4” in Case A (86.6% of male population), 5’5.5” in Case B (93.9%), and 5’7” in Case C (97.6%).
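The “ideal height” in each case can be found by brute force. The sketch below scans women's heights in half-inch steps and keeps the one with the largest share of men in range. The half-inch grid and the 55" to 75" scan range are assumptions made for this example, not something stated in the post.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def woman_prospects(height, lower, upper=12.0, mean=70.0, sd=4.0):
    """Share of men (mean 70", SD 4") inside [height + lower, height + upper]."""
    return normal_cdf(height + upper, mean, sd) - normal_cdf(height + lower, mean, sd)

# Lower bounds of the three preference cases; the upper bound is always +12".
LOWER_BOUNDS = {"A": 0.0, "B": -3.0, "C": -6.0}

# Assumed scan grid: 55" to 75" in half-inch steps.
HEIGHTS = [55.0 + 0.5 * i for i in range(41)]

def ideal_height(lower):
    """Height on the grid with the most potential prospects."""
    return max(HEIGHTS, key=lambda h: woman_prospects(h, lower))

for case, lower in LOWER_BOUNDS.items():
    best = ideal_height(lower)
    print(case, best, f"{woman_prospects(best, lower):.1%}")
```

The peak sits wherever the preference window is centred on the average male height of 70 inches, which is why it moves up by 1.5 inches each time the lower bound drops by 3.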

From a Man’s Perspective…

The bold black and red lines remain unchanged, but the dotted blue, green, and purple lines now represent the number of potential female prospects for a given male’s height in Case A, Case B, and Case C, respectively.

The dotted lines are a mirror image of the ones in the first chart. The total number of potential women increases from Case A to Case B to Case C. Every man is better off, but shorter men benefit disproportionately. This is represented graphically by the left tails of the distributions shifting out (from blue to green to purple) while the right tails remain anchored.

A man of average height (5’10”) would experience benefits from the women expanding their lower bound preferences, but these benefits are not as significant as the gains that the average woman would experience. This is due to the fact that men are on average taller than women and that women tend to prefer taller men. In Case A, the average man’s potential prospects would include 90.1% of the female population (already very high), and that figure increases to 96.6% in Case B and 97.6% in Case C. The incremental gains from Case A to Case B to Case C (6.5 and 1.0 percentage points) are indeed smaller than the gains of the average woman (8.3 and 2.0 percentage points).

For the same reasons, women changing their lower bound preferences would be an even greater game-changer for shorter men than it would be for taller women. A 5’6” man (1 SD below the mean) would see his potential prospects increase from 61.2% in Case A to 87.3% in Case B and 97.6% in Case C (gains of 26.1 and 10.3 percentage points, versus 22.3 and 10.0 for the tall women 1 SD above the mean). The “ideal height” (i.e. the height at which a man would have the most potential prospects) in each of the 3 scenarios would be 5’11” in Case A (91.4% of female population), 5’9.5” in Case B (96.8%), and 5’8” in Case C (99.0%). Just to hammer the point home, notice how these peak percentages are higher than those of their female counterparts (86.6%, 93.9%, and 97.6%). Lastly, I’d like to point out that in our base case preference scenario (Case A), the ideal heights are 5’4” for women and 5’11” for men, which are very close to their average heights of 5’5” and 5’10”.
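The men's side of the calculation flips the window around: a man of height m is in range for every woman whose height lies between m - 12 and m minus the lower bound. A small Python sketch under the same distribution assumptions (the helper names are illustrative only):

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def man_prospects(height, lower, upper=12.0, mean=65.0, sd=3.5):
    """Share of women (mean 65", SD 3.5") whose window contains this man."""
    return normal_cdf(height - lower, mean, sd) - normal_cdf(height - upper, mean, sd)

# Lower bounds of the three preference cases; the upper bound is always +12".
LOWER_BOUNDS = {"A": 0.0, "B": -3.0, "C": -6.0}

def shares_by_case(height):
    """Share of women available to a man of this height in each case."""
    return {case: man_prospects(height, lower) for case, lower in LOWER_BOUNDS.items()}

for label, height in [("5'10\"", 70.0), ("5'6\"", 66.0)]:
    shares = shares_by_case(height)
    print(label, {case: f"{share:.1%}" for case, share in shares.items()})
```

Differencing neighbouring cases in the output gives the gains for the average and the shorter man discussed above.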

The Impact of Two Inches of Height

All this brings us to our main question: as a man, how much is it worth to be two inches taller? At this point, it should surprise no one that the answer is “it depends” (doesn’t it always?). Given everything that we’ve learned so far, we should expect shorter men to benefit more from a potential height increase than taller men.

We see that this is in fact the case in the chart below.

The dashed lines measure the percentage increase (right y-axis) in the number of potential female prospects for a man of given height (x-axis). These dashed distributions resemble exponential decay, with extremely short men benefiting significantly from being two inches taller, but with those benefits rapidly diminishing as men approach 5’6” and taller. The shapes of these distributions are also impacted by the female preference case. Case A shows the steepest decline while Case C shows the shallowest. This should make intuitive sense, because Case A is the least tolerant preference range while Case C is the most tolerant.

Lastly, if we zoom in on the impact of being two inches taller on only men who are 5′ and taller, then we can get a clearer sense of where the breakeven height is for each preference case. Here, we define breakeven as the height where a man receives no benefit from being two inches taller. Any man taller than the breakeven height would in fact be worse off if he received the height increase of two inches.

In the base preference case (Case A), the breakeven male height is 5’10”, which is exactly the average male height! In other words, a 5’10” man and a 6′ man have the same number of female prospects, which ties in beautifully to our earlier conclusion that 5’11” is the ideal male height in Case A (5’10” and 6′ are simply symmetric points on a distribution where 5’11” is the peak). By the same analysis, the breakeven male height is 5’8.5” in Case B and 5’7” in Case C.

The fact that the breakeven male height decreases as the female preference range increases (from Case A to Case B to Case C) is significant because it very clearly shows that there are two ways to solve the attraction disadvantage that shorter men face: (1) make them taller or (2) get women to change their preferences. The former is a solution that comes at a steep physical and financial cost (to an individual), while the latter is more of a mental and cultural challenge (that may in fact be more difficult to achieve as a society).

As a 5’7” man, I find it amusing that I would boost my potential prospects by 21% if I were two inches taller in Case A, while that figure would shrink to only 5% in Case B and 0% in Case C.
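Both the percentage boost and the breakeven heights can be reproduced with a few lines of Python. This is a sketch under the post's distribution assumptions. The bisection search and its 60" to 80" bracket are choices made for this example and may not match how the spreadsheet finds the breakeven.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def man_prospects(height, lower, upper=12.0, mean=65.0, sd=3.5):
    """Share of women whose window [w + lower, w + upper] contains this man."""
    return normal_cdf(height - lower, mean, sd) - normal_cdf(height - upper, mean, sd)

def two_inch_gain(height, lower):
    """Relative change in prospects from being two inches taller."""
    return man_prospects(height + 2.0, lower) / man_prospects(height, lower) - 1.0

def breakeven_height(lower, low=60.0, high=80.0):
    """Bisect for the height where two extra inches change nothing.

    Assumes the gain is positive at `low` and negative at `high`.
    """
    for _ in range(60):
        mid = (low + high) / 2.0
        if two_inch_gain(mid, lower) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

for case, lower in {"A": 0.0, "B": -3.0, "C": -6.0}.items():
    print(case,
          f"gain at 5'7\": {two_inch_gain(67.0, lower):.0%}",
          f"breakeven: {breakeven_height(lower):.1f} in")
```

Because the prospects curve is symmetric around its peak, the breakeven always lands exactly one inch below the ideal height for that case.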

So ladies, please show some love for shorter men! It will make our decision to not get height enhancement surgery that much easier.

Fun with Excel #4 – The Laws of Attraction II

October 26, 2013 | Jeff | dating, fun with excel, laws of attraction

As some of you may recall, I kicked off the Fun with Excel series with a post on attraction, where I hoped to explore the mechanics of physical attraction from a statistical perspective. Due to the amount of feedback I have received (positive, skeptical, or otherwise), I have decided to write a follow-up post.

In the first part of The Laws of Attraction, I focused solely on physical attraction, and the impact of bias in our perception of attractiveness on seeking a compatible partner. In part two, I focus on the bigger picture: given a set of personal traits, what is the probability that you will find someone with those traits at the specific level that you desire?

Background: In the song One In A Million, Ne-Yo sings about a girl who he calls “one in a million.” Of course, not content with just enjoying the music, I wondered to myself what it actually meant to be “one in a million.” One way of measuring this is by breaking down attraction into a larger set of personality traits and trying to quantify our desires, which is essentially what online dating services do with their “matching formulas.” For purposes of our exercise, let’s say you have a list of 10 distinct characteristics that you believe to be important and that you actively look for when searching for a partner. You might be more picky on some traits than others, but it isn’t too hard to quantify your objectives. Similar to my previous project, I quantify these objectives in terms of percentile, which, at least from a guy’s perspective, is pretty straightforward. For example, I might say, “I’m only interested in a girl who’s in the 80th percentile for Trait 1, 90th percentile for Trait 2, 50th percentile for Trait 3…” and so on and so forth. Now, the question is “what are the chances that such a girl exists?” A closely related question is “how many such girls are out there?”, followed by the not-so-fun reality-check of “what are the odds that I’ll actually find such a girl?”

The Model: While we won’t tackle the last question in this post, the first two are pretty straightforward to simulate from a mathematical standpoint. For each trait, the probability of finding someone who is at the X-percentile or higher of that trait is (100-X)%. For multiple traits, all we have to do is multiply these probabilities together, but the key assumption here is that all the traits are independent. Obviously, this isn’t true in real life, but we’ll revisit this point in a little while.

Assuming we start with a set of 10 traits, I will define a person having N “Perfect” Traits as someone who ranks at the 90th percentile or higher in N traits, and at the 50th percentile or higher in the remaining 10 minus N traits. Thus, assuming a world population of 7.12 billion, a male/female split of 50/50, and that you are heterosexual, the number of potential partners with 0 “Perfect” Traits walking on the planet is 3,476,563, or 1 in 1,024 (the mathematically inclined should immediately realize that 1,024 = 2^10). At the other end of the spectrum, there are theoretically only 18 people with 9 “Perfect” Traits, or 1 in 200 million. Note that a person with 10 “Perfect” Traits technically doesn’t exist, as probability indicates a 1 in 10 billion chance. At this point, the astute reader will note one possible answer to Ne-Yo’s earlier problem: if you consider a smaller set of 6 traits rather than 10, a “one in a million” girl would simply be a girl who has 6 “Perfect” Traits (all 6 traits at the 90th percentile or higher) in that scenario.
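The odds quoted here are consistent with counting the number of ways to pick which N traits are the “Perfect” ones and multiplying by the per-trait probabilities. The Python sketch below is a reconstruction from the published figures (1 in 1,024, 1 in 200 million, and the 1 in 400 quoted later for 3 of 5 traits). The binomial-coefficient term is therefore an inference, and the actual spreadsheet formula may be laid out differently.

```python
from math import comb

def match_probability(total_traits, perfect_traits, perfect_cut=0.10, ok_cut=0.50):
    """Chance of N traits at the 90th percentile or higher and the rest
    at the 50th percentile or higher, assuming independent traits.

    The comb(...) factor (which N traits are the 'Perfect' ones) is inferred
    from the odds quoted in the post, not taken from the spreadsheet.
    """
    ways = comb(total_traits, perfect_traits)
    return ways * perfect_cut ** perfect_traits * ok_cut ** (total_traits - perfect_traits)

def one_in(probability):
    """Express a probability as '1 in X' odds."""
    return 1.0 / probability

# Half of the assumed world population of 7.12 billion.
POOL = 7.12e9 / 2

for n in (0, 9, 10):
    p = match_probability(10, n)
    print(f"{n} perfect traits: 1 in {one_in(p):,.0f}, about {POOL * p:,.1f} people")
```

The same function with `total_traits=6, perfect_traits=6` gives the one-in-a-million case.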

The Results: I plotted the entire spectrum of N “Perfect” Traits in the scenario of 10 traits, to arrive at the following graph:

It should be no surprise that our graph strongly resembles a normal curve, as we are working with a binomial distribution.

I suppose the lesson here is that it doesn’t pay to be picky, but recall the very important (and incorrect) assumption we made earlier that all traits occur independently of one another. In the real world, however, this couldn’t be further from the truth. Creativity may be correlated with Curiosity, Honor may be correlated with Kindness, and Intelligence may be highly correlated with (or the cause of) all the other traits. Accounting for the dependencies between and among all 10 traits would require us to estimate both marginal and conditional probabilities, which would not only be difficult, but also complicate our model very quickly. Statistical mumbo jumbo aside, what this means is that the probabilities estimated by a simple binomial model are far too conservative (too low). This should be great news for all the picky daters out there.
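To get a feel for how much independence understates the odds, here is a rough Monte Carlo sketch in Python for just two traits. It assumes the traits are standard normal with a correlation of 0.6. That value, the seed and the sample size are all invented for illustration and come from neither the post nor any data.

```python
import random
from statistics import NormalDist

def joint_top_share(rho, cutoff_pct=0.90, draws=200_000, seed=1):
    """Estimate P(both traits >= the cutoff percentile) at correlation rho.

    Both traits are modelled as standard normal; this is an illustrative
    assumption, not part of the original Excel model.
    """
    rng = random.Random(seed)
    z_cut = NormalDist().inv_cdf(cutoff_pct)
    hits = 0
    for _ in range(draws):
        first = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        # Build a second trait with correlation rho to the first.
        second = rho * first + (1.0 - rho ** 2) ** 0.5 * noise
        if first >= z_cut and second >= z_cut:
            hits += 1
    return hits / draws

independent = joint_top_share(0.0)
correlated = joint_top_share(0.6)
print(f"independent traits: {independent:.2%}")
print(f"correlated traits:  {correlated:.2%}")
```

With independent traits the estimate should sit near the 0.1 x 0.1 = 1% that the multiplication rule gives, while the correlated pair comes out noticeably higher.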

An alternative way of tackling the dependent traits problem is to simply consider a smaller set of traits. For example, if we created a list of 10 traits, and then realized that two of them were very highly correlated with each other, then we could eliminate one of them and simply consider a 9 trait model, which in turn would be a more accurate simulation of what the actual probabilities might look like in real life. To that point, I also plotted out graphs for scenarios involving 7 traits and 5 traits:

Note that as we decrease the number of traits, the number of potential candidates increases exponentially. So if you only considered 5 main traits, and furthermore were only picky about 3 of them (3 “Perfect” Traits in the graph above), then you would only be looking at a probability of 1 in 400. Not bad.

Conclusion

At the end of the day, it is perhaps silly to attempt to model real life human dynamics with 50 lines in Excel. But that would also be missing the point of the exercise. Thinking about real world problems from a different perspective (whether it is psychologically, statistically, or otherwise) can shed new light on the issue, or simply affirm something we already knew or suspected. Even if it is only the latter, there is still value derived from being able to connect the dots between a variety of different frameworks.

As for me, my dream girl in the 10 Trait model is about 1 in 5,925,926, and about 1 in 53,333 in the 5 Trait model. I’m not sure if I’ll ever find her, but it’s satisfying to know that she’s out there.

-J

Fun with Excel #1 – The Laws of Attraction

July 18, 2013 | Jeff | fun with excel, laws of attraction

This post is dedicated to my Dad, whose lifelong passion for learning has been an inspiration for my own never-ending pursuit of excellence and the truth. Happy birthday Pops!

—

Disclaimer: This post is the first of (hopefully) many in a new series called “Fun with Excel,” where I use Microsoft Excel to model out and explore interesting real world topics.

—

This week, I explore the topic of physical attraction from a statistical perspective.

Okay, I admit I was very tempted to insert a provocative picture of <Name of Hot Actress/Model>, but that would have been trying too hard. So what is physical attraction and how does it work? Again, I want to reiterate the fact that I am talking about physical attraction (aka Hot or Not), none of this lovey-dovey emotional stuff. So I don’t want to read a comment later that says, “But Jeff, you didn’t incorporate personality into the model!” No, I didn’t, and that was on purpose.

Background: For starters, an assumption: attractiveness is (mostly) objective. Sure, we’ve all heard the phrase “beauty is in the eye of the beholder,” and this saying certainly holds merit. Ask a group of men to rank a group of women by attractiveness (or vice versa), and it is highly unlikely that you will get two identical rankings. However, the correlation between the rankings should be statistically significant. Height, skin complexion, and body proportion are just three of many physical traits that play a role in defining a person’s “objective” attractiveness. The ancient Greeks figured this out millennia ago, but in case you’re not convinced, here’s a short excerpt from Malcolm Gladwell’s Blink. So if attractiveness is indeed objective, it seems reasonable that we can also assume that it is normally distributed. Data collected from the popular dating site OkCupid seems to suggest that this could be the case:

Ignoring the message distribution lines for a moment, we notice that while males rate females on a normal distribution, women seem to rate males on a log-normal distribution. Ouch. So does this mean that most men are just ugly? Not quite. Remember, both males and females are ranking the opposite sex based on their perceptions of attractiveness. But if we know that attractiveness is objective, what might cause the discrepancy between the perceived log-normal distribution and the actual normal distribution? One likely explanation is superiority bias, which is psychology speak for narcissism. Superiority bias states that humans tend to overestimate their positive qualities and underestimate their negative ones. If that sounds familiar, it’s because it is. Superiority bias is documented in almost everything we do, from our perception of our own intelligence to our driving skills (oh God). However, the superiority bias is nothing more than an illusion. 80% of people might rate themselves above average on driving skills, but this is a logical fallacy. By definition, 50% of the population must be above average drivers. The same principle should hold true when it comes to beauty: half of the population is above the average attractiveness level, while half is below. Clear? Ok, let’s move on to the model.

The Model: Throughout the model, I thought about attractiveness on a percentile basis rather than on a raw scale (1-10). Although these methodologies should theoretically yield the same results, it is often more natural for people to think on a linear scale. However, this tendency actually has the impact of embedding our biases into our ratings, causing us to be less objective for the reasons stated above. I found that thinking about things on a percentile basis forces us to consider the situation from a more objective perspective. Rather than ask “Is this person a 9.5 (out of 10)?” which leads us to question what a “9.5” constitutes in the first place, we can ask “Is this person 3 standard deviations above the mean?” The latter has an inherent meaning, namely, if you put this person in a room of 1,000 people is he/she the most attractive person in the room? In addition to assuming a normal distribution for the attractiveness of both men and women, I gave each group two additional characteristics: superiority bias (%) and seek range (%).

I incorporated superiority bias by applying it as a scaling factor for how people perceived their own attractiveness. For example, if you have a true attractiveness-percentile (a-perc) of 50% (i.e. you’re average) and a superiority bias of 0%, then you would perceive yourself as also having an a-perc of 50%. However, if you had a superiority bias of 20%, then you would perceive yourself as having an a-perc of 70%.

Seek range refers to how wide a person looks when looking for a potential partner. There are a few important things to be noted about how the seek range is actually incorporated into the model. First, the seek range is based on one’s perceived a-perc and not their true a-perc. Think about it. If we believe we are more attractive than we actually are, then it makes sense that we would attempt to seek out other people whom we believe to be around the same attractiveness. So returning to our example, if you have a true a-perc of 50% and a superiority bias of 20%, you would perceive your a-perc to be 70%. If you also had a seek range of 20%, you would look for potential partners with a true a-perc between 60% and 80% (I assume for simplicity that people will seek both upwards and downwards equally, except in boundary cases). The rationale for using true a-perc here rather than perceived a-perc is the observation that other people tend to perceive us more objectively than we do ourselves. In other words, superiority bias is something that affects your own perception and not the judgment of others.

By default, the model assumes that the “seeker” is a male who is looking for a “target” female (it is quite easy to change this if desired). Furthermore, the model has the option to customize the superiority bias and seek range of the male and female populations independently.

The Goal: Given a set of assumptions for the 4 input variables (2 biases and 2 seek ranges), the model includes a macro that iterates the seeker’s true a-perc from ~0% to ~100%, returning the compatibility range, which is the range of targets that is also interested in the seeker. Remember that while attraction can be one or two-sided, we are only interested in how changing the input variables will impact the area of mutual attraction. Building on our example from earlier, recall that the seeker is a male with a true a-perc of 50% and a superiority bias of 20% (and therefore a perceived a-perc of 70%). With a seek range of 20%, he is looking for females with a true a-perc between 60% and 80%. Conversely, assuming that females also have a bias of 20% and range of 20%, we can back-solve to figure out that the set of females interested in the seeker have a self-perceived a-perc between 40% and 60% (don’t continue reading until this makes sense to you). This in turn corresponds to the set of females with a true a-perc between 28.6% and 42.9% (by reversing the superiority bias). However, recall that the seeker is only interested in females with a true a-perc between 60% and 80%. So it is obvious that in this case the compatibility range is 0%, and the seeker goes home unhappy to eat his bowl of ramen noodles and cry himself to sleep.
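The worked example can be traced step by step in Python. One caveat: the post never spells out the scaling formula, so the factor of (1 + 2 x bias) used below is inferred from the two data points given (50% true becomes 70% perceived, and 40% to 60% perceived maps back to 28.6% to 42.9% true). The sketch also ignores the boundary adjustments near 0% and 100%, so it only applies to seekers in the middle of the distribution.

```python
def perceived(true_aperc, bias):
    """Self-perceived a-perc. The (1 + 2 * bias) factor is an inference
    from the post's worked example, not a formula the post states."""
    return true_aperc * (1.0 + 2.0 * bias)

def true_from_perceived(perceived_aperc, bias):
    """Reverse the superiority bias."""
    return perceived_aperc / (1.0 + 2.0 * bias)

def seek_window(true_aperc, bias, seek_range):
    """True a-percs the seeker looks for, centred on the perceived a-perc."""
    centre = perceived(true_aperc, bias)
    return centre - seek_range / 2.0, centre + seek_range / 2.0

def interested_targets(seeker_true, target_bias, target_range):
    """True a-percs of targets whose own seek window contains the seeker."""
    low = true_from_perceived(seeker_true - target_range / 2.0, target_bias)
    high = true_from_perceived(seeker_true + target_range / 2.0, target_bias)
    return low, high

def compatibility_range(seeker_true, bias, seek_range, target_bias, target_range):
    """Width of the overlap between who the seeker wants and who wants him."""
    want_low, want_high = seek_window(seeker_true, bias, seek_range)
    got_low, got_high = interested_targets(seeker_true, target_bias, target_range)
    return max(0.0, min(want_high, got_high) - max(want_low, got_low))

print(seek_window(0.5, 0.2, 0.2))          # who the seeker wants
print(interested_targets(0.5, 0.2, 0.2))   # who wants the seeker
print(compatibility_range(0.5, 0.2, 0.2, 0.2, 0.2))
```

Setting both biases to zero makes the two windows coincide, so the overlap equals the full 20% seek range.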

The Results: I first explored the impact of the magnitude of the superiority bias by keeping the bias assumptions symmetrical. Here are the results:

The base case where the superiority biases = 0% paints an interesting picture of what happens at the two extremes. Due to the way that seek range is incorporated in the model, once a-perc reaches either the low-end or high-end, the seek range becomes asymmetrical since the range itself remains the same at all points. I won’t delve too deeply into the mathematical analysis of why the lines look exactly the way they do, but intuitively these results should make sense. People with very low a-percs have a smaller compatibility range since fewer people are interested in them, while the middle of the pack flattens out as expected. People at (and slightly above) the 80% level receive a wide range of interest from the opposite sex, but their compatibility range is still capped at their own seek range of 20%. Lastly, people with very high a-percs also experience a smaller compatibility range, due to the fact that there are simply fewer people pursuing them (more on this later).

Things get interesting as we increase the superiority biases. The middle of the curve becomes more V-like as the bias increases, until the whole curve becomes very bimodal at the 15% and 20% bias levels. In other words, significant superiority bias has a disproportionately negative impact on those of average attractiveness. Due to both their own bias and a symmetrical bias in their targets, these Average Joes will aim for women who won’t be interested in them. Similarly, the range of women who are interested in the Average Joe is below his seek range. In the scenario where both males and females hold a superiority bias of 20%, half of the men (with a-percs between 25% and 75%) end up with a compatibility range of 0%. Now, before you raise your hand and point out that 20% is a very high value to assign a superiority bias, ask yourself this: given a roomful of 100 of your peers, would you rank yourself in the top 30 in terms of attractiveness? If this doesn’t seem entirely ridiculous, then your superiority bias may be larger than you thought. Given the somewhat bleak picture painted by Chart 1, should we all just give up on love and dating if our chances of being attracted to someone who also happens to be attracted to us are so low?

Luckily, no. For one, the assumption that superiority bias is symmetric might not be correct. Remember the two OkCupid charts above, which seem to suggest that males perceive female attractiveness normally while females perceive male attractiveness log-normally? Well, one way to actually incorporate this discrepancy into our model is by making the superiority biases asymmetric. Thus, if we accept the findings of the OkCupid study to be valid for the general population, then we should give women a larger superiority bias than men.

In Chart 2, I’ve kept the female superiority bias constant at 20% for all the plots, while changing the male bias from 20% to 0%. Note that this has the impact of skewing the V-shape part of the plot to the right, while the boundary cases remain unchanged. From these plots, we see quite clearly that even if we make a conscious effort to reduce our superiority bias or even remove it entirely, it doesn’t get us too far if the other side doesn’t reciprocate. So now what?

There are still two things we haven’t considered. As Chart 3 demonstrates, increasing the seek range can in fact compensate for a high superiority bias in the opposite sex (compare the red plot to the orange one). However, note that the vast majority of the benefits resulting from increasing the seek range end up going to the high-end (those with high a-percs), with less impact on the lower end of the scale and almost no impact on the middle section. Finally, we must remember that people’s preferences (and even those of the entire population) can change over time. As we grow older, we gain a better understanding of ourselves as well as a better sense of what kind of partner we’re looking for. This may cause us to lower our own superiority bias while either increasing or decreasing our seek range. For example, the black plot in Chart 3 is my best guess at what the dating scene might look like around the age of 28-35, where most people (and perhaps women more so than men) are looking to get married. This plot looks more along the lines of the black “zero-bias” plot in Chart 1, and features a less severe V-shape, which means there is hope yet for our Average Joe 🙂

So What?

Statistics are interesting and modeling is fun (right Troy?). Our model seems to do a relatively decent job of demonstrating how attraction works, but it’s all somewhat meaningless if we can’t draw some larger conclusions about the real world. To that end, I offer the following points:

Be more flexible. Remember that the upper limit of your success will always be your seek range. It doesn’t matter if you are in the 90th or 10th percentile of attractiveness. If you only search within a 5% range for potential partners, your compatibility range is at most 5% as well! Although this point may seem obvious, it reinforces the idea that people of average attractiveness (particularly within 1 SD of the mean) should broaden their seek range in order to increase their chances of success. (Note that increasing your compatibility range is not the same as increasing your utility…but this topic is best left for another day).
It doesn’t hurt to aim high. Take advantage of the fact that people with very high a-percs actually have a smaller compatibility range. This is because there simply aren’t enough people who are able to pursue the high a-percs, since you yourself would need to have a very high a-perc in order for the very top echelons to fall within your seek range. You can differentiate from the crowd by either lowering your superiority bias or increasing your seek range (or both), and doing so will increase your probability of success.
Patience can pay off. After all, attraction involves two parties, and as our model has shown, you can only do so much on your side of the equation to impact the overall reality. However, people are neither homogeneous nor static in their preferences, even though the model assumes they are both. So even if things aren’t working out at the moment, don’t give up, because they eventually will.

—

Phew! What a journey. When the idea for the first topic in the Fun with Excel series popped into my head, I didn’t expect to end up this deep in the weeds. The Excel model was a product of a couple days’ thought process, and several hours of actually building out the model and stress testing it. My first version included a particularly nasty macro to continuously automate Excel’s goal seek function, but luckily I was able to figure out a way to reverse the implementation of the superiority bias using mathematics, which sped up the process of actually generating results. If you’re interested, you can take a look at the model here. You are free to play around with it as you like, but if you plan to use it or modify it for academic, commercial, or any other purpose that involves publication, I ask you to please provide the proper attribution.

I thoroughly enjoyed working on this project, and welcome all your questions and comments below. If you have any suggestions of future topics I could pursue in the Fun with Excel series, please let me know!

-J

Jeffrey Fan

Random Musings of an Amateur Data Scientist
