A taboo yet neglected topic for the students at the University of Pennsylvania is contraception and the risk of unplanned pregnancy.

At the surface level, it seems these problems have been solved: female students at Penn have access to the best contraception options on the market through our beloved SHS (student health services). The gynos (gynaecologists) in SHS's Women's Health division are best known for their immediate availability for appointments. Essentially all of us at Penn choose to study and finish our four-year degrees through our late teens and early twenties, rather than have children.

Yet behind the career aspirations and the free condom giveaways, there are horror stories that students never talk about. It’s tacit knowledge that girls go to CVS to buy Plan B for their anxious friends who don’t want the purchase to show up on their credit card statement. Other women have had to travel to New York to get abortions. One couple has an accidental, but much loved, child being raised by the girl’s parents at home. Just because Penn students have access to the best contraception does not mean they use contraception consistently and correctly all the time. At some point, every individual and couple at Penn will have a pregnancy scare, whether that be from a condom breaking, or a period arriving a couple days late.

This data project explores national statistics on unplanned and teenage pregnancies in the United States, as well as reasons for contraceptive failures. As it happens, roughly half of unplanned pregnancies arise from contraceptive failures, and this is something that Penn students are not immune to. To address this, I also cite some of the leading academic research papers in the field of reproductive health, in order to better illuminate which methods people are most likely to use inconsistently and why, and how Penn students can make better contraceptive decisions.

Though Penn may have lower rates of unplanned pregnancies than these national statistics, we could still all benefit from being aware of how errors in contraceptive use are likely to occur. We could also care more for our partners, and spare each other the stress of the Plan B run to CVS. As we will explore, the contraceptive industry itself is changing, and this can provide new opportunities for female and male students to have important conversations with their partners and pick which method is best for them.

For my analyses, I accessed the data through open online datasets and academic journal articles, after which I cleaned the data in OpenRefine and Excel, and then visualized the data in Tableau. To ensure that my findings are relevant and relatable, I focus only on statistics from the United States.



The National Center for Health Statistics (NCHS), part of the Centers for Disease Control and Prevention (CDC), provided me with numerous datasets on the subjects of unplanned and teenage pregnancy rates. Their "Births and Fertility Rates: United States" dataset looks at birth rates since 1909, a useful benchmark that puts teenage and unplanned pregnancy rates in context. I then looked at the "US and State Trends on Teen Births" dataset, to compare the rate of teenage pregnancies per state over time.

The US government's Pregnancy Risk Assessment Monitoring System (PRAMS), which is also run under the Reproductive Health division of the CDC, gathers state-specific surveys from thousands of women a year on their maternal experiences before, during and after pregnancy. However, their website was very convoluted, and without a data key the datasets were impossible to use.

Instead, I decided to use "Unintended Pregnancy Rates at the State Level", published by an advocacy group, the Guttmacher Institute, in June 2011.


I use Google Trends data to analyze which forms of contraception were most frequently searched for in the US, and how their popularity in terms of searches has evolved since January 2010. Originally I planned to show trends since January 2004 when the datasets began, but there were too many month increments from 2004 to present, so I narrowed the range of years that I was looking at. The introduction of new products on the birth control market can be evaluated based on this data, where we see an increase in the number of searches for the Skyla and Liletta IUD (intra-uterine device) products, which were introduced to the market in 2013 and 2015 respectively.

I also use these trends to demonstrate how changes in search and popularity evolve over time, such as the fluctuations in the number of searches for "Planned Parenthood" compared with a more generic search for "birth control".


Penn's Nursing School is known for its high-quality teaching on reproductive health, and after speaking with Professor Dawn Durain there, I was able to access numerous relevant research papers on the popularity and discontinuation rates of contraceptives. She directed me towards the National Health Statistics Reports, such as "Contraceptive Methods Women have Ever Used: United States, 1982-2010" and "Sexual Activity and Contraceptive Use Among Teenagers in the United States, 2011–2015", and the National Institutes of Health (NIH)'s Public Access articles, such as "Oral Contraceptive Discontinuation: Do Side Effects matter?". These reports survey a statistically significant number of women to try to understand why they might have discontinued a particular method of contraception. Discontinuing a contraceptive method is what puts young people so at risk of an unplanned pregnancy, and it also signals where there may be user dissatisfaction and error in a method's use.

Converting the static data tables found in academic articles into workable data that I could use for visualizations was a tricky task. Luckily, I was able to use OpenRefine to split values that were separated by spaces into columns, and proceed.
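The same split can be sketched in a few lines of Python. This is only an illustration of the idea, not the OpenRefine steps I actually ran; the function name and the sample rows are made up for the example and are not values from the reports.

```python
import re


def split_row(line: str) -> list[str]:
    """Split one copied table row into cells on runs of whitespace."""
    return re.split(r"\s+", line.strip())


# Hypothetical rows as they might look after copying from a PDF table.
raw_rows = [
    "MethodA  1200   30.5",
    "MethodB 800 12.0",
]

# Each row becomes a list of cells, i.e. one value per column.
table = [split_row(row) for row in raw_rows]
```

One caveat with this approach is that a label containing a space (for example a two-word method name) would be split into two cells, so such labels need to be joined or renamed before splitting.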

Visualizations + Analysis

"Births and Fertility Rates: United States"

The first thing I wanted to show was that since 1909, the crude birth rate of the United States, or the number of live births per 1,000 mid-year total population per year, has been decreasing. In spite of the "Baby Boom" of the 1950s, the crude birth rate has been trending downwards, and the most recent figure recorded, for 2015, was an all-time low of 12.40. This indicates that fewer children in general are being born, which might skew our later findings on the changes in rates of teenage and unplanned pregnancies.
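To make the definition of the crude birth rate concrete, here is a minimal sketch of the calculation. The function name is my own, and the births and population figures are hypothetical round numbers chosen only so that the result matches the 12.40 figure quoted above; they are not the actual 2015 counts.

```python
def crude_birth_rate(live_births: int, midyear_population: int) -> float:
    """Live births per 1,000 mid-year total population."""
    if midyear_population <= 0:
        raise ValueError("mid-year population must be positive")
    return 1000 * live_births / midyear_population


# Hypothetical example: 12,400 births in a population of 1,000,000.
rate = crude_birth_rate(12_400, 1_000_000)
```

Because the denominator is the whole population rather than women of child-bearing age, the crude rate also moves when the age structure of the population changes, which is worth keeping in mind when comparing it with the age-specific teen rates.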

"US and State Trends on Teen Births"

Now turning to teenage pregnancy rates, we see in the figure below that since 1960 these rates have significantly declined, well in line with the general fall in the crude birth rate of the population. The area of each age group is relative to its size, indicating that births for women in their early teens (10-14) are significantly lower than for women in their late teens (18-19), for example. Since 2010, the teenage pregnancy rate of women in their mid-teens (15-17) has declined significantly.

What I did not notice when creating this graph in Tableau was that there is a mistake in the labelling of the columns, and that there are overlapping categories of data (ages 15-17, 15-19 and 18-19). This is puzzling because it also appears that 18-19 takes up more area than 15-19. As it transpires, the CDC measures both the overall average of the 15-19 age group and these narrower age-range components. Ensuring the columns are appropriately labelled is important for the strength of these analyses, and this is a mistake that I hope not to make again.
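One way to guard against this kind of overlap is to drop the aggregate age group before stacking the narrower ones. The sketch below assumes each row has an "age_group" field with labels like the ones above; the field names and the rates are hypothetical and do not come from the CDC file.

```python
# Labels assumed to be aggregates of narrower groups in the same table.
AGGREGATE_GROUPS = {"15-19"}


def drop_aggregates(rows, aggregates=AGGREGATE_GROUPS):
    """Keep only rows whose age group is not an aggregate of other rows."""
    return [row for row in rows if row["age_group"] not in aggregates]


# Hypothetical rows for a single year.
rows = [
    {"year": 2010, "age_group": "10-14", "rate": 0.4},
    {"year": 2010, "age_group": "15-17", "rate": 17.0},
    {"year": 2010, "age_group": "18-19", "rate": 58.0},
    {"year": 2010, "age_group": "15-19", "rate": 34.0},
]

# Only the non-overlapping groups are left for a stacked area chart.
stackable = drop_aggregates(rows)
```

The aggregate 15-19 series can then be drawn separately as a line, so that it is not counted twice alongside its 15-17 and 18-19 components.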

This teen births dataset also had other demographic variables to introduce to the visualizations. If I were fitting a general linear model in R, I would be sure to test the significance of race as a variable, as we can see various disparities below.

What is most surprising about the separation of teenage birth rates by race is the sheer variation between races. Here we are specifically measuring the "rate per 1,000 females aged 15-19 based on a three-year average".

These graphs also reflect our previous findings that teenage pregnancy rates are decreasing. Teenage pregnancy rates are lowest in Non-Hispanic White populations and Asian or Pacific Islander populations. They are highest among Non-Hispanic Black populations and Hispanic populations. Yet in all race categories, we see that the teenage birth rate is decreasing, most significantly in the Non-Hispanic Black population.

"Unintended Pregnancy Rates at the State Level"

The population of each state clearly influences its number of unintended pregnancies for women aged 15-44, as shown in this map of the Lower 48 States. Drawing on the Guttmacher Institute's data from 2006, we can see from the colors that California, Texas, Florida and New York stand out as having high numbers of unplanned pregnancies. It might have been better to classify states into "Higher", "Medium" and "Lower" rates, because the color difference between Tennessee (70,000 unplanned pregnancies) and North Carolina (106,000 unplanned pregnancies) is minimal, even though 36,000 unplanned pregnancies sounds like a big difference to me!
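The "Higher", "Medium" and "Lower" classification suggested above could be sketched as follows. The state counts are the ones quoted in this write-up, but the two cut-off values are hypothetical and chosen only for illustration; in practice they might be set from the quantiles of the full dataset.

```python
def classify(count: int, low_cut: int, high_cut: int) -> str:
    """Assign a count to one of three bands using two cut-offs."""
    if count < low_cut:
        return "Lower"
    if count < high_cut:
        return "Medium"
    return "Higher"


# Hypothetical cut-offs, not derived from the Guttmacher data.
LOW_CUT = 50_000
HIGH_CUT = 100_000

# Unintended pregnancy counts as quoted in the text.
counts = {
    "Tennessee": 70_000,
    "North Carolina": 106_000,
    "Hawaii": 17_000,
    "Alaska": 8_000,
}

bands = {state: classify(n, LOW_CUT, HIGH_CUT) for state, n in counts.items()}
```

With three discrete bands, Tennessee and North Carolina would land in different colors on the map instead of two nearly identical shades.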

I then decided to sort the number of unintended pregnancies into an ordered bar chart, which is hopefully readable and shows the full values for each state for us to compare. Here we see that Vermont and Wyoming have about one hundredth of the number of unplanned pregnancies that California has. Additionally, we have the added benefit of being able to compare data from Hawaii (17,000) and Alaska (8,000).

Unfortunately, this data on unplanned pregnancies does not give any indication of the breakdown by age, because it measures all women of child-bearing age (15-44). Therefore, it cannot help us evaluate the relative risk of unplanned pregnancy for teenage women and women in their early twenties, like the students at Penn. However, we may still be able to look at other variables, which might tell us more about which women are most at risk across the country.

An important variable that I wanted to explore which I also found in the Guttmacher data is the percentage of pregnancies that were unplanned or unintended, by state. This information, I hoped, would give us more of a sense of the health disparities between states, and might directly indicate where education about contraceptives is lacking. According to this data, 65% (!) of pregnancies are unplanned in Mississippi. While that percentage may sound astronomically high, it is high for many states overall. While this map below does help us understand this, in future it might be better for me to visualize it using little pie charts on the map with the relevant percentages labelled, to give a clearer indication that these values are percentages.

Though states below the Mason-Dixon Line may have very different political leanings and methods of education on contraception, their rates are not that much higher than those of midwestern states like Ohio and Illinois. Still, visually there is a slight leaning towards Southern states being represented as darker, indicating generally higher rates of unplanned pregnancies. For example, Colorado and Kansas are above the Mason-Dixon Line and both have 48% of pregnancies unplanned, whereas New Mexico, Texas and Oklahoma right below them have 56%, 53% and 53% respectively. Even a difference of 5 percentage points in rates of unplanned pregnancy can have a massive impact on the lives of women and their children, and can indicate many things about the level of reproductive health and contraception education in the population.

Reading from the chart below, eleven states and D.C. have unintended pregnancy rates at 60% or over. Most states have unplanned pregnancy rates at over 50%. These numbers are close to insanity: 50% of the babies conceived in the U.S. are unintended. The prevalence of high rates of unintended pregnancy across states signals a lot of problems with the way that contraception and birth control are taught, used, and misused by couples. In some ways, this rate across states is surprisingly uniform (i.e. high!).

On the other hand, the percentage of unintended pregnancies that end in abortion, which I've plotted by state in the bar chart below, is a statistic far more indicative of local politics.

We find that New York, D.C. (not a state) and Connecticut, at 59%, 64% and 54% respectively, as well as other North Eastern states, have higher rates of abortion for unintended pregnancies. Interestingly, Pennsylvania is not in the upper quartile of states for this statistic, with only 33% of unintended pregnancies ending in abortion. Conversely, states harbouring stronger religious communities that are against abortion, such as Mormons in Utah and Idaho, evidently have lower percentages, at 15% and 21%.

State government policies are likely to have a big impact on whether unintended pregnancies can end in abortion, as we can see with South Dakota, where a mere 13% of unintended pregnancies end in abortion. A low figure like this prompts some more research into how state policy might explain this lack of access.

According to NPR in their October 2017 article "For Many Women, the Nearest Abortion Provider is Hundreds of Miles Away", abortion clinics are sparse in South Dakota, the waiting period for abortions is 72 hours, and state law requires women to meet with their doctor in advance of the procedure. To illustrate this, I took a screenshot of the "Travel Distance to an Abortion Provider in 2014" map (click the image for the link), where we can see that much of South Dakota is more than 180 miles away from an abortion clinic.

I gathered data from Google Trends, which maps the frequency and popularity of Google searches by geography over time. I wanted to see which forms of birth control were most often searched for among internet users in the United States. Whilst Google Trends data does not tell us much about who is searching for this information, the search rates can give us some indication of the general public's relative awareness of certain forms of birth control, and how this awareness is changing.

Below, I explore the differences in popularity since 2010 for various key birth control searches in the United States: "birth control", "plan b" (the most common brand of emergency contraception), "the pill" and "condom". What we see is that for these contraceptive methods, which have been around for some time, there is a pretty consistent level of interest over time. To get a more detailed look at the reasons for the fluctuations in the "birth control" searches, we could record the dates of the spikes, such as March 2012 and July 2014 (see below), and then try to understand what might have caused them.

The "birth control" search is much higher than the other more specific methods, which suggests that people might search for the generic term "birth control" and subsequently be directed towards learning about the many different methods. If I were to improve this graph in the future, I would find some way to make the dates across the bottom more comprehensible and relevant, just as Google has managed to do on their trends data page.

Whilst the graph above does not seem all that politicized, you don't have to go far in this contraception and reproductive healthcare data to find evidence of politics' influence on the public's interest in and understanding of birth control. For example, below I compare the relative frequency of searches for "birth control" as opposed to "Planned Parenthood" in the United States. We see two giant spikes in Google searches for Planned Parenthood, one in September 2015 and another in January 2017. The Congressional Hearing on Planned Parenthood's "baby parts" scandal took place in September 2015, and received much attention in the press, including the Washington Post's article, "5 moments when Congress's Planned Parenthood hearing got heated."

Whilst the popularity of older, more established forms of birth control has been relatively constant, as we saw above, a number of new IUD (intra-uterine device) contraceptive products have been released on the market in recent years. "Skyla" (not to be confused with searches for the Skyla Pokémon card) was introduced in 2013, and "Liletta" (shown in yellow below) in 2015. These two brands, along with Mirena and Kyleena, make up the four hormonal IUD brands currently on the market. Judging by the areas in the graph below, a generic search for "iud" in the United States is not as common as the brand-name searches, as the smaller proportion of red indicates.

Naturally, there has been a large increase in total searches for IUD products over the last eight years or so, and this reflects our general understanding in the birth control space that IUDs are increasing in popularity, especially among younger women. For academics and industry professionals in the reproductive health space, a dashboard of the changing popularity of Google Trends searches might be useful for keeping up with perceptions of products among young women, or perhaps similar data could be aggregated from Twitter.

"Contraceptive Methods Women have Ever Used: United States, 1982-2010"

Next, we move on to discontinuation rates and reasons for dissatisfaction with other contraceptive products. As in any data analysis, we are limited to the products the academics measured for the study, here the Patch, the Pill, Depo-Provera (injectables) and the Condom. There is no data on IUDs or the implant, which are the more common long-acting reversible contraceptives (LARCs), or on female sterilization.

The article "Contraceptive Methods Women have Ever Used: United States, 1982-2010", published in the National Health Statistics Reports, provides us with some key data tables on a statistically significant number of women's experiences with various birth control methods. From page 14 of this study, I copied the information into the data frames below from a table titled "Table 5. Number of women, aged 15-44 who have ever used the selected contraceptive method, percentage and number who used and discontinued the method, and reasons for discontinuation: United States, 2006-2010."

The effectiveness of contraception for a couple is dependent upon correct and consistent usage over time, a rule that Penn students must follow too. To reiterate, contraceptive failures make up about 48% of unplanned pregnancies, and discontinuation of a contraceptive method is a large part of this failure, making women particularly vulnerable as they are not using the methods consistently.

In the graph above on the left, we see that, among the women surveyed about discontinuation, the pill was by far the most commonly discontinued method and the patch the least. However, from the graph above on the right, we see that the patch and the Depo-Provera injectables had the highest discontinuation rates. A discontinuation rate above 30% is very high: it means, for example, that roughly one third of pill users will stop using the pill. This study does not address the discontinuation rate of the IUD, but it has been estimated to be much lower, at 15% (Contraceptive Technology, 20th Edition).
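The difference between the two graphs comes down to counts versus rates, which a small sketch can make explicit. The method names and numbers below are hypothetical and are not taken from Table 5; the function name is also my own.

```python
def discontinuation_rate(ever_used: int, discontinued: int) -> float:
    """Percentage of women who ever used a method and then stopped."""
    if ever_used <= 0:
        raise ValueError("ever_used must be positive")
    return 100 * discontinued / ever_used


# Hypothetical (ever used, discontinued) counts for two methods.
methods = {
    "method_a": (10_000, 3_000),  # widely used, many discontinuations
    "method_b": (1_000, 500),     # rarely used, fewer discontinuations
}

rates = {
    name: discontinuation_rate(used, stopped)
    for name, (used, stopped) in methods.items()
}

# The method discontinued by the most women is not the one with the
# highest rate, because far more women used it in the first place.
most_discontinued = max(methods, key=lambda name: methods[name][1])
highest_rate = max(rates, key=rates.get)
```

In this made-up example, method_a is discontinued by more women in absolute terms, but method_b has the higher rate, which is the same pattern as the pill versus the patch above.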

These sorts of figures are highly relevant to Penn students, because many female Penn students take the pill and, more broadly, every couple at Penn needs a birth control method they can rely on for a long period of time. For this reason, it is important to understand what discontinuation of a contraceptive method means, and why a woman might be likely to discontinue a method.

As we see below, the most common reason for discontinuation is "You had side effects", i.e. the female survey respondent experienced side effects bad enough to stop using the contraceptive. More than 60% of pill users and 60% of the Depo-Provera users who discontinued did so because of side effects. If women, and female Penn students for that matter, are going to continue to use contraceptive methods to avoid unplanned pregnancies, they must be sure to pick a method which has minimal negative side effects. As it happens, there is very little gynaecologists can do to determine in advance how each woman will react to each birth control method.

Another interesting point to note is that about 10% of pill users and 10% of patch users responded, "The method failed, you became pregnant", as their reason for discontinuation! A smaller proportion of these respondents were using Depo-Provera (which has a lower rate of human error because it is injected every three months). Though Depo-Provera's main reason for discontinuation is side effects (60% of those who discontinued), more attention could be given in another analysis assignment to the rates of effective use, because it seems to have caused relatively fewer pregnancies, though this study is small.

Additionally, from this survey, we can gather some general sentiments about condoms: about 40% of women who had used condoms but discontinued said it was because "Your partner did not like it", and the same percentage said they stopped using condoms because it "Decreased your sexual pleasure". Still, these facts are not as troubling as they might seem, because condoms had the lowest discontinuation rate of the four contraceptive methods surveyed, at only 10.8%. Even if sex is not as enjoyable for the couples surveyed, at least many of them continue to use condoms to have safe sex.


While many conclusions can be drawn from the analyses above, what is most relevant to Penn students is that couples must invest in finding a birth control method that does not give the female student side effects so terrible that she might discontinue the method. As we have seen from the broader national statistics, at least 50% of pregnancies in most states were unintended, which is really an astronomical figure. While we in Pennsylvania might not be as disadvantaged as the women in South Dakota when it comes to accessing an abortion clinic, only 33% of unplanned pregnancies in Pennsylvania end in abortion, so we still need to be aware that we really do want to avoid unplanned pregnancies as much as possible. This requires couples to speak openly and honestly about their use of contraception, and how they intend to avoid unplanned pregnancies together. These findings also apply to male students, who are equally uninterested in becoming young fathers, and who otherwise may feel resigned to having less enjoyable sex using condoms.

Assuming that students are sexually active, the only way to avoid unplanned pregnancies is to ensure that female Penn students find the birth control method that works best for them, one that they can use consistently and correctly throughout their college career.