From Kaggle:
The Global Terrorism Database (GTD) is an open-source database including information on terrorist attacks around the world from 1970 through 2015 (with annual updates planned for the future). The GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 150,000 cases. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland.
Geography: Worldwide
Time period: 1970-2015, except 1993 (2016 in progress, publication expected June 2017)
Unit of analysis: Attack
Variables: >100 variables on location, tactics, perpetrators, targets, and outcomes
Sources: Unclassified media articles (Note: Please interpret changes over time with caution. Global patterns are driven by diverse trends in particular regions, and data collection is influenced by fluctuations in access to media coverage over both time and place.)
Definition of terrorism:
"The threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation."
You will need to make use of the codebook that functions as a data dictionary.
Moreover, this 2007 paper by LaFree (UMD) introduces the dataset very well.
You must download the 28MB zip file, unzip it, and work with the 128MB csv on your machine.
In the process of maintaining the GTD, you'll note that the year 1993 has zero recorded instances. Due to the many actors maintaining the dataset, this year has been lost.
Your goal is to impute the number of bombing/explosions that occurred in 1993. (You'll note that even we do not have the answer. That increases our focus on your methods.)
You should gain an understanding of the attacks (focus on attacktype1), their distribution across the world, and their frequency.
Terror attacks are a ripe area of research for Bayesian inference. Given their infrequency, it is (thankfully) difficult for us to assume a high number of samples that approach some normal distribution.
Because of this, we should construct a prior about the amount of terror a given area has seen and update that prior with new information (like a new year of attacks or a contrasting country from within the same region).
You should structure your own test of populations rather than using the above example. If you're unable to setup a different test, brainstorm with your squad in the Slack chat.
You must justify the prior you selected and interpret your results (use credible intervals.) Remember you can attempt to use different priors (but don't "prior hack" to affect your output!)
The year 1993 is missing from our dataset! Given there is a wealth of information across different types of attacks, we will focus analysis on attacktype1 bombings (category 3, as per the codebook)