Predicting Terrorism / Bayesian Inference

About the Dataset

From Kaggle:

The Global Terrorism Database (GTD) is an open-source database including information on terrorist attacks around the world from 1970 through 2015 (with annual updates planned for the future). The GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 150,000 cases. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland.

Geography: Worldwide

Time period: 1970-2015, except 1993 (2016 in progress, publication expected June 2017)

Unit of analysis: Attack

Variables: >100 variables on location, tactics, perpetrators, targets, and outcomes

Sources: Unclassified media articles (Note: Please interpret changes over time with caution. Global patterns are driven by diverse trends in particular regions, and data collection is influenced by fluctuations in access to media coverage over both time and place.)

Definition of terrorism:

"The threatened or actual use of illegal force and violence by a non-state actor to attain a political, economic, religious, or social goal through fear, coercion, or intimidation."

You will need to make use of the codebook that functions as a data dictionary.

Moreover, this 2007 paper by LaFree (UMD) introduces the dataset very well.

You must download the 28MB zip file, unzip it, and work with the 128MB csv on your machine.

Goal

In the process of maintaining the GTD, you'll note that the year 1993 has zero recorded instances. Due to the many actors maintaining the dataset, this year has been lost.

Your goal is to impute the number of bombing/explosions that occurred in 1993. (You'll note that even we do not have the answer. That increases our focus on your methods.)

Part One: EDA

You should gain an understanding of the attacks (focus on attacktype1), their distribution across the world, and their frequency.

Part Two: Bayesian Inference

Terror attacks are a ripe area of research for Bayesian inference. Given their infrequency, it is (thankfully) difficult for us to assume a high number of samples that approach some normal distribution.

Because of this, we should construct a prior about the amount of terror a given area has seen and update that prior with new information (like a new year of attacks or a contrasting country from within the same region).

You should structure your own test of populations rather than using the above example. If you're unable to setup a different test, brainstorm with your squad in the Slack chat.

You must justify the prior you selected and interpret your results (use credible intervals.) Remember you can attempt to use different priors (but don't "prior hack" to affect your output!)

Part Three: 1993

The year 1993 is missing from our dataset! Given there is a wealth of information across different types of attacks, we will focus analysis on attacktype1 bombings (category 3, as per the codebook)

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.DS_Store		.DS_Store
Final Conclusions.md		Final Conclusions.md
README.md		README.md
Solution Code.ipynb		Solution Code.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Predicting Terrorism / Bayesian Inference

About the Dataset

Goal

Part One: EDA

Part Two: Bayesian Inference

Part Three: 1993

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Predicting Terrorism / Bayesian Inference

About the Dataset

Goal

Part One: EDA

Part Two: Bayesian Inference

Part Three: 1993

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages