price. Using this sample, what is your best point estimate of the population mean?\n",
+ "\n",
+ "\n",
+ "- Since you have access to the population, simulate the sampling distribution for the average home price in Ames by taking 5000 samples of size 50 from the population and computing 5000 sample means. Store these means in a vector called sample_means50. Plot the data, then describe the shape of this sampling distribution. Based on this sampling distribution, what would you guess the mean home price of the population to be? Finally, calculate and report the population mean.\n",
+ "\n",
+ "\n",
+ "- Change your sample size from 50 to 150, then compute the sampling distribution using the same method as above, and store these means in a new vector called sample_means150. Describe the shape of this sampling distribution, and compare it to the sampling distribution for a sample size of 50. Based on this sampling distribution, what would you guess to be the mean sale price of homes in Ames?\n",
+ "\n",
+ "\n",
+ "- Of the sampling distributions from questions 2 and 3, which has the smaller spread? If we’re concerned with making estimates that are more often close to the true value, would we prefer a distribution with a large or small spread?\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": []
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "import numpy as np\n",
+ "import matplotlib.pyplot as plt\n",
+ "%matplotlib inline"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%config InlineBackend.figure_format = 'retina'\n",
+ "plt.style.use(\"seaborn\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "df = pd.read_csv('ames.csv')\n",
+ "price = df['SalePrice']"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Question 1"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "187271.76"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "price.sample(50 , random_state = 42).mean()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Question 2"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sample_mean50 = []\n",
+ "\n",
+ "for i in range(5000):\n",
+ "    # Draw 50 prices without replacement and record that sample's mean\n",
+ "    sample_mean50.append(price.sample(50).mean())\n"
+ ]
+ },
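The remaining questions (a second sampling distribution at n = 150 and a comparison of spreads) can be sketched as below. This is only an illustration under stated assumptions: `ames.csv` is not read here, so a synthetic right-skewed population stands in for `df['SalePrice']`, and the helper name `simulate_sample_means` is hypothetical rather than part of the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: a synthetic, right-skewed stand-in for df['SalePrice'].
# The size and lognormal parameters are arbitrary, not taken from ames.csv.
population = rng.lognormal(mean=12.0, sigma=0.4, size=3000)


def simulate_sample_means(values, sample_size, n_samples=5000, rng=rng):
    """Return n_samples means, each from a sample drawn without replacement."""
    return np.array([
        rng.choice(values, size=sample_size, replace=False).mean()
        for _ in range(n_samples)
    ])


sample_means50 = simulate_sample_means(population, 50)
sample_means150 = simulate_sample_means(population, 150)

# The centre of each sampling distribution estimates the population mean;
# the standard deviation measures how far a single sample mean tends to miss.
print("population mean:      ", population.mean())
print("mean of means, n=50:  ", sample_means50.mean())
print("mean of means, n=150: ", sample_means150.mean())
print("spread (std), n=50:   ", sample_means50.std())
print("spread (std), n=150:  ", sample_means150.std())
```

Both sampling distributions centre near the population mean, while the spread shrinks roughly in proportion to 1/sqrt(n), so the n = 150 distribution is the narrower one. With the real data, `population` would be replaced by `price.to_numpy()`.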
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
|  | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.305239 | 0.231272 | 0.088222 | 0.251882 | 0.385132 | 0.234758 | 0.645410 | 0.986175 | 0.760495 | 0.491813 |
| 1 | 0.894179 | 0.129102 | 0.520792 | 0.780366 | 0.785802 | 0.368216 | 0.207315 | 0.248793 | 0.587673 | 0.763330 |
| 2 | 0.306956 | 0.049282 | 0.675463 | 0.113234 | 0.869604 | 0.930263 | 0.696827 | 0.397656 | 0.459233 | 0.064965 |
| 3 | 0.272259 | 0.779780 | 0.380540 | 0.685579 | 0.785072 | 0.976008 | 0.412942 | 0.512427 | 0.956594 | 0.887955 |
| 4 | 0.622153 | 0.963447 | 0.755817 | 0.541017 | 0.254235 | 0.708141 | 0.150924 | 0.664176 | 0.261850 | 0.161177 |
| 5 | 0.811146 | 0.476762 | 0.855972 | 0.305416 | 0.808352 | 0.544968 | 0.843353 | 0.562328 | 0.826649 | 0.036099 |
| 6 | 0.523588 | 0.727435 | 0.659603 | 0.762771 | 0.462791 | 0.180671 | 0.026311 | 0.011348 | 0.820640 | 0.175180 |
| 7 | 0.290624 | 0.776585 | 0.648164 | 0.913340 | 0.240606 | 0.354553 | 0.271786 | 0.248880 | 0.084472 | 0.741984 |
| 8 | 0.526691 | 0.020449 | 0.372602 | 0.057064 | 0.331893 | 0.809467 | 0.815766 | 0.421695 | 0.876327 | 0.676441 |
| 9 | 0.469389 | 0.373859 | 0.917139 | 0.301419 | 0.361257 | 0.166470 | 0.001263 | 0.745675 | 0.487949 | 0.341857 |
| 10 | 0.586595 | 0.103862 | 0.542455 | 0.172138 | 0.231164 | 0.621282 | 0.303060 | 0.199532 | 0.592424 | 0.846125 |
| 11 | 0.874637 | 0.765251 | 0.446922 | 0.863143 | 0.934134 | 0.781621 | 0.578874 | 0.161245 | 0.364697 | 0.604686 |
| 12 | 0.372582 | 0.014192 | 0.141197 | 0.718072 | 0.985635 | 0.727378 | 0.832513 | 0.394161 | 0.892131 | 0.595044 |
| 13 | 0.747853 | 0.361451 | 0.307679 | 0.430487 | 0.333176 | 0.512498 | 0.421828 | 0.602016 | 0.756884 | 0.319003 |
| 14 | 0.806012 | 0.289871 | 0.438636 | 0.135046 | 0.456787 | 0.491738 | 0.091800 | 0.136814 | 0.859572 | 0.175451 |
| 15 | 0.633796 | 0.121392 | 0.551617 | 0.938901 | 0.037063 | 0.551676 | 0.989032 | 0.429235 | 0.139587 | 0.049697 |
| 16 | 0.369000 | 0.534016 | 0.841040 | 0.079688 | 0.757663 | 0.511047 | 0.485882 | 0.118199 | 0.556479 | 0.377192 |
| 17 | 0.922850 | 0.980879 | 0.299093 | 0.918864 | 0.870944 | 0.175901 | 0.749300 | 0.169334 | 0.537206 | 0.577845 |
| 18 | 0.585918 | 0.606926 | 0.210228 | 0.821542 | 0.206155 | 0.557342 | 0.548108 | 0.521981 | 0.054173 | 0.617419 |
| 19 | 0.042375 | 0.285725 | 0.220813 | 0.964579 | 0.865262 | 0.464721 | 0.220354 | 0.071457 | 0.548230 | 0.684320 |

|  | Name | Sex | Job | Job Status | Age | weight | Account | Transfer | Count | Religion |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Gbenga | male | Teacher | yes | 20 | 0.637354 | Yes | 0 | 0 | Christian |
| 1 | Femi | female | Banker | yes | 24 | 0.607694 | No | 1 | 1 | Muslim |
| 2 | Ade | male | Footballer | yes | 28 | 0.040372 | Yes | 0 | 2 | Christian |
| 3 | Bose | female | Trader | yes | 32 | 0.999901 | No | 1 | 3 | Muslim |
| 4 | Bolu | male | Teacher | yes | 36 | 0.239623 | Yes | 0 | 4 | Christian |
| 5 | Ayo | female | Banker | yes | 40 | 0.571719 | No | 1 | 5 | Muslim |
| 6 | David | male | Footballer | yes | 44 | 0.251429 | Yes | 0 | 6 | Christian |
| 7 | Esther | female | Trader | yes | 48 | 0.307864 | No | 1 | 7 | Muslim |
| 8 | Ifeoma | male | Teacher | yes | 52 | 0.714834 | Yes | 0 | 8 | Christian |
| 9 | Akeem | female | Banker | yes | 56 | 0.409580 | No | 1 | 9 | Muslim |
| 10 | Wale | male | Footballer | yes | 60 | 0.338789 | Yes | 0 | 10 | Christian |
| 11 | Abeeb | female | Trader | yes | 64 | 0.460655 | No | 1 | 11 | Muslim |
| 12 | Muraina | male | Teacher | yes | 68 | 0.893463 | Yes | 0 | 12 | Christian |
| 13 | Segun | female | Banker | yes | 72 | 0.750401 | No | 1 | 13 | Muslim |
| 14 | Tosin | male | Footballer | yes | 76 | 0.150023 | Yes | 0 | 14 | Christian |
| 15 | Siju | female | Trader | yes | 80 | 0.503903 | No | 1 | 15 | Muslim |
| 16 | Deji | male | Teacher | yes | 84 | 0.853731 | Yes | 0 | 16 | Christian |
| 17 | Blessing | female | Banker | yes | 88 | 0.880703 | No | 1 | 17 | Muslim |
| 18 | Funmi | male | Footballer | yes | 92 | 0.840192 | Yes | 0 | 18 | Christian |
| 19 | Yinka | female | Trader | yes | 96 | 0.922895 | No | 1 | 19 | Muslim |

|  | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 20.000000 | 20.000000 | 20.000000 | 20.000000 | 20.000000 | 20.000000 | 20.000000 | 20.000000 | 20.000000 | 20.000000 |
| mean | 0.548192 | 0.429577 | 0.493700 | 0.537727 | 0.548136 | 0.533436 | 0.464633 | 0.380156 | 0.571163 | 0.459379 |
| std | 0.245335 | 0.319636 | 0.244889 | 0.332432 | 0.299334 | 0.241799 | 0.299394 | 0.253898 | 0.278767 | 0.278714 |
| min | 0.042375 | 0.014192 | 0.088222 | 0.057064 | 0.037063 | 0.166470 | 0.001263 | 0.011348 | 0.054173 | 0.036099 |
| 25% | 0.353489 | 0.127174 | 0.305532 | 0.231946 | 0.312478 | 0.364800 | 0.217095 | 0.167312 | 0.435599 | 0.175383 |
| 50% | 0.556304 | 0.367655 | 0.483857 | 0.613298 | 0.459789 | 0.528733 | 0.453855 | 0.395908 | 0.572076 | 0.534829 |
| 75% | 0.762393 | 0.736889 | 0.663568 | 0.831942 | 0.822579 | 0.712950 | 0.709945 | 0.532068 | 0.822142 | 0.678411 |
| max | 0.922850 | 0.980879 | 0.917139 | 0.964579 | 0.985635 | 0.976008 | 0.989032 | 0.986175 | 0.956594 | 0.887955 |

|  | Age | weight | Transfer | Count |
|---|---|---|---|---|
| count | 20.000000 | 20.000000 | 20.000000 | 20.00000 |
| mean | 58.000000 | 0.568756 | 0.500000 | 9.50000 |
| std | 23.664319 | 0.286102 | 0.512989 | 5.91608 |
| min | 20.000000 | 0.040372 | 0.000000 | 0.00000 |
| 25% | 39.000000 | 0.331058 | 0.000000 | 4.75000 |
| 50% | 58.000000 | 0.589706 | 0.500000 | 9.50000 |
| 75% | 77.000000 | 0.843577 | 1.000000 | 14.25000 |
| max | 96.000000 | 0.999901 | 1.000000 | 19.00000 |

+ "\n",
+ "\n",
+ "Fill in the app creation page with a unique name, a website name (use a placeholder website if you don’t have one), and a project description. Accept the terms and conditions and proceed to the next page.\n",
+ "\n",
+ "Once your project has been created, click on the “Keys and Access Tokens” tab. You should now be able to see your consumer secret and consumer key.\n",
+ "\n",
+ "\n",
+ "You’ll also need a pair of access tokens. Scroll down and request those tokens. The page should refresh, and you should now have an access token and access token secret.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Import necessary modules"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 81,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "UI2fIQFxrNLB",
+ "run_control": {
+ "frozen": false,
+ "read_only": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import os\n",
+ "import json\n",
+ "import re\n",
+ "import string\n",
+ "import pandas as pd\n",
+ "from pandas import json_normalize\n",
+ "import matplotlib.pyplot as plt"
+ ]
+ },
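Since `json_normalize` is imported here, a short illustration of what it does may help: it flattens nested JSON objects into one row per record, joining nested keys with dots, which is where column names such as `status.place.name` come from. The records below are hypothetical and made up purely for illustration; they are not real API responses, and only a few fields are shown.

```python
import pandas as pd

# HYPOTHETICAL records, loosely shaped like nested user objects.
# All field values are invented for this sketch.
users = [
    {
        "id": 1,
        "screen_name": "example_a",
        "followers_count": 120,
        "status": {"place": {"name": "Pretoria", "country_code": "ZA"}},
    },
    {
        # This record has no nested "status" object at all.
        "id": 2,
        "screen_name": "example_b",
        "followers_count": 45,
    },
]

# Nested dictionaries become dotted column names;
# records that lack a nested field get NaN in that column.
flat = pd.json_normalize(users)

print(flat.columns.tolist())
print(flat)
```

Records without a given nested field simply receive `NaN`, which matches the many `NaN` cells in the flattened user tables.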
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "colab": {},
+ "colab_type": "code",
+ "id": "3pf5Xapqrq0M"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "[nltk_data] Error loading punkt: 

|  | id | id_str | name | screen_name | location | description | url | protected | followers_count | friends_count | ... | status.place.place_type | status.place.name | status.place.full_name | status.place.country_code | status.place.country | status.place.contained_within | status.place.bounding_box.type | status.place.bounding_box.coordinates | status.retweeted_status.entities.media | status.retweeted_status.extended_entities.media |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 305125998 | 305125998 | Jeffrey Gettleman | gettleman | New Delhi, India | South Asia bureau chief for the New York Times... | http://t.co/AYD1lbjVvB | False | 25704 | 37 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 26475943 | 26475943 | A24 Media | a24media | Golden Ivy Plaza, Karen, NBO | Africa 24 produces compelling content that mak... | https://t.co/5I7guDadfM | False | 31285 | 3059 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | 72013267 | 72013267 | Scapegoat | AndiMakinana | Cape Town, South Africa | In pursuit of scoops. I do not write headlines... | https://t.co/pQLpRj9WO4 | False | 101278 | 2838 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | 625489039 | 625489039 | Africa Check | AfricaCheck |  | Africa's first independent fact-checking websi... | https://t.co/8bYLuvxpVN | False | 68074 | 4592 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | 401520924 | 401520924 | James Copnall | JamesCopnall |  | BBC reporter + presenter. Author A Poisonous T... | http://t.co/xrztQ2mzfH | False | 21963 | 5050 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 88 | 117102398 | 117102398 | Julius Sello Malema | Julius_S_Malema | Johannesburg, South Africa | Commander in Chief of Economic Freedom Fighter... | https://t.co/MrsRL5oNpK | False | 3129686 | 652 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 89 | 14697575 | 14697575 | News24 | News24 | South Africa | South Africa's premier online news resource. F... | https://t.co/TV9HgXREOi | False | 3577999 | 631 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 90 | 1102508781781557248 | 1102508781781557248 | jdwtweet | SAPresident | Miami, FL |  | None | False | 18 | 14 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 91 | 17962204 | 17962204 | Gareth Cliff | GarethCliff | South Africa | President of https://t.co/scMZ7lsVKF ⚜. Enquir... | https://t.co/99Q8vPRprW | False | 1974613 | 356 | ... | city | Pretoria | Pretoria, South Africa | ZA | South Africa | [] | Polygon | [[[27.9483035, -25.9157727], [28.4198285, -25.... | NaN | NaN |
| 92 | 46335511 | 46335511 | Trevor Noah | Trevornoah | New York, NY | Comedian from South Africa. I was in the crowd... | https://t.co/5zKr0YPMa5 | False | 10808461 | 325 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |

93 rows × 123 columns

|  | id | id_str | name | screen_name | location | profile_location | description | url | protected | followers_count | ... | profile_location.country_code | profile_location.country | profile_location.contained_within | profile_location.bounding_box | status.entities.media | status.extended_entities.media | status.retweeted_status.quoted_status_id | status.retweeted_status.quoted_status_id_str | status.quoted_status_id | status.quoted_status_id_str |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2515899612 | 2515899612 | Hage G. Geingob | hagegeingob | Namibia | NaN | President of the Republic of Namibia | https://t.co/f5BbkeEYSL | False | 192470 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | 40839292 | 40839292 | Presidency \| South Africa 🇿🇦 | PresidencyZA | Pretoria, South Africa | NaN | This is the official Twitter page of The Presi... | https://t.co/lw3QfCqSCq | False | 1599341 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | 1200316338 | 1200316338 | Ministry of Health Zambia | mohzambia | Lusaka, Zambia | NaN | The Ministry aims to address and share ideas w... | https://t.co/ShAx7bUDqc | False | 7170 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | 447895686 | 447895686 | President of Zimbabwe | edmnangagwa | Zimbabwe | NaN | Official Twitter account of Emmerson Dambudzo ... | None | False | 546537 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | 894266976499060736 | 894266976499060736 | MinSantédj | MinSantedj | Djibouti | NaN | ORGANISME GOUVERNEMENTAL\\nSuivez toutes les ac... | https://t.co/ZElb6lvXnU | False | 2934 | ... |  |  | [] | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 5 | 438370063 | 438370063 | Yemane G. Meskel | hawelti | Asmara; ERITREA | NaN | Minister of Information | https://t.co/fSQjSLmk6t | False | 66245 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 6 | 364830542 | 364830542 | State House Kenya | StateHouseKenya | Nairobi | NaN |  | https://t.co/vReQnpRV2z | False | 1104077 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 7 | 37601149 | 37601149 | Paul Kagame | PaulKagame | Rwanda, Africa | NaN | President of the Republic of Rwanda, write to:... | https://t.co/bfKOFZyOav | False | 1984272 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 8 | 812627249446780928 | 812627249446780928 | Mohamed Farmaajo | M_Farmaajo | Somalia | NaN | 9th and the current President of Federal Repub... | None | False | 424425 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9 | 868153335307698177 | 868153335307698177 | South Sudan Government | SouthSudanGov | South Sudan | NaN | Official Twitter Account of the Revitalized Tr... | None | False | 2559 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 10 | 1164801318742982656 | 1164801318742982656 | Abdalla Hamdok | SudanPMHamdok | Sudan | NaN | The official account of the Prime Minister of ... | https://t.co/2dtCXDjTvv | False | 371069 | ... |  |  | [] | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 11 | 976523578966466561 | 976523578966466561 | TanzaniaSpokesperson | TZSpokesperson | Dodoma, Tanzania | NaN | Official English Account of the Chief Spokespe... | https://t.co/o6LugkYaH3 | False | 4024 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 12 | 126955629 | 126955629 | Yoweri K Museveni | KagutaMuseveni | Uganda | NaN | President of the Republic of Uganda | https://t.co/98sFzWcbAF | False | 1813693 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 13 | 960107639479906304 | 960107639479906304 | MOFA/MRE -(Angola) | angola_Mirex | Angola | NaN | Conta oficial do Twitter do Ministério das Rel... | https://t.co/EZEdzQzwGE | False | 3272 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 14 | 337183326 | 337183326 | Amb. Willy Nyamitwe | willynyamitwe | Burundi, Bujumbura | NaN | Ambassador & Senior Advisor to HE @GeneralNeva... | https://t.co/IqiH1MwnKs | False | 107409 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 15 | 2216872019 | 2216872019 | Chérif Mahamat Zene | Cherif_MZ | Tchad | NaN | Ministre de la Communication, Porte-parole du ... | https://t.co/D2MAAPTVot | False | 18635 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 16 | 817736921027854336 | 817736921027854336 | Présidence RDC 🇨🇩 | Presidence_RDC | Kinshasa, Rép. Dém du Congo | NaN | Bienvenue sur le compte officiel de la Préside... | https://t.co/uISGEjLi7o | False | 340934 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 17 | 3013693201 | 3013693201 | Ali Bongo Ondimba | PresidentABO | Gabon | NaN | Président de la République Gabonaise | https://t.co/HyRWdJhnrg | False | 173601 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 18 | 2853870821 | 2853870821 | Présidence du Bénin | PresidenceBenin | République du Bénin | NaN | Compte officiel de la Présidence de la Républi... | https://t.co/VqAMM9C6kY | False | 30165 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 19 | 3353383450 | 3353383450 | Roch KABORE | rochkaborepf | Burkina Faso | NaN | Président du Faso | https://t.co/47yRKu3CMx | False | 256351 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 20 | 580037845 | 580037845 | Presidente Cabo Verde | PresidenciaCV | Cape Verde | NaN | Bio: President of the Republic of Cabo Verde.\\... | http://t.co/zrtxLrXOSg | False | 3283 | ... |  |  | [] | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 21 | 86037380 | 86037380 | Alassane Ouattara | AOuattara_PRCI | Ivory Coast | NaN | Profil officiel d’Alassane Ouattara, Président... | https://t.co/T70r91bUyq | False | 832898 | ... |  |  | [] | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 22 | 998585232143110144 | 998585232143110144 | State House of The Gambia | Presidency_GMB | Gambia | NaN | Official Twitter for the Office of The Preside... | https://t.co/vnlfhzchiT | False | 10772 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 23 | 247217109 | 247217109 | Nana Akufo-Addo | NAkufoAddo | Ghana | NaN | Official Twitter account of Nana Addo Dankwa A... | https://t.co/Vz5z6hVzYV | False | 1508420 | ... | NaN | NaN | NaN | NaN | [{'id': 1284815212181434369, 'id_str': '128481... | [{'id': 1284815212181434369, 'id_str': '128481... | NaN | NaN | NaN | NaN |
| 24 | 1207235998406651904 | 1207235998406651904 | Alpha CONDÉ | president_gn | Guinea | NaN | Président de la République de Guinée -- Presid... | https://t.co/lFypX5PDgC | False | 736 | ... |  |  | [] | NaN | NaN | NaN | 1.258056e+18 | 1258056308068139010 | 1.258056e+18 | 1258056308068139010 |
| 25 | 732872209480491008 | 732872209480491008 | Umaro Sissoco Embalo | USEmbalo | Guinea Bissau | NaN | President of the Republic of Guinea-Bissau 🇬🇼\\... | https://t.co/fmjH8m2fpB | False | 8482 | ... |  |  | [] | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 26 | 389486048 | 389486048 | Presidence Mali | PresidenceMali | Bamako | NaN | Fil officiel de la Présidence de la République... | https://t.co/4w4yX7r4qN | False | 228959 | ... | NaN | NaN | NaN | NaN | [{'id': 1284214657348247559, 'id_str': '128421... | [{'id': 1284214657348247559, 'id_str': '128421... | NaN | NaN | NaN | NaN |
| 27 | 1207767213844914176 | 1207767213844914176 | Mohamed Cheikh El Ghazouani محمد ولدالشيخ الغز... | CheikhGhazouani | Nouakchott | NaN | رئيس الجمهورية الإسلامية الموريتانية Président... | https://t.co/yODYwZTv04 | False | 31805 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 28 | 4821435801 | 4821435801 | Issoufou Mahamadou | IssoufouMhm | Niger | NaN | Compte officiel d'Issoufou Mahamadou, Présiden... | https://t.co/yqT76kKsIq | False | 116302 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 29 | 2936714848 | 2936714848 | Muhammadu Buhari | MBuhari |  | NaN | This is the official account of Muhammadu Buha... | None | False | 3272237 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 30 | 197493438 | 197493438 | Macky Sall | Macky_Sall | Sénégal | NaN | Président de la République du Sénégal 🇸🇳 | https://t.co/FK2Eo1gPuQ | False | 1375730 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 31 | 983228767761027072 | 983228767761027072 | President Julius Maada Bio | PresidentBio | Sierra Leone | NaN | H.E. Julius Maada Wonie Bio was inaugurated as... | https://t.co/KHVHrCBNYA | False | 14907 | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 32 | \n", + "4920180340 | \n", + "4920180340 | \n", + "Ministère de la Santé et de l'hygiène Publique | \n", + "MSPS_Togo | \n", + "Lomé | \n", + "NaN | \n", + "\n", + " | https://t.co/sYvXay8AhT | \n", + "False | \n", + "1132 | \n", + "... | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "NaN | \n", + "
33 rows × 121 columns
|   | screen_name | followers_count | friends_count | statuses_count | favourites_count | Popularity_Score | Reach_Score |
|---|---|---|---|---|---|---|---|
| 14 | willynyamitwe | 107409 | 4656 | 40682 | 6259 | 46941 | 102753 |
| 1 | PresidencyZA | 1599341 | 14 | 18881 | 63 | 18944 | 1599327 |
| 26 | PresidenceMali | 228959 | 1001 | 11168 | 1732 | 12900 | 227958 |
| 6 | StateHouseKenya | 1104077 | 214 | 9050 | 61 | 9111 | 1103863 |
| 23 | NAkufoAddo | 1508420 | 352 | 7098 | 134 | 7232 | 1508068 |
| 21 | AOuattara_PRCI | 832898 | 23 | 7139 | 4 | 7143 | 832875 |
| 25 | USEmbalo | 8482 | 181 | 828 | 6065 | 6893 | 8301 |
| 12 | KagutaMuseveni | 1813693 | 28 | 6645 | 76 | 6721 | 1813665 |
| 18 | PresidenceBenin | 30165 | 66 | 5649 | 94 | 5743 | 30099 |
| 5 | hawelti | 66245 | 435 | 4718 | 735 | 5453 | 65810 |
| 19 | rochkaborepf | 256351 | 151 | 4496 | 475 | 4971 | 256200 |
| 29 | MBuhari | 3272237 | 26 | 4734 | 8 | 4742 | 3272211 |
| 7 | PaulKagame | 1984272 | 181 | 2862 | 616 | 3478 | 1984091 |
| 30 | Macky_Sall | 1375730 | 171 | 2796 | 530 | 3326 | 1375559 |
| 22 | Presidency_GMB | 10772 | 27 | 1440 | 452 | 1892 | 10745 |
| 17 | PresidentABO | 173601 | 4 | 1738 | 16 | 1754 | 173597 |
| 16 | Presidence_RDC | 340934 | 125 | 1650 | 99 | 1749 | 340809 |
| 4 | MinSantedj | 2934 | 127 | 1065 | 587 | 1652 | 2807 |
| 0 | hagegeingob | 192470 | 55 | 1087 | 268 | 1355 | 192415 |
| 15 | Cherif_MZ | 18635 | 196 | 753 | 473 | 1226 | 18439 |
| 13 | angola_Mirex | 3272 | 312 | 732 | 447 | 1179 | 2960 |
| 2 | mohzambia | 7170 | 95 | 838 | 163 | 1001 | 7075 |
| 11 | TZSpokesperson | 4024 | 32 | 836 | 1 | 837 | 3992 |
| 20 | PresidenciaCV | 3283 | 885 | 715 | 91 | 806 | 2398 |
| 10 | SudanPMHamdok | 371069 | 115 | 652 | 43 | 695 | 370954 |
| 3 | edmnangagwa | 546537 | 116 | 628 | 65 | 693 | 546421 |
| 8 | M_Farmaajo | 424425 | 2 | 599 | 22 | 621 | 424423 |
| 9 | SouthSudanGov | 2559 | 463 | 209 | 348 | 557 | 2096 |
| 28 | IssoufouMhm | 116302 | 17 | 349 | 3 | 352 | 116285 |
| 31 | PresidentBio | 14907 | 0 | 83 | 0 | 83 | 14907 |
| 32 | MSPS_Togo | 1132 | 2 | 44 | 1 | 45 | 1130 |
| 27 | CheikhGhazouani | 31805 | 9 | 30 | 0 | 30 | 31796 |
| 24 | president_gn | 736 | 29 | 21 | 6 | 27 | 707 |
|   | screen_name | followers_count | friends_count | statuses_count | favourites_count | Popularity_Score | Reach_Score |
|---|---|---|---|---|---|---|---|
| 80 | UlrichJvV | 1042471 | 530305 | 19998 | 347436 | 367434 | 512166 |
| 89 | News24 | 3577999 | 631 | 322904 | 1143 | 324047 | 3577368 |
| 47 | CityofJoburgZA | 1002758 | 61972 | 265302 | 33984 | 299286 | 940786 |
| 30 | africatechie | 106164 | 1644 | 101334 | 136604 | 237938 | 104520 |
| 2 | AndiMakinana | 101278 | 2838 | 142232 | 9028 | 151260 | 98440 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 38 | africamedia_CPJ | 487 | 0 | 35 | 0 | 35 | 487 |
| 78 | BBCAndrewH | 7 | 0 | 6 | 0 | 6 | 7 |
| 72 | SmithInAfrica | 69 | 0 | 1 | 0 | 1 | 69 |
| 7 | stateafrica | 8 | 0 | 1 | 0 | 1 | 8 |
| 26 | ThisisAfrica | 6 | 0 | 0 | 0 | 0 | 6 |
93 rows × 7 columns
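
In every visible row of the two score tables, `Popularity_Score` equals `statuses_count + favourites_count` and `Reach_Score` equals `followers_count - friends_count`, and the rows are ordered by `Popularity_Score` from highest to lowest. A minimal pandas sketch of that computation, assuming the same relationship holds for the rows that are not shown (the helper name `add_scores` is hypothetical and does not come from the notebook):

```python
import pandas as pd


def add_scores(df):
    """Return a copy of df with both derived score columns, most popular first."""
    out = df.copy()
    # Assumption: popularity is the sum of tweets posted and tweets liked.
    out["Popularity_Score"] = out["statuses_count"] + out["favourites_count"]
    # Assumption: reach is followers minus accounts followed.
    out["Reach_Score"] = out["followers_count"] - out["friends_count"]
    return out.sort_values("Popularity_Score", ascending=False)


# Two rows copied from the first score table above.
sample = pd.DataFrame({
    "screen_name": ["PresidencyZA", "willynyamitwe"],
    "followers_count": [1599341, 107409],
    "friends_count": [14, 4656],
    "statuses_count": [18881, 40682],
    "favourites_count": [63, 6259],
})

scored = add_scores(sample)
print(scored[["screen_name", "Popularity_Score", "Reach_Score"]])
```

With these two rows the sketch reproduces the values shown in the table: 46941 and 102753 for `willynyamitwe`, 18944 and 1599327 for `PresidencyZA`.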
|   | GHO | PUBLISHSTATE | YEAR | REGION | COUNTRY | AGEGROUP | SEX | Display Value | Numeric | MidAge |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | LIFE_0000000030 | PUBLISHED | 2016 | AFR | NGA | 0-1 | Male | 0.071 | 0.07128 | -3 |
| 1 | LIFE_0000000030 | PUBLISHED | 2016 | SEAR | IND | 0-1 | Male | 0.034 | 0.03386 | -3 |
| 2 | LIFE_0000000030 | PUBLISHED | 2016 | AFR | NGA | 0-1 | Female | 0.062 | 0.06243 | -3 |
| 3 | LIFE_0000000030 | PUBLISHED | 2016 | SEAR | IND | 0-1 | Female | 0.038 | 0.03755 | -3 |
| 4 | LIFE_0000000030 | PUBLISHED | 2016 | AFR | NGA | 1-4 | Male | 0.039 | 0.03914 | 2 |
\n",
- "\n",
- "Fill in the app creation page with a unique name, a website name (use a placeholder website if you don’t have one), and a project description. Accept the terms and conditions and proceed to the next page.\n",
- "\n",
- "Once your project has been created, click on the “Keys and Access Tokens” tab. You should now be able to see your consumer secret and consumer key.\n",
- "\n",
- "
\n",
- "\n",
- "You’ll also need a pair of access tokens. Scroll down and request those tokens. The page should refresh, and you should now have an access token and access token secret.\n",
- "\n",
- "
\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Import necessary modules"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 32,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "UI2fIQFxrNLB",
- "run_control": {
- "frozen": false,
- "read_only": false
- }
- },
- "outputs": [],
- "source": [
- "import sys\n",
- "import os\n",
- "import re\n",
- "import string\n",
- "import json\n",
- "import pandas as pd\n",
- "import matplotlib.pyplot as plt"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "3pf5Xapqrq0M"
- },
- "outputs": [],
- "source": [
- "#Import the necessary methods from tweepy library \n",
- "\n",
- "#install tweepy if you don't have it\n",
- "#!pip install tweepy\n",
- "import tweepy\n",
- "from tweepy.streaming import StreamListener\n",
- "from tweepy import OAuthHandler\n",
- "from tweepy import Stream\n",
- "\n",
- "#sentiment analysis package\n",
- "#!pip install textblob\n",
- "from textblob import TextBlob\n",
- "\n",
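- "#general text pre-processor\n",
- "#!pip install nltk\n",
- "import nltk\n",
- "#download the NLTK data once if it is missing\n",
- "#nltk.download('stopwords'); nltk.download('punkt')\n",
- "from nltk.corpus import stopwords\n",
- "from nltk.tokenize import word_tokenize\n",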
- "\n",
- "#tweet pre-processor \n",
- "#!pip install tweet-preprocessor\n",
- "import preprocessor as ppr"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Starting code\n",
- "Below we define some starter code (Python classes and functions) that illustrates how to fetch data from Twitter and analyse it.\n",
- "\n",
- "### **Your task is**\n",
- "1. Go through the code and understand it. Know what each function does.\n",
- "2. If you find an error, fix it. Ask for help in the Slack channel if you find a serious mistake.\n",
- "3. Extend the code so that it is useful for the topics you choose to analyse.\n",
- "4. Make nice plots and share your findings (e.g. insights on the main COVID-19 Twitter conversations about your country).\n",
- "5. Submit whatever you have managed to do by Wednesday morning. Keep using what you build to write blogs, share on Facebook, etc.\n",
- "\n",
- "\n",
- "[Reference used to build some of the functions here](https://towardsdatascience.com/extracting-twitter-data-pre-processing-and-sentiment-analysis-using-python-3-0-7192bd8b47cf)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 38,
- "metadata": {},
- "outputs": [],
- "source": [
- "class tweetsearch():\n",
- "    '''\n",
- "    A basic class to search for and download Twitter data.\n",
- "    You can build on it to extend its functionality for more\n",
- "    sophisticated analysis.\n",
- "\n",
- "    Written against the tweepy 3.x API (api.search, StreamListener).\n",
- "    '''\n",
- "    def __init__(self, cols=None, auth=None):\n",
- "        if cols is not None:\n",
- "            self.cols = cols\n",
- "        else:\n",
- "            self.cols = ['id', 'created_at', 'source', 'original_text', 'clean_text',\n",
- "                         'sentiment', 'polarity', 'subjectivity', 'lang',\n",
- "                         'favorite_count', 'retweet_count', 'original_author',\n",
- "                         'possibly_sensitive', 'hashtags',\n",
- "                         'user_mentions', 'place', 'place_coord_boundaries']\n",
- "\n",
- "        if auth is None:\n",
- "            #Variables that contain the user credentials to access the Twitter API\n",
- "            consumer_key = os.environ.get('TWITTER_API_KEY')\n",
- "            consumer_secret = os.environ.get('TWITTER_API_SECRET')\n",
- "            access_token = os.environ.get('TWITTER_ACCESS_TOKEN')\n",
- "            access_token_secret = os.environ.get('TWITTER_ACCESS_TOKEN_SECRET')\n",
- "\n",
- "            #This handles Twitter authentication\n",
- "            auth = OAuthHandler(consumer_key, consumer_secret)\n",
- "            auth.set_access_token(access_token, access_token_secret)\n",
- "\n",
- "        self.auth = auth\n",
- "        #wait_on_rate_limit makes tweepy sleep instead of failing when the quota is used up\n",
- "        self.api = tweepy.API(auth, wait_on_rate_limit=True)\n",
- "\n",
- "    def clean_tweets(self, twitter_text):\n",
- "        '''Return the tweet text with emojis, emoticons, punctuation\n",
- "        and English stop words removed.'''\n",
- "\n",
- "        #use the tweet-preprocessor package (imported above as ppr)\n",
- "        tweet = ppr.clean(twitter_text)\n",
- "\n",
- "        #Happy emoticons\n",
- "        emoticons_happy = set([\n",
- "            ':-)', ':)', ';)', ':o)', ':]', ':3', ':c)', ':>', '=]', '8)', '=)', ':}',\n",
- "            ':^)', ':-D', ':D', '8-D', '8D', 'x-D', 'xD', 'X-D', 'XD', '=-D', '=D',\n",
- "            '=-3', '=3', ':-))', \":'-)\", \":')\", ':*', ':^*', '>:P', ':-P', ':P', 'X-P',\n",
- "            'x-p', 'xp', 'XP', ':-p', ':p', '=p', ':-b', ':b', '>:)', '>;)', '>:-)',\n",
- "            '<3'\n",
- "        ])\n",
- "\n",
- "        #Sad emoticons\n",
- "        emoticons_sad = set([\n",
- "            ':L', ':-/', '>:/', ':S', '>:[', ':@', ':-(', ':[', ':-||', '=L', ':<',\n",
- "            ':-[', ':-<', '=\\\\', '=/', '>:(', ':(', '>.<', \":'-(\", \":'(\", ':\\\\', ':-c',\n",
- "            ':c', ':{', '>:\\\\', ';('\n",
- "        ])\n",
- "\n",
- "        #Emoji patterns\n",
- "        emoji_pattern = re.compile(\"[\"\n",
- "            u\"\\U0001F600-\\U0001F64F\"  # emoticons\n",
- "            u\"\\U0001F300-\\U0001F5FF\"  # symbols & pictographs\n",
- "            u\"\\U0001F680-\\U0001F6FF\"  # transport & map symbols\n",
- "            u\"\\U0001F1E0-\\U0001F1FF\"  # flags (iOS)\n",
- "            u\"\\U00002702-\\U000027B0\"\n",
- "            u\"\\U000024C2-\\U0001F251\"\n",
- "            \"]+\", flags=re.UNICODE)\n",
- "\n",
- "        #combine sad and happy emoticons\n",
- "        emoticons = emoticons_happy.union(emoticons_sad)\n",
- "\n",
- "        #remove emojis from tweet\n",
- "        tweet = emoji_pattern.sub(r'', tweet)\n",
- "\n",
- "        #after preprocessing, the colon left behind by removed mentions remains\n",
- "        tweet = re.sub(r':', '', tweet)\n",
- "        tweet = re.sub(r'…', '', tweet)\n",
- "\n",
- "        #replace consecutive non-ASCII characters with a space\n",
- "        tweet = re.sub(r'[^\\x00-\\x7F]+', ' ', tweet)\n",
- "\n",
- "        #tokenize only after the text has been cleaned\n",
- "        stop_words = set(stopwords.words('english'))\n",
- "        word_tokens = word_tokenize(tweet)\n",
- "\n",
- "        #keep tokens that are not stop words, emoticons or punctuation\n",
- "        filtered_tweet = [w for w in word_tokens\n",
- "                          if w.lower() not in stop_words\n",
- "                          and w not in emoticons\n",
- "                          and w not in string.punctuation]\n",
- "\n",
- "        return ' '.join(filtered_tweet)\n",
- "\n",
- "    def get_tweets(self, keyword, csvfile=None):\n",
- "        '''Search for keyword and return the results as a DataFrame.\n",
- "        If csvfile is given, existing rows are loaded from it and the\n",
- "        combined result is written back to it.'''\n",
- "\n",
- "        df = pd.DataFrame(columns=self.cols)\n",
- "\n",
- "        #If the file exists, then read the existing data from the CSV file.\n",
- "        if csvfile is not None and os.path.exists(csvfile):\n",
- "            df = pd.read_csv(csvfile, header=0)\n",
- "\n",
- "        #iterate page by page; the standard search API returns at most 100 tweets per page\n",
- "        for page in tweepy.Cursor(self.api.search, q=keyword, count=100).pages():\n",
- "            for status in page:\n",
- "                status = status._json\n",
- "\n",
- "                #filter by language\n",
- "                if status['lang'] != 'en':\n",
- "                    continue\n",
- "\n",
- "                #skip retweets (search has no include_rts parameter)\n",
- "                if 'retweeted_status' in status:\n",
- "                    continue\n",
- "\n",
- "                #if this tweet is already stored, only refresh its counts\n",
- "                if status['id'] in df['id'].values:\n",
- "                    i = df.loc[df['id'] == status['id']].index[0]\n",
- "                    df.at[i, 'favorite_count'] = status['favorite_count']\n",
- "                    df.at[i, 'retweet_count'] = status['retweet_count']\n",
- "                    continue\n",
- "\n",
- "                #clean the text, then calculate sentiment on the cleaned text\n",
- "                filtered_tweet = self.clean_tweets(status['text'])\n",
- "                blob = TextBlob(filtered_tweet)\n",
- "                Sentiment = blob.sentiment\n",
- "                polarity = Sentiment.polarity\n",
- "                subjectivity = Sentiment.subjectivity\n",
- "\n",
- "                new_entry = [status['id'], status['created_at'],\n",
- "                             status['source'], status['text'], filtered_tweet,\n",
- "                             Sentiment, polarity, subjectivity, status['lang'],\n",
- "                             status['favorite_count'], status['retweet_count']]\n",
- "\n",
- "                new_entry.append(status['user']['screen_name'])\n",
- "\n",
- "                #possibly_sensitive is only present on some tweets\n",
- "                new_entry.append(status.get('possibly_sensitive'))\n",
- "\n",
- "                hashtags = \", \".join([hashtag_item['text'] for\n",
- "                                      hashtag_item in status['entities']['hashtags']])\n",
- "                new_entry.append(hashtags)  #append the hashtags\n",
- "\n",
- "                mentions = \", \".join([mention['screen_name'] for\n",
- "                                      mention in status['entities']['user_mentions']])\n",
- "                new_entry.append(mentions)  #append the user mentions\n",
- "\n",
- "                #'place' column: the author's self-reported profile location\n",
- "                new_entry.append(status['user'].get('location') or '')\n",
- "\n",
- "                #'place_coord_boundaries' column: bounding box of the tagged place, if any\n",
- "                try:\n",
- "                    xyz = status['place']['bounding_box']['coordinates']\n",
- "                    coordinates = [coord for loc in xyz for coord in loc]\n",
- "                except TypeError:\n",
- "                    coordinates = None\n",
- "                new_entry.append(coordinates)\n",
- "\n",
- "                #now append a row to the dataframe\n",
- "                single_tweet_df = pd.DataFrame([new_entry], columns=self.cols)\n",
- "                df = pd.concat([df, single_tweet_df], ignore_index=True)\n",
- "\n",
- "        if csvfile is not None:\n",
- "            #save it to file\n",
- "            df.to_csv(csvfile, columns=self.cols, index=False, encoding=\"utf-8\")\n",
- "\n",
- "        return df\n",
- "\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Search Twitter and fetch data example"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "covid_keywords = '#COVID19 OR #COVID19Africa' #hashtag based search\n",
- "tweets_file = 'data/ethiopia_covid19_23june2020.csv' #get_tweets reads and writes CSV\n",
- "\n",
- "#get data on keywords\n",
- "ts = tweetsearch()\n",
- "df = ts.get_tweets(covid_keywords, csvfile=tweets_file) #you saved the "
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Stream data and save it to file\n",
- "Above we saw how to search for and fetch data; below we will see how to stream data from Twitter. Make sure you understand the difference between the search and stream features of the Twitter API.\n",
- "\n",
- "### **SAME TASK AS ABOVE**\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 41,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "r6lcy009rX_e"
- },
- "outputs": [],
- "source": [
- "#This is a basic listener that writes received tweets to file.\n",
- "class StdOutListener(StreamListener):\n",
- "\n",
- "    def __init__(self, fhandle, stop_at=1000):\n",
- "        super().__init__()  #let the tweepy base class initialise itself\n",
- "        self.tweet_counter = 0\n",
- "        self.stop_at = stop_at\n",
- "        self.fhandle = fhandle\n",
- "\n",
- "    def on_data(self, data):\n",
- "        self.fhandle.write(data)\n",
- "\n",
- "        #stop if enough tweets are obtained\n",
- "        self.tweet_counter += 1\n",
- "        if self.tweet_counter < self.stop_at:\n",
- "            return True\n",
- "        print('Max number of tweets reached: #tweets = ' + str(self.tweet_counter))\n",
- "        return False\n",
- "\n",
- "    def on_error(self, status):\n",
- "        print(status)\n",
- "        #420 means we are being rate limited: returning False disconnects the stream\n",
- "        if status == 420:\n",
- "            return False\n",
- "\n",
- "def stream_tweet_data(filename='data/tweets.json',\n",
- "                      keywords=['COVID19Africa','COVID19Ethiopia'],\n",
- "                      is_async=False):\n",
- "    # tweet topics to use as a filter. The tweets downloaded\n",
- "    # will have one of the topics in their text or hashtag\n",
- "\n",
- "    print('saving data to file: ',filename)\n",
- "\n",
- "    #print the tweet topics\n",
- "    print('TweetKeywords are: ',keywords)\n",
- "    print('For testing case, please interrupt the downloading process using ctrl+x after about 5 mins ')\n",
- "    print('To keep streaming in the background, pass is_async=True')\n",
- "\n",
- "    #Variables that contain the user credentials to access the Twitter API\n",
- "    consumer_key = os.environ.get('TWITTER_API_KEY')\n",
- "    consumer_secret = os.environ.get('TWITTER_API_SECRET')\n",
- "    access_token = os.environ.get('TWITTER_ACCESS_TOKEN')\n",
- "    access_token_secret = os.environ.get('TWITTER_ACCESS_TOKEN_SECRET')\n",
- "\n",
- "    #open file\n",
- "    fhandle = open(filename,'w')\n",
- "\n",
- "    #This handles Twitter authentication and the connection to the Twitter Streaming API\n",
- "    l = StdOutListener(fhandle)\n",
- "    auth = OAuthHandler(consumer_key, consumer_secret)\n",
- "    auth.set_access_token(access_token, access_token_secret)\n",
- "\n",
- "    stream = Stream(auth, l)\n",
- "\n",
- "    #This line filters the Twitter stream to capture tweets that match the keywords\n",
- "    try:\n",
- "        stream.filter(track=keywords,is_async=is_async)\n",
- "    finally:\n",
- "        #in async mode the background thread is still writing, so keep the file open\n",
- "        if not is_async:\n",
- "            fhandle.close()\n",
- "\n",
- "    return None\n",
- "\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Use case of the above code"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 42,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "F8tcPcSMrNLL",
- "outputId": "d7abd9c2-065c-40e8-f71b-e808d985c364"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "saving data to file: data/covid19_23june2020.json\n",
- "TweetKeywords are: ['covid19', '#COVID19Africa']\n",
- "For testing case, please interupt the downloading process using ctrl+x after about 5 mins \n",
- "To keep streaming in the background, pass is_async=True\n",
- "Max number of tweets reached: #tweets = 1000\n"
- ]
- }
- ],
- "source": [
- "tweets_file = 'data/covid19_23june2020.json'\n",
- "stream_tweet_data(filename=tweets_file,keywords=['covid19','#COVID19Africa']) #\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Filter Twitter data and do basic analysis\n",
- "**Extend it to gain more insight**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 27,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "F8tcPcSMrNLL",
- "outputId": "d7abd9c2-065c-40e8-f71b-e808d985c364"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "saved numbers of tweets: 998\n"
- ]
- }
- ],
- "source": [
- "tweets_data = []\n",
- "with open(tweets_file, \"r\") as fin:\n",
- "    for line in fin:\n",
- "        try:\n",
- "            tweet = json.loads(line)\n",
- "        except json.JSONDecodeError:\n",
- "            continue  #skip blank or malformed lines\n",
- "        #keep only records that carry tweet text\n",
- "        if 'text' in tweet:\n",
- "            tweets_data.append(tweet)\n",
- "\n",
- "\n",
- "print('saved numbers of tweets: ', len(tweets_data))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 30,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "hlFKyGnYrNLX"
- },
- "outputs": [],
- "source": [
- "tweets = pd.DataFrame(columns=['text','lang','country'])\n",
- "\n",
- "tweets['text'] = list(map(lambda tweet: tweet['text'], tweets_data))\n",
- "tweets['lang'] = list(map(lambda tweet: tweet['lang'], tweets_data))\n",
- "tweets['country'] = list(map(lambda tweet: tweet['place']['country'] if tweet['place'] is not None else None, \n",
- " tweets_data))\n",
- "\n",
- "\n",
- "tweets_by_lang = tweets['lang'].value_counts()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 31,
- "metadata": {
- "colab": {},
- "colab_type": "code",
- "id": "aEPPoCBtrNLd",
- "outputId": "bfe0cee1-814d-4f30-f47d-205218b3adc9"
- },
- "outputs": [
- {
- "data": {
- "text/plain": [
- "