Commit ca68c75
fix: Migrate .ipynb files from LFS to regular Git tracking
- Removes *.ipynb from .gitattributes LFS tracking
- Migrates all 141 notebook files from LFS to regular Git
- Fixes GitHub Pages display issue where LFS pointer content was shown
- Notebooks will now display proper JSON content instead of LFS metadata
- Added migration script for future reference

Resolves GitHub Pages notebook display issues
1 parent 74c68db commit ca68c75
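The display bug this commit fixes comes from Git LFS storing a small pointer stub (the three `version`/`oid`/`size` lines visible in the diff below) in place of the real file. A minimal check for whether a checked-out file is still a pointer stub rather than real notebook JSON might look like this (hypothetical helper, not part of this commit):

```python
import json

# First line of every Git LFS pointer file, per the LFS spec
LFS_POINTER_PREFIX = "version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(text: str) -> bool:
    """True if the content is an LFS pointer stub rather than the real file."""
    return text.lstrip().startswith(LFS_POINTER_PREFIX)

def looks_like_notebook(text: str) -> bool:
    """True if the content parses as JSON with a top-level 'cells' key."""
    try:
        return "cells" in json.loads(text)
    except ValueError:
        return False
```

A pointer stub such as `"version https://git-lfs.github.com/spec/v1\noid sha256:...\nsize 7673"` fails `looks_like_notebook`, which is why GitHub Pages rendered LFS metadata instead of the notebooks.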

File tree

143 files changed (+249430, -424 lines)


.gitattributes

Lines changed: 0 additions & 1 deletion
@@ -1,4 +1,3 @@
-*.ipynb filter=lfs diff=lfs merge=lfs -text
 *.csv filter=lfs diff=lfs merge=lfs -text
 *.mp4 filter=lfs diff=lfs merge=lfs -text
 *.keras filter=lfs diff=lfs merge=lfs -text
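The commit message mentions a migration script, but the script itself is not shown in this diff. A common recipe for moving files out of LFS without rewriting history, matching what this commit does (drop the `.gitattributes` line, then re-stage the notebooks as regular blobs), is `git lfs untrack` followed by `git add --renormalize`. A sketch under that assumption:

```python
def lfs_export_commands(pattern: str = "*.ipynb") -> list[list[str]]:
    """Build the git commands for moving `pattern` out of LFS tracking.

    Run each with subprocess.run(cmd, check=True) inside the repository,
    then commit the result.
    """
    return [
        # Remove the pattern's filter=lfs line from .gitattributes
        ["git", "lfs", "untrack", pattern],
        # Re-stage affected files so they are stored as regular Git blobs
        ["git", "add", "--renormalize", "."],
    ]
```

(`git lfs migrate export` is an alternative, but it rewrites history; this commit has a normal parent, so the non-rewriting recipe above is the closer match.)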
Lines changed: 273 additions & 3 deletions
@@ -1,3 +1,273 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:8b51872ebfef34e927bbb37fbd616b5c74467c36448b2aff16dc7db0e7575386
-size 7673
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# SpaceX Falcon 9 First Stage Landing Data Collection\n",
+    "\n",
+    "This notebook is part of my personal data science project. All content and analysis are original and tailored for my own exploration of SpaceX launch data."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Project Overview\n",
+    "\n",
+    "The goal is to collect, clean, and prepare SpaceX Falcon 9 launch data for further analysis and machine learning. This notebook focuses on retrieving data from the SpaceX API and performing initial wrangling."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Objectives\n",
+    "- Request and collect SpaceX Falcon 9 launch data from the API\n",
+    "- Clean and format the data for analysis\n",
+    "- Prepare the dataset for downstream machine learning tasks"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "---"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Import Libraries and Define Helper Functions"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import requests\n",
+    "import pandas as pd\n",
+    "import numpy as np\n",
+    "import datetime"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Helper functions to extract details from API responses\n",
+    "def getBoosterVersion(data):\n",
+    "    for x in data['rocket']:\n",
+    "        if x:\n",
+    "            response = requests.get(f\"https://api.spacexdata.com/v4/rockets/{x}\").json()\n",
+    "            BoosterVersion.append(response['name'])\n",
+    "\n",
+    "def getLaunchSite(data):\n",
+    "    for x in data['launchpad']:\n",
+    "        if x:\n",
+    "            response = requests.get(f\"https://api.spacexdata.com/v4/launchpads/{x}\").json()\n",
+    "            Longitude.append(response['longitude'])\n",
+    "            Latitude.append(response['latitude'])\n",
+    "            LaunchSite.append(response['name'])\n",
+    "\n",
+    "def getPayloadData(data):\n",
+    "    for load in data['payloads']:\n",
+    "        if load:\n",
+    "            response = requests.get(f\"https://api.spacexdata.com/v4/payloads/{load}\").json()\n",
+    "            PayloadMass.append(response['mass_kg'])\n",
+    "            Orbit.append(response['orbit'])\n",
+    "\n",
+    "def getCoreData(data):\n",
+    "    for core in data['cores']:\n",
+    "        if core['core'] is not None:\n",
+    "            response = requests.get(f\"https://api.spacexdata.com/v4/cores/{core['core']}\").json()\n",
+    "            Block.append(response['block'])\n",
+    "            ReusedCount.append(response['reuse_count'])\n",
+    "            Serial.append(response['serial'])\n",
+    "        else:\n",
+    "            Block.append(None)\n",
+    "            ReusedCount.append(None)\n",
+    "            Serial.append(None)\n",
+    "        Outcome.append(str(core['landing_success']) + ' ' + str(core['landing_type']))\n",
+    "        Flights.append(core['flight'])\n",
+    "        GridFins.append(core['gridfins'])\n",
+    "        Reused.append(core['reused'])\n",
+    "        Legs.append(core['legs'])\n",
+    "        LandingPad.append(core['landpad'])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Data Collection\n",
+    "\n",
+    "Request SpaceX Falcon 9 launch data from the API and perform initial wrangling."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "spacex_url = \"https://api.spacexdata.com/v4/launches/past\"\n",
+    "response = requests.get(spacex_url)\n",
+    "data = pd.json_normalize(response.json())"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Keep only relevant columns and filter for single-core, single-payload launches\n",
+    "data = data[['rocket', 'payloads', 'launchpad', 'cores', 'flight_number', 'date_utc']]\n",
+    "data = data[data['cores'].map(len) == 1]\n",
+    "data = data[data['payloads'].map(len) == 1]\n",
+    "data['cores'] = data['cores'].map(lambda x: x[0])\n",
+    "data['payloads'] = data['payloads'].map(lambda x: x[0])\n",
+    "data['date'] = pd.to_datetime(data['date_utc']).dt.date\n",
+    "data = data[data['date'] <= datetime.date(2020, 11, 13)]"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Prepare lists for extracted features\n",
+    "BoosterVersion = []\n",
+    "PayloadMass = []\n",
+    "Orbit = []\n",
+    "LaunchSite = []\n",
+    "Outcome = []\n",
+    "Flights = []\n",
+    "GridFins = []\n",
+    "Reused = []\n",
+    "Legs = []\n",
+    "LandingPad = []\n",
+    "Block = []\n",
+    "ReusedCount = []\n",
+    "Serial = []\n",
+    "Longitude = []\n",
+    "Latitude = []\n",
+    "\n",
+    "# Extract features using helper functions\n",
+    "getBoosterVersion(data)\n",
+    "getLaunchSite(data)\n",
+    "getPayloadData(data)\n",
+    "getCoreData(data)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Construct the final dataset\n",
+    "dataset = {\n",
+    "    'FlightNumber': list(data['flight_number']),\n",
+    "    'Date': list(data['date']),\n",
+    "    'BoosterVersion': BoosterVersion,\n",
+    "    'PayloadMass': PayloadMass,\n",
+    "    'Orbit': Orbit,\n",
+    "    'LaunchSite': LaunchSite,\n",
+    "    'Outcome': Outcome,\n",
+    "    'Flights': Flights,\n",
+    "    'GridFins': GridFins,\n",
+    "    'Reused': Reused,\n",
+    "    'Legs': Legs,\n",
+    "    'LandingPad': LandingPad,\n",
+    "    'Block': Block,\n",
+    "    'ReusedCount': ReusedCount,\n",
+    "    'Serial': Serial,\n",
+    "    'Longitude': Longitude,\n",
+    "    'Latitude': Latitude\n",
+    "}\n",
+    "df = pd.DataFrame(dataset)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Data Cleaning\n",
+    "\n",
+    "Filter for Falcon 9 launches and handle missing values."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Keep only Falcon 9 launches\n",
+    "df = df[df['BoosterVersion'] != 'Falcon 1']\n",
+    "df.loc[:, 'FlightNumber'] = list(range(1, df.shape[0] + 1))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Handle missing values in PayloadMass\n",
+    "payload_mass_mean = df['PayloadMass'].mean()\n",
+    "df['PayloadMass'].replace(np.nan, payload_mass_mean, inplace=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Save Cleaned Data\n",
+    "\n",
+    "Export the cleaned dataset for further analysis."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df.to_csv('dataset-part-1.csv', index=False)"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python",
+   "language": "python",
+   "name": "conda-env-python-py"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
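One note on the notebook's cleaning cell: `df['PayloadMass'].replace(np.nan, ..., inplace=True)` is a chained in-place mutation, which newer pandas versions warn about and, under Copy-on-Write, may not propagate to `df`. An equivalent assignment form (illustrative `df` below is a made-up stand-in, not the notebook's data):

```python
import numpy as np
import pandas as pd

# Stand-in frame with a missing payload mass
df = pd.DataFrame({"PayloadMass": [500.0, np.nan, 1500.0]})

# Mean imputation written as an assignment instead of chained inplace=True
df["PayloadMass"] = df["PayloadMass"].fillna(df["PayloadMass"].mean())
```

This produces the same result as the cell in the diff while staying compatible with pandas Copy-on-Write semantics.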
