Commit fc627b6

Update a post: minikube-hadoop-cluster.md
1 parent 6fcafea commit fc627b6

1 file changed: +66 -0 lines changed

content/post/minikube-hadoop-cluster.md

Lines changed: 66 additions & 0 deletions
@@ -182,6 +182,72 @@ Found 2 items
-rw-r--r-- 3 root supergroup 5890 2025-05-05 11:29 /output/part-r-00000
```

## Spark SparkPi

```shell
# Download Apache Spark
$ curl -LO 'https://archive.apache.org/dist/spark/spark-3.5.5/spark-3.5.5-bin-hadoop3.tgz'
$ tar -zxvf spark-3.5.5-bin-hadoop3.tgz

# Set up the Hadoop client configuration
$ cd spark-3.5.5-bin-hadoop3
$ SPARK_HOME="$(pwd)"

# Copy the cluster's Hadoop configuration out of the Kubernetes ConfigMaps
$ mkdir -p conf/hadoop
$ kubectl get configmaps hdfs-conf -o jsonpath="{.data.core-site\.xml}" \
    >conf/hadoop/core-site.xml
$ kubectl get configmaps hdfs-conf -o jsonpath="{.data.hdfs-site\.xml}" \
    >conf/hadoop/hdfs-site.xml
$ kubectl get configmaps yarn-conf -o jsonpath="{.data.yarn-site\.xml}" \
    >conf/hadoop/yarn-site.xml

# Set up Spark
$ cat >conf/spark-defaults.conf <<EOF
spark.master yarn
spark.submit.deployMode cluster

spark.eventLog.enabled true
spark.eventLog.dir hdfs:///tmp/spark-events
spark.history.fs.logDirectory hdfs:///tmp/spark-events
EOF

# The unquoted EOF lets the shell expand ${SPARK_HOME} now,
# so spark-env.sh ends up containing absolute paths
$ cat >conf/spark-env.sh <<EOF
HADOOP_CONF_DIR="${SPARK_HOME}/conf/hadoop"
YARN_CONF_DIR="${SPARK_HOME}/conf/hadoop"
HADOOP_USER_NAME='root'
EOF

# Run SparkPi
# Create the event log directory that spark.eventLog.dir points to
$ kubectl exec -it nn-0 -- hadoop fs -mkdir -p /tmp/spark-events

$ ./bin/run-example org.apache.spark.examples.SparkPi 10000

25/05/16 16:34:09 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
25/05/16 16:34:09 INFO DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at rm-0.resourcemanager.default.svc.cluster.local/10.244.0.4:8032
25/05/16 16:34:10 INFO Configuration: resource-types.xml not found
25/05/16 16:34:10 INFO ResourceUtils: Unable to find 'resource-types.xml'.
25/05/16 16:34:10 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
25/05/16 16:34:10 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
25/05/16 16:34:10 INFO Client: Setting up container launch context for our AM
25/05/16 16:34:10 INFO Client: Setting up the launch environment for our AM container
25/05/16 16:34:10 INFO Client: Preparing resources for our AM container
25/05/16 16:34:10 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
25/05/16 16:34:13 INFO Client: Uploading resource file:/private/var/folders/pq/ffp2x74d7gs2_y_z5v7q4dn00000gp/T/spark-5e05a1af-feba-4e50-a0ec-e05c0be847a7/__spark_libs__18048069634091976738.zip -> hdfs://hadoop-cluster/user/root/.sparkStaging/application_1747383152432_0004/__spark_libs__18048069634091976738.zip
25/05/16 16:34:14 INFO Client: Uploading resource file:/spark-3.5.5-bin-hadoop3/examples/jars/spark-examples_2.12-3.5.5.jar -> hdfs://hadoop-cluster/user/root/.sparkStaging/application_1747383152432_0004/spark-examples_2.12-3.5.5.jar
25/05/16 16:34:14 INFO Client: Uploading resource file:/spark-3.5.5-bin-hadoop3/examples/jars/scopt_2.12-3.7.1.jar -> hdfs://hadoop-cluster/user/root/.sparkStaging/application_1747383152432_0004/scopt_2.12-3.7.1.jar
25/05/16 16:34:14 WARN Client: Same name resource file:///spark-3.5.5-bin-hadoop3/examples/jars/spark-examples_2.12-3.5.5.jar added multiple times to distributed cache
25/05/16 16:34:14 INFO Client: Uploading resource file:/private/var/folders/pq/ffp2x74d7gs2_y_z5v7q4dn00000gp/T/spark-5e05a1af-feba-4e50-a0ec-e05c0be847a7/__spark_conf__8161596058161130912.zip -> hdfs://hadoop-cluster/user/root/.sparkStaging/application_1747383152432_0004/__spark_conf__.zip
25/05/16 16:34:14 INFO SecurityManager: Changing view acls to: adonis,root
25/05/16 16:34:14 INFO SecurityManager: Changing modify acls to: adonis,root
25/05/16 16:34:14 INFO SecurityManager: Changing view acls groups to:
25/05/16 16:34:14 INFO SecurityManager: Changing modify acls groups to:
25/05/16 16:34:14 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: adonis, root; groups with view permissions: EMPTY; users with modify permissions: adonis, root; groups with modify permissions: EMPTY
25/05/16 16:34:14 INFO Client: Submitting application application_1747383152432_0004 to ResourceManager
25/05/16 16:34:14 INFO YarnClientImpl: Submitted application application_1747383152432_0004
25/05/16 16:34:15 INFO Client: Application report for application_1747383152432_0004 (state: ACCEPTED)
...
```
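The output above is cut off at `state: ACCEPTED`, so a reader may want to follow the job up by its application ID. The sketch below shows one way to pull that ID out of the submission log. `extract_app_id` is a hypothetical helper introduced here for illustration, not something the post or Spark provides, and the sample line is copied from the log above.

```shell
# Hypothetical helper (not part of the original post): read spark-submit
# output on stdin and print the first YARN application ID found in it.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Example using a log line copied from the output above.
log_line='25/05/16 16:34:14 INFO YarnClientImpl: Submitted application application_1747383152432_0004'
app_id="$(printf '%s\n' "${log_line}" | extract_app_id)"
echo "${app_id}"
```

The extracted ID could then be passed to `yarn application -status`, for example through `kubectl exec` on the ResourceManager pod. The pod name `rm-0` is only an inference from the hostname in the log, so treat it as an assumption. Spark's client log normally goes to stderr, so piping the output of `run-example` into the helper would likely need a `2>&1` redirect.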

# Reference

- [Accessing services in minikube via DNS](https://www.andreasgerstmayr.at/2022/11/23/accessing-services-in-minikube-via-dns.html)
