changes to use hs2 interface and to run suite in a loop by epkalyanr · Pull Request #5 · hdinsight/HivePerformanceAutomation

epkalyanr · 2016-10-13T17:43:02Z

No description provided.

epkalyanr · 2016-10-13T17:44:01Z

epkalyanr · 2016-10-13T17:46:44Z

@abhijith31

dharmeshkakadia · 2016-10-14T20:56:59Z

 echo "Completed Running PerfData Collection Scripts"

-zip -r $BENCH_HOME/$BENCHMARK/PerfData.zip $PERFDATA_OUTPUTDIR
+zip -r $BENCH_HOME/$BENCHMARK/PerfData_$RUN_ID.zip $PERFDATA_OUTPUTDIR


We currently store the full path in the zip (e.g. home/hdiuser/hive-testbench/PerfData_2/pat/tpch_query_2/.... ). Can we fix the zipping so it does not include the unnecessary /hdiuser/hive-testbench/ prefix?
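One possible way to keep archive entries relative, sketched under assumptions: change into the parent of the output directory before zipping, so only the final directory name is stored. The helper name `zip_relative` is hypothetical and not part of this PR; `-r` and `-q` are standard Info-ZIP options.

```shell
#!/bin/bash
# Hypothetical helper (not in the PR): archive a directory so that entries are
# stored relative to its parent instead of with the full filesystem path.
zip_relative() {
    local archive="$1"   # where the .zip should be written
    local src_dir="$2"   # directory to archive, e.g. $PERFDATA_OUTPUTDIR
    local archive_dir parent name
    # Resolve the archive location to an absolute path, because we cd below.
    archive_dir="$(cd "$(dirname "$archive")" && pwd)" || return 1
    archive="$archive_dir/$(basename "$archive")"
    parent="$(dirname "$src_dir")"
    name="$(basename "$src_dir")"
    # Run zip from the parent directory in a subshell so the caller's
    # working directory is left untouched.
    (cd "$parent" && zip -r -q "$archive" "$name")
}
```

With this sketch, the changed line could become `zip_relative "$BENCH_HOME/$BENCHMARK/PerfData_$RUN_ID.zip" "$PERFDATA_OUTPUTDIR"`, and entries would start at the last component of `$PERFDATA_OUTPUTDIR`.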

dharmeshkakadia · 2016-10-14T20:58:27Z

-chmod -R 777 $RESULT_DIR

-LOG_DIR=$BENCH_HOME/$BENCHMARK/logs/
+LOG_DIR=$BENCH_HOME/$BENCHMARK/logs_$RUN_ID/


Can we include everything about one run under a single dir?

dharmeshkakadia · 2016-10-14T20:58:59Z


-RESULT_DIR=$BENCH_HOME/$BENCHMARK/results/
+RESULT_DIR=$BENCH_HOME/$BENCHMARK/results_$RUN_ID/



Can we include everything about one run under a single dir?

dharmeshkakadia · 2016-10-14T20:59:20Z

@@ -0,0 +1,22 @@
+#!/bin/bash
+#usage: ./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALCE_FACTOR CLUSTER_SSH_PASSWORD


Wrong usage.

dharmeshkakadia · 2016-10-14T21:00:44Z


-PLAN_DIR=$BENCH_HOME/$BENCHMARK/plans/
+PLAN_DIR=$BENCH_HOME/$BENCHMARK/plans_$RUN_ID/



Same as above: can this go under a single dir?
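A minimal sketch of the single-directory layout requested for logs, results, and plans, assuming a per-run parent directory named `run_$RUN_ID`. The function `setup_run_dirs`, the variable `RUN_DIR`, and the `run_` prefix are hypothetical; only `LOG_DIR`, `RESULT_DIR`, `PLAN_DIR`, `BENCH_HOME`, `BENCHMARK`, and `RUN_ID` come from the PR.

```shell
#!/bin/bash
# Hypothetical layout (not in the PR): one directory per run, with fixed
# subdirectory names inside it.
setup_run_dirs() {
    local bench_home="$1" benchmark="$2" run_id="$3"
    RUN_DIR="$bench_home/$benchmark/run_$run_id"
    LOG_DIR="$RUN_DIR/logs/"
    RESULT_DIR="$RUN_DIR/results/"
    PLAN_DIR="$RUN_DIR/plans/"
    # Create all three subdirectories (and the run directory) in one call.
    mkdir -p "$LOG_DIR" "$RESULT_DIR" "$PLAN_DIR"
}
```

A script could call `setup_run_dirs "$BENCH_HOME" "$BENCHMARK" "$RUN_ID"` once at startup, after which cleaning up or archiving one run means handling a single directory.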

dharmeshkakadia · 2016-10-14T21:01:05Z

+		fi

-		timeout ${TIMEOUT} hive -i ${HIVE_SETTING} --database ${DATABASE} -d EXPLAIN="" -f ${QUERY_DIR}/tpch_query${2}.sql > ${RESULT_DIR}/${DATABASE}_query${j}.txt 2>&1
+		 beeline -u ${CONNECTION_STRING} -i ${HIVE_SETTING} --hivevar EXPLAIN="" -f ${QUERY_DIR}/tpch_query${2}.sql > ${RESULT_DIR}/${DATABASE}_query${j}.txt 2>&1


nit: extra space at the start.

dharmeshkakadia · 2016-10-14T21:02:54Z


-hive -d DB=${DATABASE} -f gettpchtablecounts.sql > ${STATS_DIR}/tablecounts_${DATABASE}.txt ;
-hive -d DB=${DATABASE} -f gettpchtableinfo.sql >> ${STATS_DIR}/tableinfo_${DATABASE}.txt ;
+CONNECTION_STRING="jdbc:hive2://localhost:10001/${DATABASE};transportMode=http"


Will this work in case of failover?

dharmeshkakadia · 2016-10-14T21:03:46Z

 if [ $? -ne 0 ]; then
 	echo "Generating data at scale factor $SCALE."
-	(cd tpch-gen; hadoop jar target/*.jar -d ${DIR}/${SCALE}/ -s ${SCALE})
+	(cd tpch-gen; hadoop jar target/*.jar -D mapreduce.map.memory.mb=8192 -d ${DIR}/${SCALE}/ -s ${SCALE})


We should not hard-code settings here. Maybe use a global variable or something similar if you really need it.
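One way the global-variable idea might look, as a sketch only: the memory value becomes an environment-overridable setting and the `-D` option is built from it. The names `DATAGEN_MAP_MEMORY_MB` and `datagen_opts` are hypothetical; the default of 8192 simply mirrors the value in the diff.

```shell
#!/bin/bash
# Assumed global setting (hypothetical name): map task memory for data
# generation, overridable from the environment. Setting it to an empty
# string means "pass no memory option at all".
DATAGEN_MAP_MEMORY_MB="${DATAGEN_MAP_MEMORY_MB-8192}"

# Print the extra -D option for the generator, or nothing if the setting
# is empty.
datagen_opts() {
    if [ -n "$DATAGEN_MAP_MEMORY_MB" ]; then
        printf '%s' "-D mapreduce.map.memory.mb=${DATAGEN_MAP_MEMORY_MB}"
    fi
}
```

The generator call could then read `hadoop jar target/*.jar $(datagen_opts) -d ${DIR}/${SCALE}/ -s ${SCALE}`, with `$(datagen_opts)` left unquoted on purpose so that `-D` and its value are passed as separate arguments.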

dharmeshkakadia · 2016-10-14T21:05:03Z

-runcommand "hive -i settings/load-flat.sql -f ddl-tpch/bin_flat/alltables.sql -d DB=tpch_text_${SCALE} -d LOCATION=${DIR}/${SCALE}"
+
+DATABASE=tpch_text_${SCALE}
+CONNECTION_STRING="jdbc:hive2://localhost:10001/$DATABASE;transportMode=http"


Same as above.
Also, maybe we should keep all of these settings in a config file rather than repeating them every time. Repeating them is error-prone.
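A sketch of what a shared config could look like, assuming the settings live in one file that every script sources (for example a `config.sh`, which is an assumed name). The variables `HS2_HOST`, `HS2_PORT`, `HS2_TRANSPORT` and the function `hs2_connection_string` are hypothetical; the defaults reproduce the connection string used in the diff.

```shell
#!/bin/bash
# Hypothetical shared settings; in practice these would live in one file
# that every script sources, so the values are defined exactly once.
HS2_HOST="${HS2_HOST:-localhost}"
HS2_PORT="${HS2_PORT:-10001}"
HS2_TRANSPORT="${HS2_TRANSPORT:-http}"

# Build the HiveServer2 JDBC URL for a given database from the shared settings.
hs2_connection_string() {
    local database="$1"
    printf '%s' "jdbc:hive2://${HS2_HOST}:${HS2_PORT}/${database};transportMode=${HS2_TRANSPORT}"
}
```

Each script could then set `CONNECTION_STRING="$(hs2_connection_string "$DATABASE")"` instead of repeating the literal URL, and pointing the suite at a different host would be a one-line change.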

changes to use hs2 interface and to run suite in a loop

542659b

dharmeshkakadia suggested changes Oct 14, 2016

epkalyanr wants to merge 1 commit into master from tooling-improvements2
