From 293501fd35b20fe725f189b9745a5961fa9f4cde Mon Sep 17 00:00:00 2001 From: kraman88 Date: Thu, 3 Nov 2016 18:29:39 -0700 Subject: [PATCH 1/3] update readme --- README.md | 26 +++++++++++++++++++++++++- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 5dde6c4..c084459 100644 --- a/README.md +++ b/README.md @@ -8,10 +8,34 @@ A testbench for experimenting with Apache Hive at any data scale. You can deploy Kickoff ======= -You can kickoff the 10 iterations of 1 TB run with +* You can kickoff the 10 iterations of 1 TB run with ``curl https://raw.githubusercontent.com/hdinsight/HivePerformanceAutomation/master/tpch-scripts/RunTpch.sh | bash 1000 PASSWORD 10 `` +* You can run one iteration of the suite from the headnode of your cluster by running the following command from hive-testbench/tpch-scripts + +./RunQueriesAndCollectPATData.sh SCALE_FACTOR CLUSTER_SSH_PASSWORD [RUN_ID] +RUN_ID is an optional argument. All the data for a run is collected under the hive-testbench/run_$RUN_ID folder. If the RUN_ID is not supplied then the current timestamp is used as the runId. 
+ +* You can run multiple iterations of the suite from the headnode of your cluster the following command from hive-testbench/tpch-scripts + +./RunSuiteLoop REPEAT_COUNT SCALE_FACTOR CLUSTER_SSH_PASSWORD +for ex: ./RunSuiteLoop 10 1000 H@doop1234 +would run 10 iterations of the TPCH suite for a scale factor of 1000 (1GB) + +* You can Run a Single Query from the headnode of your cluster by running the following command from hive-testbench/tpch-scripts + +./TpchQueryExecute.sh SCALE_FACTOR QUERY_NUMBER [RUN_ID] + +* You can run multiple iterations of a single query by running the following command from hive-testbench/tpch-scripts + +./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALCE_FACTOR CLUSTER_SSH_PASSWORD + +* you can run the following script from hive-testbench/tpch-scripts to get the perf data collection for a run + + ./CollectPerfData.sh RUN_ID [RESULTS_DIR] [PERFDATA_OUTPUTDIR] [SERVER] +RUN_ID is the id of the run for which you want to collect perfdata + Overview ======== From 71715c97f1550502619dc65133926e8d5d070a52 Mon Sep 17 00:00:00 2001 From: kraman88 Date: Thu, 3 Nov 2016 18:37:49 -0700 Subject: [PATCH 2/3] update readme2 --- README.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index c084459..57b11d1 100644 --- a/README.md +++ b/README.md @@ -14,26 +14,26 @@ Kickoff * You can run one iteration of the suite from the headnode of your cluster by running the following command from hive-testbench/tpch-scripts -./RunQueriesAndCollectPATData.sh SCALE_FACTOR CLUSTER_SSH_PASSWORD [RUN_ID] +``./RunQueriesAndCollectPATData.sh SCALE_FACTOR CLUSTER_SSH_PASSWORD [RUN_ID]`` RUN_ID is an optional argument. All the data for a run is collected under the hive-testbench/run_$RUN_ID folder. If the RUN_ID is not supplied then the current timestamp is used as the runId. 
* You can run multiple iterations of the suite from the headnode of your cluster the following command from hive-testbench/tpch-scripts -./RunSuiteLoop REPEAT_COUNT SCALE_FACTOR CLUSTER_SSH_PASSWORD +``./RunSuiteLoop REPEAT_COUNT SCALE_FACTOR CLUSTER_SSH_PASSWORD`` for ex: ./RunSuiteLoop 10 1000 H@doop1234 would run 10 iterations of the TPCH suite for a scale factor of 1000 (1GB) * You can Run a Single Query from the headnode of your cluster by running the following command from hive-testbench/tpch-scripts -./TpchQueryExecute.sh SCALE_FACTOR QUERY_NUMBER [RUN_ID] +``./TpchQueryExecute.sh SCALE_FACTOR QUERY_NUMBER [RUN_ID]`` * You can run multiple iterations of a single query by running the following command from hive-testbench/tpch-scripts -./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALCE_FACTOR CLUSTER_SSH_PASSWORD +``./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALCE_FACTOR CLUSTER_SSH_PASSWORD`` -* you can run the following script from hive-testbench/tpch-scripts to get the perf data collection for a run +* you can run the following script from hive-testbench/tpch-scripts to get the perf data collected for a run - ./CollectPerfData.sh RUN_ID [RESULTS_DIR] [PERFDATA_OUTPUTDIR] [SERVER] + ``./CollectPerfData.sh RUN_ID [RESULTS_DIR] [PERFDATA_OUTPUTDIR] [SERVER]`` RUN_ID is the id of the run for which you want to collect perfdata Overview From 0e75dc0bc853e412fa1711abd9cd624607d768bb Mon Sep 17 00:00:00 2001 From: kalyanr Date: Thu, 3 Nov 2016 18:56:40 -0700 Subject: [PATCH 3/3] Update README.md --- README.md | 32 +++++++++++++++++--------------- 1 file changed, 17 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index 57b11d1..f15e5a8 100644 --- a/README.md +++ b/README.md @@ -10,31 +10,33 @@ Kickoff ======= * You can kickoff the 10 iterations of 1 TB run with -``curl https://raw.githubusercontent.com/hdinsight/HivePerformanceAutomation/master/tpch-scripts/RunTpch.sh | bash 1000 PASSWORD 10 `` + ``curl 
https://raw.githubusercontent.com/hdinsight/HivePerformanceAutomation/master/tpch-scripts/RunTpch.sh | bash 1000 PASSWORD 10 `` -* You can run one iteration of the suite from the headnode of your cluster by running the following command from hive-testbench/tpch-scripts +* You can run one iteration of the suite from the headnode of your cluster by running the following command from ``hive-testbench/tpch-scripts`` -``./RunQueriesAndCollectPATData.sh SCALE_FACTOR CLUSTER_SSH_PASSWORD [RUN_ID]`` -RUN_ID is an optional argument. All the data for a run is collected under the hive-testbench/run_$RUN_ID folder. If the RUN_ID is not supplied then the current timestamp is used as the runId. + ``./RunQueriesAndCollectPATData.sh SCALE_FACTOR CLUSTER_SSH_PASSWORD [RUN_ID]`` -* You can run multiple iterations of the suite from the headnode of your cluster the following command from hive-testbench/tpch-scripts + ``RUN_ID`` is an optional argument. All the data for a run is collected under the ``hive-testbench/run_$RUN_ID`` folder. If the ``RUN_ID`` is not supplied then the current timestamp is used as the runId. 
-``./RunSuiteLoop REPEAT_COUNT SCALE_FACTOR CLUSTER_SSH_PASSWORD`` -for ex: ./RunSuiteLoop 10 1000 H@doop1234 -would run 10 iterations of the TPCH suite for a scale factor of 1000 (1GB) +* You can run multiple iterations of the suite from the headnode of your cluster by running the following command from ``hive-testbench/tpch-scripts`` -* You can Run a Single Query from the headnode of your cluster by running the following command from hive-testbench/tpch-scripts + ``./RunSuiteLoop REPEAT_COUNT SCALE_FACTOR CLUSTER_SSH_PASSWORD`` -``./TpchQueryExecute.sh SCALE_FACTOR QUERY_NUMBER [RUN_ID]`` + For example, ``./RunSuiteLoop 10 1000 H@doop1234`` + would run 10 iterations of the TPCH suite at a scale factor of 1000 (1 TB) -* You can run multiple iterations of a single query by running the following command from hive-testbench/tpch-scripts +* You can run a single query from the headnode of your cluster by running the following command from ``hive-testbench/tpch-scripts`` -``./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALCE_FACTOR CLUSTER_SSH_PASSWORD`` + ``./TpchQueryExecute.sh SCALE_FACTOR QUERY_NUMBER [RUN_ID]`` -* you can run the following script from hive-testbench/tpch-scripts to get the perf data collected for a run +* You can run multiple iterations of a single query by running the following command from ``hive-testbench/tpch-scripts`` - ``./CollectPerfData.sh RUN_ID [RESULTS_DIR] [PERFDATA_OUTPUTDIR] [SERVER]`` -RUN_ID is the id of the run for which you want to collect perfdata + ``./RunSingleQueryLoop QUERY_NUMBER REPEAT_COUNT SCALE_FACTOR CLUSTER_SSH_PASSWORD`` + +* You can run the following script from ``hive-testbench/tpch-scripts`` to get the perf data collected for a run + + ``./CollectPerfData.sh RUN_ID [RESULTS_DIR] [PERFDATA_OUTPUTDIR] [SERVER]`` + ``RUN_ID`` is the ID of the run for which you want to collect perf data Overview ========