//Notes on optimizations:
//Optimizations STAMP and LINEARSTAMP are described in the following papers:
//1) Gowanlock, Michael, and Karsin, Ben. "Accelerating the similarity self-join using the GPU."
//Journal of parallel and distributed computing 133 (2019): 107-123.
//2) Gowanlock, Michael, and Karsin, Ben. "GPU accelerated self-join for the distance similarity metric."
//2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2018.
//Optimizations SORT, REORDER, SHORTCIRCUIT are described in the following paper:
//1) Gowanlock, Michael, and Karsin, Ben. "GPU-Accelerated Similarity Self-Join for Multi-Dimensional Data."
//Proceedings of the 15th International Workshop on Data Management on New Hardware. 2019.
//Optimizations ILP, QUERYREORDER
//are described in a paper under review (to be updated with reference upon acceptance)
//Kernel block size
#define BLOCKSIZE 256
//Number of dimensions of the data (n)
#define GPUNUMDIM 100
//Number of indexed dimensions (k)
#define NUMINDEXEDDIM 6
//data type of the input dataset (float or double)
#define DTYPE double
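//A minimal sketch of compile-time sanity checks for the parameters above (not part of the original file).
//Assumptions: at least one and at most GPUNUMDIM dimensions are indexed (k <= n),
//and BLOCKSIZE stays within CUDA's limit of 1024 threads per block.
#if NUMINDEXEDDIM < 1 || NUMINDEXEDDIM > GPUNUMDIM
#error "NUMINDEXEDDIM (k) must be between 1 and GPUNUMDIM (n)"
#endif
#if BLOCKSIZE < 1 || BLOCKSIZE > 1024
#error "BLOCKSIZE must be between 1 and 1024 threads per block"
#endif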
///////////////////////
//Utility
//used for outputting the neighbortable at the end
#define PRINTNEIGHBORTABLE 0
///////////////////////
///////////////////////
//Optimizations:
//unidirectional comparison (unicomp)
#define STAMP 0
//Another version of unicomp in JPDC2019 paper (worse performance than unicomp)
#define LINEARSTAMP 0
//Sortidu
#define SORT 0
//Reorder the data by dimensionality
#define REORDER 1
//For ILP in distance calculations
//The value is the number of registers/cached elements; 0 (the default) means no ILP
#define ILP 8
//Short circuit the distance calculation
#define SHORTCIRCUIT 1
//Reorder the query points by work
#define QUERYREORDER 1
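//A minimal sketch of consistency checks for the optimization settings (not part of the original file).
//Assumptions (not confirmed by the comments above): ILP is non-negative and no larger than GPUNUMDIM,
//and STAMP and LINEARSTAMP, being two versions of unicomp, are not meant to be enabled together.
//Remove a check if the assumption behind it does not hold for your build.
#if ILP < 0 || ILP > GPUNUMDIM
#error "ILP is assumed to be between 0 and GPUNUMDIM"
#endif
#if STAMP && LINEARSTAMP
#error "Enable at most one of STAMP and LINEARSTAMP"
#endif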
//End optimizations
///////////////////////
///////////////////////
//Flags used in performance evaluations for papers
//Do not use when timing algorithm
#define SEARCHFILTTIME 0
//Counts the number of point comparisons and grid cell searches
//For performance evaluation purposes only; do not use when timing the algorithm
#define COUNTMETRICS 0
//Data type for the above
#define CTYPE unsigned long long
///////////////////////
///////////////////////
//Batching scheme
//Result set buffer size, one buffer of this size per GPU stream
#define GPUBUFFERSIZE 100000000 //Default 100000000
//number of concurrent gpu streams
#define GPUSTREAMS 3 //Default 3
//Minimum number of batches (used to mitigate against pinned memory dominating response time for low epsilon)
//See performance model in JPDC2019 paper
#define MINBATCHES 3 //Default 3
//Fraction of dataset sampled to estimate result set size
//Can be increased if estimate of the total result set is inaccurate.
#define SAMPLERATE 0.015 //Default 0.015
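//A minimal sketch of checks for the batching parameters (not part of the original file).
//Assumptions: the stream count, minimum batch count, and buffer size must each be at least 1,
//and SAMPLERATE, being a fraction of the dataset, must lie in (0, 1].
//The SAMPLERATE check needs static_assert, so it is only compiled under C++11 or later.
#if GPUSTREAMS < 1 || MINBATCHES < 1 || GPUBUFFERSIZE < 1
#error "GPUSTREAMS, MINBATCHES, and GPUBUFFERSIZE must each be at least 1"
#endif
#if defined(__cplusplus) && __cplusplus >= 201103L
static_assert(SAMPLERATE > 0.0 && SAMPLERATE <= 1.0, "SAMPLERATE must lie in (0, 1]");
#endif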
//end batching scheme
///////////////////////