Commit ebd6942

Merge pull request #23 from AquaQAnalytics/schema-documentation-hw
Schema documentation hw
2 parents 715b176 + 22334c1 commit ebd6942

1 file changed

+67
-9
lines changed

README.md

Lines changed: 67 additions & 9 deletions
@@ -1,10 +1,10 @@
11
# TorQ-TAQ
22

3-
The TorQ-TAQ Loader architecture is an extension to TorQ, which efficiently loads NYSE TAQ files using the streaming decompress algorithm with .Q.fpn
3+
The TorQ-TAQ Loader architecture is an extension to TorQ, which efficiently loads NYSE TAQ files using the streaming decompress algorithm with .Q.fpn.
44

5-
# Quick Installisation
5+
# Quick Installation
66

7-
To download TorQ TAQ, get latest installation script and download it to the directory where you want your codebase to live
7+
To download TorQ-TAQ, get the latest installation script and download it to the directory where you want your codebase to live.
88

99
````
1010
wget https://raw.githubusercontent.com/AquaQAnalytics/TorQ-TAQ/master/installlatest.sh
@@ -37,15 +37,15 @@ Once this is complete, the necessary TorQ packages will be installed
3737
To start TorQ TAQ, run the following command within your terminal:
3838

3939
````
40-
$ ./bin/torq.sh start all
40+
$ ./deploy/bin/torq.sh start all
4141
15:36:23 | Starting discovery1...
4242
15:36:23 | Starting gateway1...
4343
15:36:23 | Starting orchestrator1...
4444
15:36:23 | Starting taqloader1...
4545
15:36:24 | Starting taqloader2...
4646
15:36:24 | Starting qmerger1...
4747
48-
$ ./bin/torq.sh summary
48+
$ ./deploy/bin/torq.sh summary
4949
TIME | PROCESS | STATUS | PID | PORT
5050
15:36:56 | discovery1 | up | 21434 | 10610
5151
15:36:57 | gateway1 | up | 21607 | 10611
@@ -66,10 +66,11 @@ Three files are currently supported: trade, quote, and national best bid offer (
6666
How the data is saved depends on whether the recognised file is a trade, quote, or nbbo file.
6767

6868
## Trade/NBBO ##
69-
Trade/NBBO data is laoded to the temporary HDB - tempdb, located in `deploy/tempdb/final/YYYY.MM.DD/trade|nbbo/`
7069

7170
**_NOTE:_** tempdb and hdb directories are created once a TAQ .gz file is loaded in - the paths to these directories are defined within [default.q](appconfig/settings/default.q)
7271

72+
Trade/NBBO data is loaded to the temporary HDB - tempdb, located in `deploy/tempdb/final/YYYY.MM.DD/trade|nbbo/`
73+
7374
By default, when Trade/NBBO data is loaded to the temporary HDB, it lives in the `deploy/tempdb/final/YYYY.MM.DD/` directory until all data from this day has been loaded.
7475

7576
## Quote ##
@@ -83,11 +84,68 @@ When the quote split file has been loaded successfully, it is merged using the m
8384
When the trade, nbbo and all 26 quote split files have been successfully loaded and merged, the orchestrator calls the merge process to run the function that moves all of the loaded and merged data to the final HDB, each into its relevant date partition.
8485

8586
## Support Functionality ##
87+
### Saving to HDB ###
8688

87-
We have included a function called `manualmovetohdb` – This function can be called with arguments `[date;filetype]` in the orchestrator to manually move loaded data to the hdb. date is a date atom and filetype is a symbol or list of symbols (any of trade, quote, or nbbo). By default, data is only moved when all files have been successfully loaded and merged. However, this can be called to move the data at a different point in time.
89+
We have included the `manualmovetohdb` function, which allows you to manually move loaded data to the HDB by calling it with `[date;filetype]` arguments in the orchestrator. The date argument is a date atom and the filetype argument is a symbol or a list of symbols (trade, quote, or nbbo). By default, data is moved only after all files have been successfully loaded and merged. However, this function allows you to move the data at any point in time.
90+
### Changing Table Schema ###
91+
TorQ-TAQ is equipped to handle trade, quote, and national best bid offer (nbbo) files from the NYSE website, as previously noted. The functionality to customize the schema of these tables can be found in [taq.q](code/common/taq.q), enabling users to adjust the column names and datatypes to fit their requirements or to use a different format. This process involves modifying the dictionaries defined in maketaqparams. For example, the trade table schema from the NYSE format can be updated to load trade data from alternative sources that may have distinct columns or datatypes.
92+
93+
Taking the original trade data parameters from `maketaqparams`
94+
````
95+
tradeparams:defaults,(!) . flip (
96+
(`headers;`ticktime`exch`sym`cond`size`price`stop`corr`sequence`tradeid`cts`trf`parttime);
97+
(`types;"JSSSIFBIJICCJ");
98+
(`tablename;`trade);
99+
(`separator;enlist"|");
100+
(`dbdir;hdbdir); // this parameter is defined in the top level taqloader script
101+
(`symdir;symdir); // where we enumerate against
102+
(`tempdb;tempdb);
103+
(`dataprocessfunc;{[params;data] `sym`ticktime`exch`cond`size`price`stop`corr`sequence`cts`trf xcols delete from
104+
(update sym:.Q.fu[{` sv `$" " vs string x}each;sym],ticktime:params[`date]+ timeconverter[ticktime],parttime:params[`date]+ timeconverter[parttime] from data) where null ticktime});
105+
(`date;.z.d)
106+
);
107+
````
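The `defaults,(!) . flip (...)` idiom used in this block can be tried in isolation. The sketch below is a minimal illustration rather than TorQ-TAQ code: `exampledefaults` and its values are hypothetical stand-ins for `defaults`. It shows that the list of key-value pairs becomes a dictionary, and that joining it onto the defaults overrides any matching keys. It is written to be loaded as a script, since the multi-line expression relies on indented continuation lines.

```q
/ hypothetical stand-in for the defaults dictionary (values are made up)
exampledefaults:`separator`date!(enlist",";2000.01.01)

/ flip turns the list of (key;value) pairs into (keys;values); (!) . builds a dictionary from them
overrides:(!) . flip (
  (`tablename;`trade);
  (`separator;enlist"|")
  )

/ join: entries on the right replace entries with the same key on the left
exampleparams:exampledefaults,overrides

key exampleparams         / `separator`date`tablename
exampleparams`separator   / the one-character string enlist"|"
```

Changing a schema therefore only requires editing the relevant pairs, such as `headers` and `types`; any key not listed keeps its value from the defaults.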
108+
In this example, we want to reduce the headers to only the following: ```` `ticktime`exch`sym`cond`size`price`parttime ````.
88109

110+
To achieve this, the following changes were made:
111+
````
112+
tradeparams:defaults,(!) . flip (
113+
(`headers;`ticktime`exch`sym`cond`size`price`parttime);
114+
(`types;"JSSSIF      J");
115+
(`tablename;`trade);
116+
(`separator;enlist"|");
117+
(`dbdir;hdbdir); // this parameter is defined in the top level taqloader script
118+
(`symdir;symdir); // where we enumerate against
119+
(`tempdb;tempdb);
120+
(`dataprocessfunc;{[params;data] `sym`ticktime`exch`cond`size`price xcols delete from
121+
(update sym:.Q.fu[{` sv `$" " vs string x}each;sym],ticktime:params[`date]+ timeconverter[ticktime],parttime:params[`date]+ timeconverter[parttime] from data) where null ticktime});
122+
(`date;.z.d)
123+
);
124+
````
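Which columns are dropped is controlled by the type string: when q parses delimited text, a blank in the type string skips the column in that position, so skipping the six columns between `price` and `parttime` takes six blanks. The standalone sketch below illustrates this on a single row with the 13 columns of the original schema. The row values are invented, and a plain `"|"` delimiter is used (rather than `enlist"|"`) because the sample has no header row.

```q
/ one made-up pipe-delimited row, in the order:
/ ticktime|exch|sym|cond|size|price|stop|corr|sequence|tradeid|cts|trf|parttime
row:"093000123456789|N|A|@|100|10.5|N|0|1|1|N|T|093000123456790"

/ seven typed columns; the six blanks skip stop, corr, sequence, tradeid, cts and trf
cols7:`ticktime`exch`sym`cond`size`price`parttime
parsed:("JSSSIF      J";"|") 0: enlist row

/ pair the parsed column vectors with their names to get a one-row table
t:flip cols7!parsed
cols t   / `ticktime`exch`sym`cond`size`price`parttime
```

The number of non-blank characters in the type string has to match the number of names in `headers`, which is why both parameters are changed together.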
125+
Now once trade data is decompressed and loaded, it will have the following schema:
126+
````
127+
q)meta trade
128+
c | t f a
129+
--------| -----
130+
date | d
131+
sym | s
132+
ticktime| p
133+
exch | s
134+
cond | s
135+
size | i
136+
price | f
137+
parttime| p
138+
````
139+
To clarify, the names of files dropped into the filedrop directory are matched against the patterns in the `runload` function within the orchestrator to determine the file type:
140+
````
141+
filetype: $[
142+
file like "*TRADE*";`trade;
143+
file like "*SPLITS*";`quote;
144+
file like "*NBBO*";`nbbo;
145+
[.lg.e[`fifoloader;errmsg:(string file)," is an unknown or unsupported file type"];'errmsg]];
146+
````
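To see how these patterns classify file names, the conditional can be exercised on its own. The sketch below wraps the same `like` checks in a hypothetical helper, `classify`, which returns the symbol `unknown` instead of logging and signalling an error. The trade file name is the one used in this README's example; the other file names are assumed for illustration.

```q
/ hypothetical helper mirroring the conditional above; not part of TorQ-TAQ
classify:{[file] $[file like "*TRADE*";`trade;file like "*SPLITS*";`quote;file like "*NBBO*";`nbbo;`unknown]}

/ sample file names (only the trade file name appears in this README)
files:("EQY_US_ALL_TRADE_20221003.gz";"SPLITS_US_ALL_BBO_A_20221003.gz";"EQY_US_ALL_NBBO_20221003.gz";"notes.txt")
classify each files   / `trade`quote`nbbo`unknown
```

Because the checks run in order and match anywhere in the name, a file from an alternative source needs a name containing one of these patterns to be picked up by the loader.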
89147
## Example ##
90-
Here we just want to load in the NYSE trade data for date partition 2022.10.03 and manually move it to the HDB
148+
Here we only want to load in NYSE trade data for date 2022.10.03 and manually move it to the HDB.
91149

92150
1. Begin by downloading EQY_US_ALL_TRADE_20221003.gz from the NYSE website directly into your filedrop directory whilst your stacks are up
93151

@@ -163,4 +221,4 @@ tempdb
163221
````
164222

165223

166-
>An overview blog [is here](https://www.aquaq.co.uk/q/torq-taq-a-nyse-taq-loader/), further documentation is in the docs [directory](docs/torqtaqtutorial.md).
224+
>An overview blog [is here](https://www.aquaq.co.uk/q/torq-taq-a-nyse-taq-loader/), further documentation is in the [docs](docs/torqtaqtutorial.md) directory.
