diff --git a/docs/html/Contents.html b/docs/html/Contents.html index 19c7cf7c..245fefba 100644 --- a/docs/html/Contents.html +++ b/docs/html/Contents.html @@ -1,386 +1,205 @@ - - - -Contents.md - - - - - - - - - - - - -

alt text

+ + + + + + + Contents + + + +

alt text

Table of Contents

-

1. Introduction

-

2. Data streaming architecture overview

-

3. Specification

-

4. Reference model

-

4.1 Core components

-

4.2 Output components

-

4.3 Data conversion components

-

4.4 Stream conversion components

-

4.5 Validation components

-

5. Financial Module

-

6. Workflows

-

Appendix A Random numbers

-

Appendix B FM profiles

-

Appendix C Multi-peril model support

- - - +

1. Introduction

+

2. Data streaming architecture overview

+

3. Specification

+

4. Reference model

+

4.1 Core +components

+

4.2 Output +components

+

4.3 ORD output components

+

4.4 Data conversion components

+

4.5 Stream conversion +components

+

4.6 Validation components

+

5. Financial +Module

+

6. Workflows

+

Appendix A Random numbers

+

Appendix B FM +profiles

+

Appendix C Multi-peril model support

+ + diff --git a/docs/html/CoreComponents.html b/docs/html/CoreComponents.html index 1e885f19..93e3e12c 100644 --- a/docs/html/CoreComponents.html +++ b/docs/html/CoreComponents.html @@ -1,444 +1,278 @@ - - - -CoreComponents.md - - - - - - - - - - - - -

alt text

-

4.1 Core Components

+ + + + + + + CoreComponents + + + +

alt text

+

4.1 Core Components +

eve

-
-

eve takes a list of event ids in binary format as input and generates a partition of event ids as a binary data stream, according to the parameters supplied. Events are "shuffled" by being assigned to processes one by one cyclically, rather than being distributed to processes in blocks, in the order they are input. This helps to even the workload by process, in case all the "big" events are together in the same range of the event list.

+
+

eve takes a list of event ids in binary format as input and generates +a partition of event ids as a binary data stream, according to the +parameters supplied. Events are "shuffled" by being assigned to +processes one by one cyclically, rather than being distributed to +processes in blocks, in the order they are input. This helps to even the +workload by process, in case all the "big" events are together in the +same range of the event list.
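As an illustration of that cyclic assignment, here is a minimal Python sketch (not the eve source code; the event list and process numbers are invented):
# Hypothetical sketch of the cyclic ("shuffled") assignment of events to processes.
events = [101, 102, 103, 104, 105, 106, 107]   # event_ids as read from events.bin
def partition(events, process, total_processes):
    # process is 1-based, as in "eve 1 2"; events are dealt out one at a time, cyclically
    return [e for i, e in enumerate(events) if i % total_processes == process - 1]
print(partition(events, 1, 2))   # [101, 103, 105, 107]
print(partition(events, 2, 2))   # [102, 104, 106]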

Output stream
-

The output stream is a simple list of event_ids (4 byte integers).

+

The output stream is a simple list of event_ids (4 byte +integers).

Parameters

Required parameters are;

Optional parameters are;

Usage
-
$ eve [parameters] > [output].bin -$ eve [parameters] | getmodel | gulcalc [parameters] > [stdout].bin -
+
$ eve [parameters] > [output].bin
+$ eve [parameters] | getmodel | gulcalc [parameters] > [stdout].bin
Example
-
$ eve 1 2 > events1_2_shuffled.bin +
$ eve 1 2 > events1_2_shuffled.bin 
 $ eve -n 1 2 > events1_2_unshuffled.bin 
 $ eve -r 1 2 > events1_2_random.bin
-$ eve 1 2 | getmodel | gulcalc -r -S100 -i - > gulcalc1_2.bin
-
-

In this example, the events from the file events.bin will be read into memory and the first half (partition 1 of 2) would be streamed out to binary file, or downstream to a single process calculation workflow.

+$ eve 1 2 | getmodel | gulcalc -r -S100 -i - > gulcalc1_2.bin
+

In this example, the events from the file events.bin will be read +into memory and the first half (partition 1 of 2) will be streamed out +to a binary file, or downstream to a single-process calculation +workflow.

Internal data
-

The program requires an event binary. The file is picked up from the input sub-directory relative to where the program is invoked and has the following filename;

+

The program requires an event binary. The file is picked up from the +input sub-directory relative to where the program is invoked and has the +following filename;

-

The data structure of events.bin is a simple list of event ids (4 byte integers).

+

The data structure of events.bin is a simple list of event ids (4 +byte integers).
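Because the format is so simple, the file can be inspected with a few lines of Python (a sketch, assuming the 4 byte integers are little-endian as on typical x86 systems):
import struct
# events.bin is a flat sequence of 4 byte event_ids with no header.
with open("input/events.bin", "rb") as f:
    data = f.read()
event_ids = struct.unpack("<{}i".format(len(data) // 4), data)
print(event_ids[:10])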

Return to top

getmodel

-
-

getmodel generates a stream of effective damageability distributions (cdfs) from an input list of events. Specifically, it combines the probability distributions from the model files, footprint.bin and vulnerability.bin, to generate effective damageability cdfs for the subset of exposures contained in the items.bin file and converts them into a binary stream.

-

This is reference example of the class of programs which generates the damage distributions for an event set and streams them into memory. It is envisaged that model developers who wish to use the toolkit as a back-end calculator of their existing platforms can write their own version of getmodel, reading in their own source data and converting it into the standard output stream. As long as the standard input and output structures are adhered to, each program can be written in any language and read any input data.

-
Output stream
+
+

getmodel generates a stream of effective damageability distributions +(cdfs) from an input list of events. Specifically, it combines the +probability distributions from the model files, footprint.bin and +vulnerability.bin, to generate effective damageability cdfs for the +subset of exposures contained in the items.bin file and converts them +into a binary stream.

+

This is a reference example of the class of programs which generates +the damage distributions for an event set and streams them into memory. +It is envisaged that model developers who wish to use the toolkit as a +back-end calculator of their existing platforms can write their own +version of getmodel, reading in their own source data and converting it +into the standard output stream. As long as the standard input and +output structures are adhered to, each program can be written in any +language and read any input data.

+
Output stream
- + - + - + - +
Byte 1Byte 1 Bytes 2-4DescriptionDescription
00 1cdf streamcdf stream
-
Parameters
+
Parameters

None

-
Usage
-
$ [stdin component] | getmodel | [stout component] +
Usage
+
$ [stdin component] | getmodel | [stdout component]
 $ [stdin component] | getmodel > [stdout].bin
-$ getmodel < [stdin].bin > [stdout].bin
-
-
Example
-
$ eve 1 1 | getmodel | gulcalc -r -S100 -i gulcalci.bin +$ getmodel < [stdin].bin > [stdout].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -i gulcalci.bin
 $ eve 1 1 | getmodel > getmodel.bin
-$ getmodel < events.bin > getmodel.bin 
-
-
Internal data
-

The program requires the footprint binary and index file for the model, the vulnerability binary model file, and the items file representing the user's exposures. The files are picked up from sub-directories relative to where the program is invoked, as follows;

+$ getmodel < events.bin > getmodel.bin
+
Internal data
+

The program requires the footprint binary and index file for the +model, the vulnerability binary model file, and the items file +representing the user's exposures. The files are picked up from +sub-directories relative to where the program is invoked, as +follows;

-

The getmodel output stream is ordered by event and streamed out in blocks for each event.

+

The getmodel output stream is ordered by event and streamed out in +blocks for each event.

Calculation
-

The program filters the footprint binary file for all areaperil_id's which appear in the items file. This selects the event footprints that impact the exposures on the basis on their location. Similarly the program filters the vulnerability file for vulnerability_id's that appear in the items file. This selects conditional damage distributions which are relevant for the exposures.

-

The intensity distributions from the footprint file and conditional damage distributions from the vulnerability file are convolved for every combination of areaperil_id and vulnerability_id in the items file. The effective damage probabilities are calculated, for each damage bin, by summing the product of conditional damage probabilities with intensity probabilities for each event, areaperil, vulnerability combination across the intensity bins.

-

The resulting discrete probability distributions are converted into discrete cumulative distribution functions 'cdfs'. Finally, the damage bin mid-point from the damage bin dictionary ('interpolation' field) is read in as a new field in the cdf stream as 'bin_mean'. This field is the conditional mean damage for the bin and it is used to choose the interpolation method for random sampling and numerical integration calculations in the gulcalc component.

+

The program filters the footprint binary file for all areaperil_id's +which appear in the items file. This selects the event footprints that +impact the exposures on the basis of their location. Similarly the +program filters the vulnerability file for vulnerability_id's that +appear in the items file. This selects conditional damage distributions +which are relevant for the exposures.

+

The intensity distributions from the footprint file and conditional +damage distributions from the vulnerability file are convolved for every +combination of areaperil_id and vulnerability_id in the items file. The +effective damage probabilities are calculated, for each damage bin, by +summing the product of conditional damage probabilities with intensity +probabilities for each event, areaperil, vulnerability combination +across the intensity bins.
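The summation described above can be sketched as follows (illustrative Python with invented probabilities; getmodel itself performs this on the binary files):
# intensity_prob[i]  : probability of intensity bin i for one event/areaperil (footprint)
# damage_prob[i][d]  : probability of damage bin d given intensity bin i (vulnerability)
intensity_prob = {1: 0.3, 2: 0.7}
damage_prob = {1: {1: 0.8, 2: 0.2}, 2: {1: 0.4, 2: 0.6}}
effective = {}
for i, p_int in intensity_prob.items():
    for d, p_dmg in damage_prob[i].items():
        effective[d] = effective.get(d, 0.0) + p_int * p_dmg
# Accumulate the discrete probabilities into a cdf ordered by damage bin.
cdf, cumulative = {}, 0.0
for d in sorted(effective):
    cumulative += effective[d]
    cdf[d] = cumulative
print(effective)   # approximately {1: 0.52, 2: 0.48}
print(cdf)         # approximately {1: 0.52, 2: 1.0}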

+

The resulting discrete probability distributions are converted into +discrete cumulative distribution functions 'cdfs'. Finally, the damage +bin mid-point from the damage bin dictionary ('interpolation' field) is +read in as a new field in the cdf stream as 'bin_mean'. This field is +the conditional mean damage for the bin and it is used to choose the +interpolation method for random sampling and numerical integration +calculations in the gulcalc component.

Return to top

gulcalc

-
-

The gulcalc program performs Monte Carlo sampling of ground up loss by randomly sampling the cumulative probability of damage from the uniform distribution and generating damage factors by interpolation of the random numbers against the effective damage cdf. Other loss metrics are computed and assigned to special meaning sample index values as descibed below.

-

The sampling methodologies are linear interpolation, quadratic interpolation and point value sampling depending on the damage bin definitions in the input data.

-

Gulcalc also performs back-allocation of total coverage losses to the contributing subperil item losses (for multi-subperil models). This occurs when there are two or more items representing losses from different subperils to the same coverage, such as wind loss and storm surge loss, for example. In these cases, because the subperil losses are generated independently from each other it is possible to result in a total damage ratio greater than 1 for the coverage, or a total loss greated than the Total Insured Value "TIV". Back-allocation ensures that the total loss for a coverage cannot exceed the input TIV.

+
+

The gulcalc program performs Monte Carlo sampling of ground up loss +by randomly sampling the cumulative probability of damage from the +uniform distribution and generating damage factors by interpolation of +the random numbers against the effective damage cdf. Other loss metrics +are computed and assigned to special meaning sample index values as +described below.

+

The sampling methodologies are linear interpolation, quadratic +interpolation and point value sampling depending on the damage bin +definitions in the input data.

+

Gulcalc also performs back-allocation of total coverage losses to the +contributing subperil item losses (for multi-subperil models). This +occurs when there are two or more items representing losses from +different subperils to the same coverage, such as wind loss and storm +surge loss, for example. In these cases, because the subperil losses are +generated independently from each other, they can result in a +total damage ratio greater than 1 for the coverage, or a total loss +greater than the Total Insured Value "TIV". Back-allocation ensures that +the total loss for a coverage cannot exceed the input TIV.

Stream output
- + - + - + - +
Byte 1Byte 1 Bytes 2-4DescriptionDescription
22 1loss streamloss stream
-
Parameters
+
Parameters

Required parameters are;

-

The destination is either a filename or named pipe, or use - for standard output.

+

The destination is either a filename or named pipe, or use - for +standard output.

Optional parameters are;

-
Usage
-
$ [stdin component] | gulcalc [parameters] | [stout component] +
Usage
+
$ [stdin component] | gulcalc [parameters] | [stdout component]
 $ [stdin component] | gulcalc [parameters]
-$ gulcalc [parameters] < [stdin].bin 
-
-
Example
-
$ eve 1 1 | getmodel | gulcalc -R1000000 -S100 -a1 -i - | fmcalc > fmcalc.bin +$ gulcalc [parameters] < [stdin].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -R1000000 -S100 -a1 -i - | fmcalc > fmcalc.bin
 $ eve 1 1 | getmodel | gulcalc -R1000000 -S100 -a1 -i - | summarycalc -i -1 summarycalc1.bin
 $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i gulcalci.bin
-$ gulcalc -r -S100 -i -a1 gulcalci.bin < getmodel.bin 
-
-
Internal data
-

The program requires the damage bin dictionary binary for the static folder and the item and coverage binaries from the input folder. The files are found in the following locations relative to the working directory with the filenames;

+$ gulcalc -r -S100 -i -a1 gulcalci.bin < getmodel.bin
+
Internal data
+

The program requires the damage bin dictionary binary from the static +folder and the item and coverage binaries from the input folder. The +files are found in the following locations relative to the working +directory with the filenames;

-

If the user specifies -r as a parameter, then the program also picks up a random number file from the static directory. The filename is;

+

If the user specifies -r as a parameter, then the program also picks +up a random number file from the static directory. The filename is;

-
Calculation
-

The stdin stream is a block of cdfs which are ordered by event_id, areaperil_id, vulnerability_id and bin_index ascending, from getmodel. The gulcalc program constructs a cdf for each item, based on matching the areaperil_id and vulnerability_id from the stdin and the item file.

+
Calculation
+

The stdin stream is a block of cdfs which are ordered by event_id, +areaperil_id, vulnerability_id and bin_index ascending, from getmodel. +The gulcalc program constructs a cdf for each item, based on matching +the areaperil_id and vulnerability_id from the stdin and the item +file.

Random sampling
-

Random samples are indexed using positive integers starting from 1, called the 'sidx', or sample index.

-

For each item cdf and for the number of samples specified, the program draws a uniformly distributed random number and uses it to sample ground up loss from the cdf using one of three methods, as follows;

-

For a given damage interval corresponding to a cumulative probability interval that each random number falls within;

+

Random samples are indexed using positive integers starting from 1, +called the 'sidx', or sample index.

+

For each item cdf and for the number of samples specified, the +program draws a uniformly distributed random number and uses it to +sample ground up loss from the cdf using one of three methods, as +follows;

+

For a given damage interval corresponding to a cumulative probability +interval that each random number falls within;

An example of the three cases and methods is given below;

- + - + - + - + - + - + - + - +
bin_frombin_from bin_to bin_meanMethod usedMethod used
0.10.1 0.2 0.15Linear interpolationLinear interpolation
0.10.1 0.1 0.1Sample bin valueSample bin value
0.10.1 0.2 0.14Quadratic interpolationQuadratic interpolation
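The rule implied by the example table can be sketched as follows (an interpretation for illustration only, not the gulcalc source code):
def sampling_method(bin_from, bin_to, bin_mean, tol=1e-9):
    # Interpretation of the example table above.
    if bin_from == bin_to:
        return "sample bin value"        # point value bin, e.g. (0.1, 0.1, 0.1)
    if abs(bin_mean - (bin_from + bin_to) / 2.0) < tol:
        return "linear interpolation"    # bin_mean is the mid-point, e.g. (0.1, 0.2, 0.15)
    return "quadratic interpolation"     # bin_mean is off-centre, e.g. (0.1, 0.2, 0.14)
for row in [(0.1, 0.2, 0.15), (0.1, 0.1, 0.1), (0.1, 0.2, 0.14)]:
    print(row, sampling_method(*row))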
-

If the -R parameter is used along with a specified number of random numbers then random numbers used for sampling are generated on the fly for each event and group of items which have a common group_id using the Mersenne twister psuedo random number generator (the default RNG of the C++ v11 compiler). These random numbers are not repeatable, unless a seed is also specified (-s{number}).

-

If the -r parameter is used, gulcalc reads a random number from the provided random number file, which produces repeatable results.

-

The default random number behaviour (no additional parameters) is to generate random numbers from a seed determined by a combination of the event_id and group_id, which produces repeatable results. See Random Numbers for more details.

-

Each sampled damage is multiplied by the item TIV, looked up from the coverage file.

+

If the -R parameter is used along with a specified number of random +numbers then the random numbers used for sampling are generated on the fly +for each event and group of items which have a common group_id using the +Mersenne twister pseudo random number generator (provided by the +C++11 standard library). These random numbers are not repeatable, unless a +seed is also specified (-s{number}).

+

If the -r parameter is used, gulcalc reads a random number from the +provided random number file, which produces repeatable results.

+

The default random number behaviour (no additional parameters) is to +generate random numbers from a seed determined by a combination of the +event_id and group_id, which produces repeatable results. See Random Numbers for more details.

+

Each sampled damage is multiplied by the item TIV, looked up from the +coverage file.

Special samples

Samples with negative indexes have special meanings as follows;

- - + + - - + + - - + + - - + + - - + + - - + +
sidxdescriptionsidxdescription
-1Numerical integration mean-1Numerical integration mean
-2Numerical integration standard deviation-2Numerical integration standard +deviation
-3Impacted exposure-3Impacted exposure
-4Chance of loss-4Chance of loss
-5Maximum loss-5Maximum loss
Allocation method
-

The allocation method determines how item losses are adjusted when a coverage is subject to losses from multiple perils, because the total loss to a coverage from mutiple perils cannot exceed the input TIV. This situation is identified when multiple item_ids in the item file share the same coverage_id. The TIV is held in the coverages file against the coverage_id and the item_id TIV is looked up from its relationship to coverage_id in the item file.

+

The allocation method determines how item losses are adjusted when a +coverage is subject to losses from multiple perils, because the total +loss to a coverage from multiple perils cannot exceed the input TIV. This +situation is identified when multiple item_ids in the item file share +the same coverage_id. The TIV is held in the coverages file against the +coverage_id and the item_id TIV is looked up from its relationship to +coverage_id in the item file.

The allocation methods are as follows;

++++ - - + + - - + + - - + + - - + +
adescriptionadescription
0Pass losses through unadjusted (used for single peril models)0Pass losses through unadjusted (used for +single peril models)
1Sum the losses and cap them to the TIV. Back-allocate TIV to the contributing items in proportion to the unadjusted losses1Sum the losses and cap them to the TIV. +Back-allocate TIV to the contributing items in proportion to the +unadjusted losses
2Keep the maximum subperil loss and set the others to zero. Back-allocate equally when there are equal maximum losses2Keep the maximum subperil loss and set the +others to zero. Back-allocate equally when there are equal maximum +losses
-

The mean, impacted exposure and maximum loss special samples are also subject to these allocation rules. -The impacted exposure value, sidx -3, is always back-allocated equally to the items, for allocation rules 1 and 2, since by definition it is the same value for all items related to the same coverage.

+

The mean, impacted exposure and maximum loss special samples are also +subject to these allocation rules. The impacted exposure value, sidx -3, +is always back-allocated equally to the items, for allocation rules 1 +and 2, since by definition it is the same value for all items related to +the same coverage.
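For example, allocation rule 1 for a single coverage and sample can be sketched in Python as follows (invented losses, purely illustrative):
def back_allocate(item_losses, tiv):
    # Rule 1: cap the summed subperil losses at the TIV and back-allocate
    # in proportion to the unadjusted item losses.
    total = sum(item_losses.values())
    if total <= tiv or total == 0:
        return dict(item_losses)
    return {item: loss * tiv / total for item, loss in item_losses.items()}
# Two subperil items (e.g. wind and surge) sharing one coverage with TIV 100000.
print(back_allocate({1: 80000.0, 2: 60000.0}, 100000.0))
# {1: 57142.85..., 2: 42857.14...}, which sums to the TIV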

Return to top

fmcalc

-
-

fmcalc is the reference implementation of the Oasis Financial Module. It applies policy terms and conditions to the ground up losses and produces loss sample output. It reads in the loss stream from either gulcalc or from another fmcalc and can be called recursively and apply several consecutive sets of policy terms and conditions.

-
Stream output
+
+

fmcalc is the reference implementation of the Oasis Financial Module. +It applies policy terms and conditions to the ground up losses and +produces loss sample output. It reads in the loss stream from either +gulcalc or from another fmcalc and can be called recursively to apply +several consecutive sets of policy terms and conditions.

+
Stream output
- + - + - + - +
Byte 1Byte 1 Bytes 2-4DescriptionDescription
22 1loss streamloss stream
-
Parameters
+
Parameters

Optional parameters are;

-
Usage
-
$ [stdin component] | fmcalc [parameters] | [stout component] +
Usage
+
$ [stdin component] | fmcalc [parameters] | [stdout component]
 $ [stdin component] | fmcalc [parameters] > [stdout].bin
-$ fmcalc [parameters] < [stdin].bin > [stdout].bin
-
-
Example
-
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc -p direct -a2 | summarycalc -f -2 - | eltcalc > elt.csv +$ fmcalc [parameters] < [stdin].bin > [stdout].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc -p direct -a2 | summarycalc -f -2 - | eltcalc > elt.csv
 $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc -p direct -a1 > fmcalc.bin
 $ fmcalc -p ri1 -a2 -S -n < gulcalci.bin > fmcalc.bin
-$ fmcalc -p direct | fmcalc -p ri1 -n | fmcalc -p ri2 -n < gulcalci.bin > fm_ri2_net.bin 
-
-
Internal data
-

For the gulcalc item stream input, the program requires the item, coverage and fm input data files, which are Oasis abstract data objects that describe an insurance or reinsurance programme. This data is picked up from the following files relative to the working directory by default;

+$ fmcalc -p direct | fmcalc -p ri1 -n | fmcalc -p ri2 -n < gulcalci.bin > fm_ri2_net.bin
+
Internal data
+

For the gulcalc item stream input, the program requires the item, +coverage and fm input data files, which are Oasis abstract data objects +that describe an insurance or reinsurance programme. This data is picked +up from the following files relative to the working directory by +default;

-

For loss stream input from either gulcalc or fmcalc, the program requires only the four fm input data files,

+

For loss stream input from either gulcalc or fmcalc, the program +requires only the four fm input data files,

-

The location of the files can be changed by using the -p parameter followed by the path location relative to the present working directory. eg -p ri1

-
Calculation
-

fmcalc passes the loss samples, including the numerical integration mean, sidx -1, and impacted exposure, sidx -3, through a set of financial calculations which are defined by the input files. The special samples -2, -4 and -5 are ignored and dropped in the output. For more information about the calculation see Financial Module

+

The location of the files can be changed by using the -p parameter +followed by the path location relative to the present working directory. +eg -p ri1

+
Calculation
+

fmcalc passes the loss samples, including the numerical integration +mean, sidx -1, and impacted exposure, sidx -3, through a set of +financial calculations which are defined by the input files. The special +samples -2, -4 and -5 are ignored and dropped in the output. For more +information about the calculation see Financial Module

Return to top

summarycalc

-
-

The purpose of summarycalc is firstly to aggregate the samples of loss to a level of interest for reporting, thereby reducing the volume of data in the stream. This is a generic first step which precedes all of the downstream output calculations. Secondly, it unifies the formats of the gulcalc and fmcalc streams, so that they are transformed into an identical stream type for downstream outputs. Finally, it can generate up to 10 summary level outputs in one go, creating multiple output streams or files.

-

The output is similar to the gulcalc or fmcalc input which are losses are by sample index and by event, but the ground up or (re)insurance loss input losses are grouped to an abstract level represented by a summary_id. The relationship between the input identifier and the summary_id are defined in cross reference files called gulsummaryxref and fmsummaryxref.

-
Stream output
+
+

The purpose of summarycalc is firstly to aggregate the samples of +loss to a level of interest for reporting, thereby reducing the volume +of data in the stream. This is a generic first step which precedes all +of the downstream output calculations. Secondly, it unifies the formats +of the gulcalc and fmcalc streams, so that they are transformed into an +identical stream type for downstream outputs. Finally, it can generate +up to 10 summary level outputs in one go, creating multiple output +streams or files.

+

The output is similar to the gulcalc or fmcalc input, in which losses +are by sample index and by event, but the ground up or (re)insurance +input losses are grouped to an abstract level represented by a +summary_id. The relationship between the input identifier and the +summary_id is defined in cross reference files called +gulsummaryxref and fmsummaryxref.

+
Stream output
- + - + - + - +
Byte 1Byte 1 Bytes 2-4DescriptionDescription
33 1summary streamsummary stream
-
Parameters
-

The input stream should be identified explicitly as -i input from gulcalc or -f input from fmcalc.

-

summarycalc supports up to 10 concurrent outputs. This is achieved by explictly directing each output to a named pipe, file, or to standard output.

-

For each output stream, the following tuple of parameters must be specified for at least one summary set;

+
Parameters
+

The input stream should be identified explicitly as -i input from +gulcalc or -f input from fmcalc.

+

summarycalc supports up to 10 concurrent outputs. This is achieved by +explicitly directing each output to a named pipe, file, or to standard +output.

+

For each output stream, the following tuple of parameters must be +specified for at least one summary set;

For example the following parameter choices are valid;

-
$ summarycalc -i -1 - -'outputs results for summaryset 1 to standard output +
$ summarycalc -i -1 -                                       
+'outputs results for summaryset 1 to standard output
 $ summarycalc -i -1 summarycalc1.bin                        
-'outputs results for summaryset 1 to a file (or named pipe)
+'outputs results for summaryset 1 to a file (or named pipe)
 $ summarycalc -i -1 summarycalc1.bin -2 summarycalc2.bin    
-'outputs results for summaryset 1 and 2 to a file (or named pipe)
-
-

Note that the summaryset_id relates to a summaryset_id in the required input data file gulsummaryxref.bin or fmsummaryxref.bin for a gulcalc input stream or a fmcalc input stream, respectively, and represents a user specified summary reporting level. For example summaryset_id = 1 represents portfolio level, summaryset_id = 2 represents zipcode level and summaryset_id 3 represents site level.

-
Usage
-
$ [stdin component] | summarycalc [parameters] | [stdout component] +'outputs results for summaryset 1 and 2 to a file (or named pipe)
+

Note that the summaryset_id relates to a summaryset_id in the +required input data file gulsummaryxref.bin or +fmsummaryxref.bin for a gulcalc input stream or a +fmcalc input stream, respectively, and represents a user specified +summary reporting level. For example summaryset_id = 1 represents +portfolio level, summaryset_id = 2 represents zipcode level and +summaryset_id 3 represents site level.

+
Usage
+
$ [stdin component] | summarycalc [parameters] | [stdout component]
 $ [stdin component] | summarycalc [parameters]
-$ summarycalc [parameters] < [stdin].bin
-
-
Example
-
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | eltcalc > eltcalc.csv +$ summarycalc [parameters] < [stdin].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | eltcalc > eltcalc.csv
 $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 gulsummarycalc.bin 
 $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc | summarycalc -f -1 fmsummarycalc.bin 
-$ summarycalc -f -1 fmsummarycalc.bin < fmcalc.bin
-
-
Internal data
-

The program requires the gulsummaryxref file for gulcalc input (-i option), or the fmsummaryxref file for fmcalc input (-f option). This data is picked up from the following files relative to the working directory;

+$ summarycalc -f -1 fmsummarycalc.bin < fmcalc.bin
+
Internal data
+

The program requires the gulsummaryxref file for gulcalc input (-i +option), or the fmsummaryxref file for fmcalc input (-f option). This +data is picked up from the following files relative to the working +directory;

-
Calculation
-

summarycalc takes either ground up loss from gulcalc or financial loss samples from fmcalc as input and aggregates them to a user-defined summary reporting level. The output is similar to the input, individual losses by sample index and by event, but the ground up or financial losses are summed to an abstract level represented by a summary_id. The relationship between the input identifier, item_id for gulcalc or output_id for fmcalc, and the summary_id are defined in the input files.

+
Calculation
+

summarycalc takes either ground up loss from gulcalc or financial +loss samples from fmcalc as input and aggregates them to a user-defined +summary reporting level. The output is similar to the input, individual +losses by sample index and by event, but the ground up or financial +losses are summed to an abstract level represented by a summary_id. The +relationship between the input identifier, item_id for gulcalc or +output_id for fmcalc, and the summary_id are defined in the input +files.
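Conceptually the aggregation is a grouped sum, sketched here in Python with invented values (summarycalc itself operates on the binary streams):
from collections import defaultdict
# item_id -> summary_id, as defined in gulsummaryxref (or output_id -> summary_id in fmsummaryxref)
item_to_summary = {1: 1, 2: 1, 3: 2}
# losses keyed by (event_id, sidx, item_id), as in the gulcalc stream
losses = {(1, 1, 1): 100.0, (1, 1, 2): 50.0, (1, 1, 3): 25.0}
summary = defaultdict(float)
for (event_id, sidx, item_id), loss in losses.items():
    summary[(event_id, sidx, item_to_summary[item_id])] += loss
print(dict(summary))   # {(1, 1, 1): 150.0, (1, 1, 2): 25.0}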

The special samples are computed as follows;

Return to top

-

Go to 4.2 Output Components section

-

Back to Contents

- - - +

Go to 4.2 Output Components +section

+

Back to Contents

+ + diff --git a/docs/html/DataConversionComponents.html b/docs/html/DataConversionComponents.html index b59b3115..bae0dd68 100644 --- a/docs/html/DataConversionComponents.html +++ b/docs/html/DataConversionComponents.html @@ -1,1913 +1,2456 @@ - - - -DataConversionComponents.md - - - - - - - - - - - - -

alt text

-

4.4 Data conversion components

-

The following components convert input data in csv format to the binary format required by the calculation components in the reference model;

+ + + + + + + DataConversionComponents + + + +

alt text

+

4.4 Data conversion components +

+

The following components convert input data in csv format to the +binary format required by the calculation components in the reference +model;

Static data

-

A reference intensity bin dictionary csv should also exist, although there is no conversion component for this file because it is not needed for calculation purposes.

+

A reference intensity bin dictionary csv +should also exist, although there is no conversion component for this +file because it is not needed for calculation purposes.

Input data

-

These components are intended to allow users to generate the required input binaries from csv independently of the original data store and technical environment. All that needs to be done is first generate the csv files from the data store (SQL Server database, etc).

-

The following components convert the binary input data required by the calculation components in the reference model into csv format;

+

These components are intended to allow users to generate the required +input binaries from csv independently of the original data store and +technical environment. All that needs to be done is first generate the +csv files from the data store (SQL Server database, etc).

+

The following components convert the binary input data required by +the calculation components in the reference model into csv format;

Static data

Input data

-

These components are provided for the convenience of viewing the data and debugging.

+

These components are provided for the convenience of viewing the data +and debugging.

Static data

+

+

aggregate vulnerability

+
+

The aggregate vulnerability file is required for the gulmc component. +It maps each aggregate_vulnerability_id to the vulnerability_ids that +make up the aggregate. This file must have the following location and +filename;

+ +
File format
+

The csv file should contain the following fields and include a header +row.

+ +++++++ + + + + + + + + + + + + + + + + + + + + + + + + + +
NameTypeBytesDescriptionExample
aggregate_vulnerability_idint4Oasis aggregate vulnerability_id45
vulnerability_idint4Oasis vulnerability_id45
+

If this file is present, the weights.bin or weights.csv file must +also be present. The data should not contain nulls.

+
aggregatevulnerabilitytobin
+
$ aggregatevulnerabilitytobin < aggregate_vulnerability.csv > aggregate_vulnerability.bin
+
aggregatevulnerabilitytocsv
+
$ aggregatevulnerabilitytocsv < aggregate_vulnerability.bin > aggregate_vulnerability.csv
+

Return to top

damage bin dictionary

-
-

The damage bin dictionary is a reference table in Oasis which defines how the effective damageability cdfs are discretized on a relative damage scale (normally between 0 and 1). It is required by getmodel and gulcalc and must have the following location and filename;

+
+

The damage bin dictionary is a reference table in Oasis which defines +how the effective damageability cdfs are discretized on a relative +damage scale (normally between 0 and 1). It is required by getmodel and +gulcalc and must have the following location and filename;

-
File format
-

The csv file should contain the following fields and include a header row.

+
File format
+

The csv file should contain the following fields and include a header +row.

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
bin_indexbin_index int 4Identifier of the damage bin1Identifier of the damage bin1
bin_frombin_from float 4Lower damage threshold for the bin0.01Lower damage threshold for the bin0.01
bin_tobin_to float 4Upper damage threshold for the bin0.02Upper damage threshold for the bin0.02
interpolationinterpolation float 4Interpolation damage value for the bin (usually the mid-point)0.015Interpolation damage value for the bin +(usually the mid-point)0.015
-

The interval_type field has been deprecated and will be filled with zeros in the binary file. It does not need to be included as the final column in the csv file:

+

The interval_type field has been deprecated and will be filled with +zeros in the binary file. It does not need to be included as the final +column in the csv file:

+++++++ - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
interval_typeinterval_type int 4Identifier of the interval type, e.g. closed, open (deprecated)0Identifier of the interval type, e.g. +closed, open (deprecated)0
-

The data should be ordered by bin_index ascending and not contain nulls. The bin_index should be a contiguous sequence of integers starting from 1.

+

The data should be ordered by bin_index ascending and not contain +nulls. The bin_index should be a contiguous sequence of integers +starting from 1.
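As an illustration of what damagebintobin produces, the following Python sketch writes one record per csv row, assuming the binary holds the four csv fields plus the zero-filled, deprecated interval_type as 4 byte little-endian values (an assumption for illustration; the converters below are the reference):
import csv, struct
with open("damage_bin_dict.csv", newline="") as src, open("damage_bin_dict.bin", "wb") as dst:
    for row in csv.DictReader(src):
        dst.write(struct.pack("<ifffi",
                              int(row["bin_index"]),
                              float(row["bin_from"]),
                              float(row["bin_to"]),
                              float(row["interpolation"]),
                              0))   # deprecated interval_type written as zero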

damagebintobin
-
$ damagebintobin < damage_bin_dict.csv > damage_bin_dict.bin -
-

Validation checks on the damage bin dictionary csv file are conducted by default during conversion to binary format. These can be suppressed with the -N argument:

-
$ damagebintobin -N < damage_bin_dict.csv > damage_bin_dict.bin -
+
$ damagebintobin < damage_bin_dict.csv > damage_bin_dict.bin
+

Validation checks on the damage bin dictionary csv file are conducted +by default during conversion to binary format. These can be suppressed +with the -N argument:

+
$ damagebintobin -N < damage_bin_dict.csv > damage_bin_dict.bin
damagebintocsv
-
$ damagebintocsv < damage_bin_dict.bin > damage_bin_dict.csv -
-

The deprecated interval_type field can be sent to the output using the -i argument:

-
$ damagebintocsv -i < damage_bin_dict.bin > damage_bin_dict.csv -
+
$ damagebintocsv < damage_bin_dict.bin > damage_bin_dict.csv
+

The deprecated interval_type field can be sent to the output using +the -i argument:

+
$ damagebintocsv -i < damage_bin_dict.bin > damage_bin_dict.csv

Return to top

intensity bin dictionary

-

The intensity bin dictionary defines the meaning of the bins of the hazard intensity measure. The hazard intensity measure could be flood depth, windspeed, peak ground acceleration etc, depending on the type of peril. The range of hazard intensity values in the model is discretized into bins, each with a unique and contiguous bin_index listed in the intensity bin dictionary. The bin_index is used as a reference in the footprint file (field intensity_bin_index) to specify the hazard intensity for each event and areaperil.

-

This file is for reference only as it is not used in the calculation so there is no component to convert it to binary format.

-

The csv file should contain the following fields and include a header row.

+

The intensity bin dictionary defines the meaning of the bins of the +hazard intensity measure. The hazard intensity measure could be flood +depth, windspeed, peak ground acceleration etc, depending on the type of +peril. The range of hazard intensity values in the model is discretized +into bins, each with a unique and contiguous bin_index listed in the +intensity bin dictionary. The bin_index is used as a reference in the +footprint file (field intensity_bin_index) to specify the hazard +intensity for each event and areaperil.

+

This file is for reference only as it is not used in the calculation +so there is no component to convert it to binary format.

+

The csv file should contain the following fields and include a header +row.

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
bin_indexbin_index int 4Identifier of the intensity bin1Identifier of the intensity bin1
bin_frombin_from float 4Lower intensity threshold for the bin56Lower intensity threshold for the bin56
bin_tobin_to float 4Upper intensity threshold for the bin57Upper intensity threshold for the bin57
interpolationinterpolation float 4Mid-point intensity value for the bin0.015Mid-point intensity value for the bin0.015
interval_typeinterval_type int 4Identifier of the interval type, e.g. closed, open1Identifier of the interval type, e.g. +closed, open1
-

The data should be ordered by bin_index ascending and not contain nulls. The bin_index should be a contiguous sequence of integers starting from 1.

+

The data should be ordered by bin_index ascending and not contain +nulls. The bin_index should be a contiguous sequence of integers +starting from 1.

Return to top

footprint

-
-

The event footprint is required for the getmodel component, as well as an index file containing the starting positions of each event block. These must have the following location and filenames;

+
+

The event footprint is required for the getmodel component, as well +as an index file containing the starting positions of each event block. +These must have the following location and filenames;

-
File format
-

The csv file should contain the following fields and include a header row.

- +
File format
+

The csv file should contain the following fields and include a header +row.

+
+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
event_idevent_id int 4Oasis event_id1Oasis event_id1
areaperil_idareaperil_id int 4Oasis areaperil_id4545Oasis areaperil_id4545
intensity_bin_indexintensity_bin_index int 4Identifier of the intensity bin10Identifier of the intensity bin10
probprob float 4The probability mass for the intensity bin between 0 and 10.765The probability mass for the intensity bin +between 0 and 10.765
-

The data should be ordered by event_id, areaperil_id and not contain nulls.

+

The data should be ordered by event_id, areaperil_id and not contain +nulls.

footprinttobin
-
$ footprinttobin -i {number of intensity bins} < footprint.csv -
-

This command will create a binary file footprint.bin and an index file footprint.idx in the working directory. The number of intensity bins is the maximum value of intensity_bin_index.

-

Validation checks on the footprint csv file are conducted by default during conversion to binary format. These can be suppressed with the -N argument.

-
$ footprinttobin -i {number of intensity bins} -N < footprint.csv -
-

There is an additional parameter -n, which should be used when there is only one record per event_id and areaperil_id, with a single intensity_bin_index value and prob = 1. This is the special case 'no hazard intensity uncertainty'. In this case, the usage is as follows.

-
$ footprinttobin -i {number of intensity bins} -n < footprint.csv -
-

Both parameters -i and -n are held in the header of the footprint.bin and used in getmodel.

-

The output binary and index file names can be explicitly set using the -b and --x flags respectively:

-
$ footprinttobin -i {number of intensity bins} -b {output footprint binary file name} -x {output footprint index file name} < footprint.csv -
-

Both output binary and index file names must be given to use this option.

-

In the case of very large footprint files, it may be preferrable to compress the data as it is written to the binary file. Compression is performed using zlib by issuing the -z flag. If the -u flag is used in addition, the index file will include the uncompressed data size. It is recommended to use the -u flag to prevent any memory issues during decompression with getmodel or footprinttocsv:

-
$ footprinttobin -i {number of intensity bins} -z < footprint.csv -$ footprinttobin -i {number of intensity bins} -z -u < footprint.csv -
-

The value of the -u parameter is held in the same location as -n in the header of the footprint.bin file, left-shifted by 1.

+
$ footprinttobin -i {number of intensity bins} < footprint.csv
+

This command will create a binary file footprint.bin and an index +file footprint.idx in the working directory. The number of intensity +bins is the maximum value of intensity_bin_index.

+

Validation checks on the footprint csv file are conducted by default +during conversion to binary format. These can be suppressed with the -N +argument:

+
$ footprinttobin -i {number of intensity bins} -N < footprint.csv
+

There is an additional parameter -n, which should be used when there +is only one record per event_id and areaperil_id, with a single +intensity_bin_index value and prob = 1. This is the special case 'no +hazard intensity uncertainty'. In this case, the usage is as +follows.

+
$ footprinttobin -i {number of intensity bins} -n < footprint.csv
+

Both parameters -i and -n are held in the header of the footprint.bin +and used in getmodel.

+

The output binary and index file names can be explicitly set using +the -b and -x flags respectively:

+
$ footprinttobin -i {number of intensity bins} -b {output footprint binary file name} -x {output footprint index file name} < footprint.csv
+

Both output binary and index file names must be given to use this +option.

+

In the case of very large footprint files, it may be preferable to +compress the data as it is written to the binary file. Compression is +performed using zlib by issuing the -z +flag. If the -u flag is used in addition, the index file will include +the uncompressed data size. It is recommended to use the -u flag to +prevent any memory issues during decompression with getmodel or +footprinttocsv:

+
$ footprinttobin -i {number of intensity bins} -z < footprint.csv
+$ footprinttobin -i {number of intensity bins} -z -u < footprint.csv
+

The value of the -u parameter is held in the same location as -n in +the header of the footprint.bin file, left-shifted by 1.
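A hedged sketch of that bit packing (an interpretation of the sentence above only, not a description of the full header layout):
# Assumed: bit 0 carries -n (no intensity uncertainty), bit 1 carries -u (uncompressed sizes).
def pack_flags(no_uncertainty, uncompressed_size):
    return int(no_uncertainty) | (int(uncompressed_size) << 1)
def unpack_flags(field):
    return bool(field & 1), bool(field & 2)
print(pack_flags(True, True))   # 3
print(unpack_flags(3))          # (True, True)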

footprinttocsv
-
$ footprinttocsv > footprint.csv -
-

footprinttocsv requires a binary file footprint.bin and an index file footprint.idx to be present in the working directory.

-

Input binary and index file names can be explicitly set using the -b and -x flags respectively:

-
$ footprinttocsv -b {input footprint binary file name} -x {input footprint index file name} > footprint.csv -
-

Both input binary and index file name must be given to use this option.

-

Footprint binary files that contain compressed data require the -z argument to be issued:

-
$ footprinttocsv -z > footprint.csv -
+
$ footprinttocsv > footprint.csv
+

footprinttocsv requires a binary file footprint.bin and an index file +footprint.idx to be present in the working directory.

+

Input binary and index file names can be explicitly set using the -b +and -x flags respectively:

+
$ footprinttocsv -b {input footprint binary file name} -x {input footprint index file name} > footprint.csv
+

Both input binary and index file names must be given to use this +option.

+

Footprint binary files that contain compressed data require the -z +argument to be issued:

+
$ footprinttocsv -z > footprint.csv
+

Return to top

+

+

Loss Factors

+
+

The lossfactors binary maps event_id/amplification_id pairs to +post loss amplification factors, and is supplied by the model providers. +The first 4 bytes are reserved for future use and the data format is as +follows. It is required by the Post Loss Amplification (PLA) workflow and +must have the following location and filename;

+ +

File format

+

The csv file should contain the following fields and include a header +row.

+ +++++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
NameTypeBytesDescriptionExample
event_idint4Event ID1
countint4Number of amplification IDs associated +with the event ID1
amplification_idint4Amplification ID1
factorfloat4The uplift factor1.01
+

None of the fields may contain null values. The csv file does not +contain the count field; the conversion tools add it when writing the +binary and remove it when converting back to csv.
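A sketch of how the count is derived during conversion, under the assumption that the csv rows are grouped by event_id and that the binary layout is exactly the reserved 4 bytes followed by the fields in the table above (illustrative only; lossfactorstobin below is the reference):
import csv, struct
from itertools import groupby
with open("lossfactors.csv", newline="") as src, open("lossfactors.bin", "wb") as dst:
    dst.write(struct.pack("<i", 0))   # first 4 bytes reserved for future use
    rows = csv.DictReader(src)
    for event_id, group in groupby(rows, key=lambda r: int(r["event_id"])):
        group = list(group)
        dst.write(struct.pack("<ii", event_id, len(group)))   # event_id and its count
        for r in group:
            dst.write(struct.pack("<if", int(r["amplification_id"]), float(r["factor"])))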

+
lossfactorstobin
+
$ lossfactorstobin < lossfactors.csv > lossfactors.bin
+
lossfactorstocsv
+
$ lossfactorstocsv < lossfactors.bin > lossfactors.csv

Return to top

Random numbers

-
-

A random number file may be provided for the gulcalc component as an option (using gulcalc -r parameter) The random number binary contains a list of random numbers used for ground up loss sampling in the kernel calculation. It must have the following location and filename;

+
+

A random number file may be provided for the gulcalc component as an +option (using the gulcalc -r parameter). The random number binary contains a +list of random numbers used for ground up loss sampling in the kernel +calculation. It must have the following location and filename;

-

If the gulcalc -r parameter is not used, the random number binary is not required and random numbers are instead generated dynamically during the calculation, using the -R parameter to specify how many should be generated.

-

The random numbers can be imported from a csv file using the component randtobin.

-
File format
-

The csv file should contain a simple list of random numbers and include a header row.

+

If the gulcalc -r parameter is not used, the random number binary is +not required and random numbers are instead generated dynamically during +the calculation, using the -R parameter to specify how many should be +generated.

+

The random numbers can be imported from a csv file using the +component randtobin.

+
File format
+

The csv file should contain a simple list of random numbers and +include a header row.

+++++++ - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
randrand float 4Number between 0 and 10.75875Number between 0 and 10.75875
randtobin
-
$ randtobin < random.csv > random.bin -
+
$ randtobin < random.csv > random.bin
randtocsv
-

There are a few parameters available which allow the generation of a random number csv file as follows;

+

There are a few parameters available which allow the generation of a +random number csv file as follows;

-
$ randtocsv -r < random.bin > random.csv +
$ randtocsv -r < random.bin > random.csv
 $ randtocsv -g 1000000 > random.csv
-$ randtocsv -g 1000000 -S 1234 > random.csv
-
-

The -S {seed value} option produces repeatable random numbers, whereas usage of -g alone will generate a different set every time.

+$ randtocsv -g 1000000 -S 1234 > random.csv
+

The -S {seed value} option produces repeatable random numbers, +whereas usage of -g alone will generate a different set every time.

Return to top

vulnerability

-
-

The vulnerability file is required for the getmodel component. It contains the conditional distributions of damage for each intensity bin and for each vulnerability_id. This file must have the following location and filename;

+
+

The vulnerability file is required for the getmodel component. It +contains the conditional distributions of damage for each intensity bin +and for each vulnerability_id. This file must have the following +location and filename;

-
File format
-

The csv file should contain the following fields and include a header row.

+
File format
+

The csv file should contain the following fields and include a header +row.

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
vulnerability_idvulnerability_id int 4Oasis vulnerability_id45Oasis vulnerability_id45
intensity_bin_indexintensity_bin_index int 4Identifier of the hazard intensity bin10Identifier of the hazard intensity +bin10
damage_bin_indexdamage_bin_index int 4Identifier of the damage bin20Identifier of the damage bin20
probprob float 4The probability mass for the damage bin0.186The probability mass for the damage +bin0.186
-

The data should be ordered by vulnerability_id, intensity_bin_index and not contain nulls.

+

The data should be ordered by vulnerability_id, intensity_bin_index +and not contain nulls.

vulnerabilitytobin
-
$ vulnerabilitytobin -d {number of damage bins} < vulnerability.csv > vulnerability.bin -
-

The parameter -d number of damage bins is the maximum value of damage_bin_index. This is held in the header of vulnerability.bin and used by getmodel.

-

Validation checks on the vulnerability csv file are conducted by default during conversion to binary format. These can be suppressed with the -N argument:

-
$ vulnerabilitytobin -d {number of damage bins} -N < vulnerability.csv > vulnerability.bin -
-

In the case of very large vulnerability files, it may be preferrable to create an index file to improve performance. Issuing the -i flag creates vulnerability.bin and vulnerability.idx in the current working directory:

-
$ vulnerabilitytobin -d {number of damage bins} -i < vulnerability.csv -
-

Additionally, the data can be compressed as it is written to the binary file. Compression is performed with zlib by issuing the -z flag. This creates vulnerability.bin.z and vulnerability.idx.z in the current working directory:

-
$ vulnerabilitytobin -d {number of damage bins} -i < vulnerability.csv -
-

The getmodel component will look for the presence of index files in the following order to determine which algorithm to use to extract data from vulnerability.bin:

+
$ vulnerabilitytobin -d {number of damage bins} < vulnerability.csv > vulnerability.bin
+

The parameter -d number of damage bins is the maximum value of +damage_bin_index. This is held in the header of vulnerability.bin and +used by getmodel.

+

Validation checks on the vulnerability csv file are conducted by +default during conversion to binary format. These can be suppressed with +the -N argument:

+
$ vulnerabilitytobin -d {number of damage bins} -N < vulnerability.csv > vulnerability.bin
+

In the case of very large vulnerability files, it may be preferable +to create an index file to improve performance. Issuing the -i flag +creates vulnerability.bin and vulnerability.idx in the current working +directory:

+
$ vulnerabilitytobin -d {number of damage bins} -i < vulnerability.csv
+

Additionally, the data can be compressed as it is written to the +binary file. Compression is performed with zlib by issuing the -z flag. This creates +vulnerability.bin.z and vulnerability.idx.z in the current working +directory:

+
$ vulnerabilitytobin -d {number of damage bins} -i -z < vulnerability.csv
+

The getmodel component will look for the presence of index files in +the following order to determine which algorithm to use to extract data +from vulnerability.bin:

  1. static/vulnerability.idx.z
  2. static/vulnerability.idx
vulnerabilitytocsv
-
$ vulnerabilitytocsv < vulnerability.bin > vulnerability.csv +
$ vulnerabilitytocsv < vulnerability.bin > vulnerability.csv
 $ vulnerabilitytocsv -i > vulnerability.csv
-$ vulnerabilitytocsv -z > vulnerability.csv
-
+$ vulnerabilitytocsv -z > vulnerability.csv +

Return to top

+

+

Weights

+
+

The vulnerability weights binary contains the weighting of each +vulnerability function within each areaperil ID. The data format is as +follows. It is required by gulmc together with the aggregate_vulnerability file +and must have the following location and filename;

+ +

File format

+

The csv file should contain the following fields and include a header +row.

+ +++++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
NameTypeBytesDescriptionExample
areaperil_idint4Areaperil ID1
vulnerability_idint4Vulnerability ID1
weightfloat4The weighting factor1.0
+

All fields must not have null values.

+
weightstobin
+
$ weightstobin < weights.csv > weights.bin
+
weightstocsv
+
$ weightstocsv < weights.bin > weights.csv

Return to top

Input data

+

+

Amplifications

+
+

The amplifications binary contains the list of item IDs mapped to +amplification IDs. The data format is as follows. It is required by the Post +Loss Amplification (PLA) workflow and must have the following location and +filename;

+ +

File format

+

The csv file should contain the following fields and include a header +row.

+ +++++++ + + + + + + + + + + + + + + + + + + + + + + + + + +
NameTypeBytesDescriptionExample
item_idint4Item ID1
amplification_idint4Amplification ID1
+

The item_ids must start from 1, be contiguous and not contain +null values. The binary file contains only the amplification IDs; the +item_ids are implicit, assumed to start from 1 and be contiguous.
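Under that assumption (amplification IDs only, in item_id order), a reader sketch might look like this (illustrative Python; amplificationtocsv below is the reference converter):
import struct
with open("input/amplifications.bin", "rb") as f:
    data = f.read()
amplification_ids = struct.unpack("<{}i".format(len(data) // 4), data)
# item_ids are implicit: 1, 2, 3, ... in file order
item_to_amplification = {i: a for i, a in enumerate(amplification_ids, start=1)}
print(list(item_to_amplification.items())[:5])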

+
amplificationtobin
+
$ amplificationtobin < amplifications.csv > amplifications.bin
+
amplificationtocsv
+
$ amplificationtocsv < amplifications.bin > amplifications.csv
+

Return to top

Coverages

-
-

The coverages binary contains the list of coverages and the coverage TIVs. The data format is as follows. It is required by gulcalc and fmcalc and must have the following location and filename;

+
+

The coverages binary contains the list of coverages and the coverage +TIVs. The data format is as follows. It is required by gulcalc and +fmcalc and must have the following location and filename;

-

File format

-

The csv file should contain the following fields and include a header row.

+

File format

+

The csv file should contain the following fields and include a header +row.

+++++++ - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
coverage_idcoverage_id int 4Identifier of the coverage1Identifier of the coverage1
tivtiv float 4The total insured value of the coverage200000The total insured value of the +coverage200000
-

Coverage_id must be an ordered contiguous sequence of numbers starting at 1.

+

Coverage_id must be an ordered contiguous sequence of numbers +starting at 1.

coveragetobin
-
$ coveragetobin < coverages.csv > coverages.bin -
+
$ coveragetobin < coverages.csv > coverages.bin
coveragetocsv
-
$ coveragetocsv < coverages.bin > coverages.csv -
+
$ coveragetocsv < coverages.bin > coverages.csv
+

Return to top

+

+

ensemble

+
+

The ensemble file is used for ensemble modelling (multiple views); it +maps sample IDs to particular ensemble ID groups. It is an +optional file for use with AAL and LEC. It must have the following +location and filename;

+ +
File format
+

The csv file should contain a list of event_ids (integers) and +include a header.

+ +++++++ + + + + + + + + + + + + + + + + + + + + + + + + + +
NameTypeBytesDescriptionExample
sidxint4Sample ID1
ensemble_idint4Ensemble ID1
+
ensembletobin
+
$ ensembletobin < ensemble.csv > ensemble.bin
+
ensembletocsv
+
$ ensembletocsv < ensemble.bin > ensemble.csv
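
For illustration, a hypothetical ensemble.csv assigning four samples to two ensembles, and its conversion:

$ cat ensemble.csv
sidx,ensemble_id
1,1
2,1
3,2
4,2
$ ensembletobin < ensemble.csv > ensemble.bin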

Return to top

events

One or more event binaries are required by eve. They must have the following location and filename;

File format

The csv file should contain a list of event_ids (integers) and include a header.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| event_id | int | 4 | Oasis event_id | 4545 |

evetobin

$ evetobin < events.csv > events.bin

evetocsv

$ evetocsv < events.bin > events.csv

Return to top

items

The items binary contains the list of exposure items for which ground up loss will be sampled in the kernel calculations. The data format is as follows. It is required by gulcalc and outputcalc and must have the following location and filename;

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| item_id | int | 4 | Identifier of the exposure item | 1 |
| coverage_id | int | 4 | Identifier of the coverage | 3 |
| areaperil_id | int | 4 | Identifier of the locator and peril | 4545 |
| vulnerability_id | int | 4 | Identifier of the vulnerability distribution | 645 |
| group_id | int | 4 | Identifier of the correlation group | 3 |

The data should be ordered by areaperil_id and vulnerability_id ascending and must not contain nulls. item_id must be a contiguous sequence of numbers starting from 1.

itemtobin

$ itemtobin < items.csv > items.bin

itemtocsv

$ itemtocsv < items.bin > items.csv
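
For illustration, a hypothetical items.csv with contiguous item_ids, ordered by areaperil_id and vulnerability_id ascending, and its conversion:

$ cat items.csv
item_id,coverage_id,areaperil_id,vulnerability_id,group_id
1,1,4545,645,1
2,1,4545,646,1
3,2,4546,645,2
$ itemtobin < items.csv > items.bin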

Return to top

gul summary xref

The gulsummaryxref binary is a cross reference file which determines how item or coverage losses from gulcalc output are summed together at various summary levels in summarycalc. It is required by summarycalc and must have the following location and filename;

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| item_id / coverage_id | int | 4 | Identifier of the item or coverage | 3 |
| summary_id | int | 4 | Identifier of the summary level grouping | 3 |
| summaryset_id | int | 4 | Identifier of the summary set | 1 |

One summary set consists of a common summaryset_id, with each item_id being assigned a summary_id. An example is as follows.

| item_id | summary_id | summaryset_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 4 | 2 | 1 |
| 5 | 2 | 1 |
| 6 | 2 | 1 |

This shows, for summaryset_id=1, items 1-3 being grouped into summary_id = 1 and items 4-6 being grouped into summary_id = 2. This could be a 'site' level grouping, for example. The summary_ids should be held in a dictionary which contains the descriptions of the ids, to give meaning to the output results. For instance;

| summary_id | summaryset_id | summary_desc |
|---|---|---|
| 1 | 1 | site_435 |
| 2 | 1 | site_958 |

This cross reference information is not required in ktools.

Up to 10 summary sets may be provided in gulsummaryxref, depending on the required summary reporting levels for the analysis. Here is an example of the 'site' summary level with summaryset_id=1, plus an 'account' summary level with summaryset_id=2. In summary set 2, the account summary level includes both sites because all items are assigned a summary_id of 1.

| item_id | summary_id | summaryset_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 4 | 2 | 1 |
| 5 | 2 | 1 |
| 6 | 2 | 1 |
| 1 | 1 | 2 |
| 2 | 1 | 2 |
| 3 | 1 | 2 |
| 4 | 1 | 2 |
| 5 | 1 | 2 |
| 6 | 1 | 2 |

gulsummaryxreftobin

$ gulsummaryxreftobin < gulsummaryxref.csv > gulsummaryxref.bin

gulsummaryxreftocsv

$ gulsummaryxreftocsv < gulsummaryxref.bin > gulsummaryxref.csv
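
For illustration, the two summary sets above written as gulsummaryxref.csv (using item_id as the first column), and its conversion:

$ cat gulsummaryxref.csv
item_id,summary_id,summaryset_id
1,1,1
2,1,1
3,1,1
4,2,1
5,2,1
6,2,1
1,1,2
2,1,2
3,1,2
4,1,2
5,1,2
6,1,2
$ gulsummaryxreftobin < gulsummaryxref.csv > gulsummaryxref.bin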

Return to top

fm programme

The fm programme binary file contains the level hierarchy and defines the aggregations of losses required to perform a loss calculation. It is required for fmcalc only.

This must have the following location and filename;

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| from_agg_id | int | 4 | Oasis Financial Module from_agg_id | 1 |
| level_id | int | 4 | Oasis Financial Module level_id | 1 |
| to_agg_id | int | 4 | Oasis Financial Module to_agg_id | 1 |

fmprogrammetobin

$ fmprogrammetobin < fm_programme.csv > fm_programme.bin

fmprogrammetocsv

$ fmprogrammetocsv < fm_programme.bin > fm_programme.csv
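
For illustration, the two-level programme used in the Financial Module worked example later in this document, written as fm_programme.csv and converted:

$ cat fm_programme.csv
from_agg_id,level_id,to_agg_id
1,1,1
2,1,1
3,1,1
4,1,2
1,2,1
2,2,1
$ fmprogrammetobin < fm_programme.csv > fm_programme.bin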

Return to top

fm profile

The fm profile binary file contains the list of calculation rules, with profile values, for the policytc_ids that appear in the fm policytc file. This is required for fmcalc only.

There are two versions of this file and either one or the other can be used at a time.

They must be in the following location with filename formats;

File format

The csv file should contain the following fields and include a header row.

fm_profile

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| policytc_id | int | 4 | Primary key | 34 |
| calcrule_id | int | 4 | The calculation rule that applies to the terms | 12 |
| deductible_1 | float | 4 | First deductible | 0.03 |
| deductible_2 | float | 4 | Second deductible | 50000 |
| deductible_3 | float | 4 | Third deductible | 100000 |
| attachment_1 | float | 4 | Attachment point, or excess | 1000000 |
| limit_1 | float | 4 | Limit | 5000000 |
| share_1 | float | 4 | First proportional share | 0.8 |
| share_2 | float | 4 | Second proportional share | 0.25 |
| share_3 | float | 4 | Third proportional share | 1 |

fm_profile_step

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| policytc_id | int | 4 | Primary key | 34 |
| calcrule_id | int | 4 | The calculation rule that applies to the terms | 12 |
| deductible_1 | float | 4 | First deductible | 0.03 |
| deductible_2 | float | 4 | Second deductible | 50000 |
| deductible_3 | float | 4 | Third deductible | 100000 |
| attachment_1 | float | 4 | Attachment point, or excess | 1000000 |
| limit_1 | float | 4 | First limit | 5000000 |
| share_1 | float | 4 | First proportional share | 0.8 |
| share_2 | float | 4 | Second proportional share | 0.25 |
| share_3 | float | 4 | Third proportional share | 1 |
| step_id | int | 4 | Step number | 1 |
| trigger_start | float | 4 | Start trigger for payout | 0.05 |
| trigger_end | float | 4 | End trigger for payout | 0.15 |
| payout_start | float | 4 | Start payout | 100 |
| payout_end | float | 4 | End payout | 200 |
| limit_2 | float | 4 | Second limit | 3000000 |
| scale_1 | float | 4 | Scaling (inflation) factor 1 | 0.03 |
| scale_2 | float | 4 | Scaling (inflation) factor 2 | 0.2 |

fmprofiletobin

$ fmprofiletobin < fm_profile.csv > fm_profile.bin
$ fmprofiletobin -S < fm_profile_step.csv > fm_profile_step.bin

fmprofiletocsv

$ fmprofiletocsv < fm_profile.bin > fm_profile.csv
$ fmprofiletocsv -S < fm_profile_step.bin > fm_profile_step.csv
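
For illustration only, a single fm_profile.csv row built from the example values in the field table above (the csv column names are assumed to match the field names listed; the combination of values is not intended as a meaningful policy definition):

$ cat fm_profile.csv
policytc_id,calcrule_id,deductible_1,deductible_2,deductible_3,attachment_1,limit_1,share_1,share_2,share_3
34,12,0.03,50000,100000,1000000,5000000,0.8,0.25,1
$ fmprofiletobin < fm_profile.csv > fm_profile.bin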

Return to top

fm policytc

The fm policytc binary file contains the cross reference between the aggregations of losses defined in the fm programme file at a particular level and the calculation rule that should be applied, as defined in the fm profile file. This file is required for fmcalc only.

This must have the following location and filename;

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| layer_id | int | 4 | Oasis Financial Module layer_id | 1 |
| level_id | int | 4 | Oasis Financial Module level_id | 1 |
| agg_id | int | 4 | Oasis Financial Module agg_id | 1 |
| policytc_id | int | 4 | Oasis Financial Module policytc_id | 1 |

fmpolicytctobin

$ fmpolicytctobin < fm_policytc.csv > fm_policytc.bin

fmpolicytctocsv

$ fmpolicytctocsv < fm_policytc.bin > fm_policytc.csv
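
For illustration, the policytc assignments from the Financial Module worked example later in this document, written as fm_policytc.csv:

$ cat fm_policytc.csv
layer_id,level_id,agg_id,policytc_id
1,1,1,1
1,1,2,2
1,2,1,3
$ fmpolicytctobin < fm_policytc.csv > fm_policytc.bin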

Return to top

fm summary xref

The fm summary xref binary is a cross reference file which determines how losses from fmcalc output are summed together at various summary levels by summarycalc. It is required by summarycalc and must have the following location and filename;

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| output_id | int | 4 | Identifier of the output group of losses | 3 |
| summary_id | int | 4 | Identifier of the summary level group for one or more output losses | 1 |
| summaryset_id | int | 4 | Identifier of the summary set (0 to 9 inclusive) | 1 |

One summary set consists of a common summaryset_id, with each output_id being assigned a summary_id. An example is as follows.

| output_id | summary_id | summaryset_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 1 |

This shows, for summaryset_id=1, output_id=1 being assigned summary_id = 1 and output_id=2 being assigned summary_id = 2.

If the output_id represents a policy level loss output from fmcalc (the meaning of output_id is defined in the fm xref file), then no further grouping is performed by summarycalc and this is an example of a 'policy' summary level grouping.

Up to 10 summary sets may be provided in this file, depending on the required summary reporting levels for the analysis. Here is an example of the 'policy' summary level with summaryset_id=1, plus an 'account' summary level with summaryset_id=2. In summary set 2, the 'account' summary level includes both policies because both output_ids are assigned a summary_id of 1.

| output_id | summary_id | summaryset_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 1 | 1 | 2 |
| 2 | 1 | 2 |

If a more detailed summary level than policy is required for insured losses, then the user should specify in the fm profile file to back-allocate fmcalc losses to items. Then the output_id represents policy losses back-allocated to item, and in the fmsummaryxref file these can be grouped into any summary level, such as site, zipcode, line of business or region, for example. The user needs to define the output_ids in the fm xref file and group them into meaningful summary levels in the fm summary xref file, hence these two files must be consistent with respect to the meaning of output_id.

fmsummaryxreftobin

$ fmsummaryxreftobin < fmsummaryxref.csv > fmsummaryxref.bin

fmsummaryxreftocsv

$ fmsummaryxreftocsv < fmsummaryxref.bin > fmsummaryxref.csv
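
For illustration, the two summary sets above written as fmsummaryxref.csv, and its conversion:

$ cat fmsummaryxref.csv
output_id,summary_id,summaryset_id
1,1,1
2,2,1
1,1,2
2,1,2
$ fmsummaryxreftobin < fmsummaryxref.csv > fmsummaryxref.bin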

Return to top

fm xref

The fm xref binary file contains cross reference data specifying each output_id in the fmcalc output as a combination of agg_id and layer_id, and is required by fmcalc.

This must be in the following location with filename format;

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| output_id | int | 4 | Identifier of the output group of losses | 1 |
| agg_id | int | 4 | Identifier of the agg_id to output | 1 |
| layer_id | int | 4 | Identifier of the layer_id to output | 1 |

The data should not contain any nulls.

The output_id represents the summary level at which losses are output from fmcalc, as specified by the user.

There are two cases;

For example, say there are two policy layers (with layer_ids 1 and 2) which apply to the sum of losses from 4 items (the summary level represented by agg_id=1). Without back-allocation, the policy summary level of losses can be represented as two output_ids as follows;

| output_id | agg_id | layer_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 1 | 2 |

If the user wants to back-allocate policy losses to the items and output the losses by item and policy, then the item-policy summary level of losses would be represented by 8 output_ids, as follows;

| output_id | agg_id | layer_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 4 | 4 | 1 |
| 5 | 1 | 2 |
| 6 | 2 | 2 |
| 7 | 3 | 2 |
| 8 | 4 | 2 |

The fm summary xref file must be consistent with respect to the meaning of output_id in the fm xref file.

fmxreftobin

$ fmxreftobin < fm_xref.csv > fm_xref.bin

fmxreftocsv

$ fmxreftocsv < fm_xref.bin > fm_xref.csv
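
For illustration, the two-output (no back-allocation) case above written as fm_xref.csv:

$ cat fm_xref.csv
output_id,agg_id,layer_id
1,1,1
2,1,2
$ fmxreftobin < fm_xref.csv > fm_xref.bin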

Return to top

occurrence

The occurrence file is required for certain output components which, in the reference model, are leccalc, pltcalc and aalcalc. In general, some form of event occurrence file is required for any output which involves the calculation of loss metrics over a period of time. The occurrence file assigns occurrences of the event_ids to numbered periods. A period can represent any length of time, such as a year, or 2 years for instance. Output metrics such as mean, standard deviation or loss exceedance probabilities are with respect to the chosen period length. Most commonly in catastrophe modelling, the period of interest is a year.

The occurrence file also includes date fields.

File format

The csv file should contain the following fields and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| event_id | int | 4 | The occurrence event_id | 45567 |
| period_no | int | 4 | A numbered period in which the event occurs | 56876 |
| occ_year | int | 4 | The year number of the event occurrence | 56876 |
| occ_month | int | 4 | The month of the event occurrence | 5 |
| occ_day | int | 4 | The day of the event occurrence | 16 |

The occurrence year in this example is a scenario numbered year, which cannot be expressed as a real date in a standard calendar.

In addition, the following fields are optional and should comprise the sixth and seventh columns respectively:

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| occ_hour | int | 4 | The hour of the event occurrence | 13 |
| occ_minute | int | 4 | The minute of the event occurrence | 52 |

The date fields are converted to a single number through an algorithm for efficient storage in the binary file. The data type for this field is an integer when the optional date fields are not included, or a long long integer when these date fields are included. This should not be confused with the deprecated occ_date_id field.

occurrencetobin

A required parameter is -P, the total number of periods of event occurrences. The total number of periods is held in the header of the binary file and used in output calculations.

$ occurrencetobin -P10000 < occurrence.csv > occurrence.bin

If the occ_hour and occ_minute fields are to be included in the binary file, the -H argument should be given. A flag to signify the presence of these fields is set in the header of the binary file, which is read by other ktools components. If these fields do not exist in the csv file, both are assigned the value of 0 when written to the binary file.

$ occurrencetobin -P10000 -H < occurrence.csv > occurrence.bin

occurrencetocsv

$ occurrencetocsv < occurrence.bin > occurrence.csv
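
For illustration, a minimal occurrence.csv with hypothetical values in the five required fields, converted for a model with 10,000 periods:

$ cat occurrence.csv
event_id,period_no,occ_year,occ_month,occ_day
45567,1,1,5,16
45568,2,2,7,2
$ occurrencetobin -P10000 < occurrence.csv > occurrence.bin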

Return to top

return period

The returnperiods binary file is a list of return periods that the user requires to be included in loss exceedance curve (leccalc) results.

This must be in the following location with filename format;

File format

The csv file should contain the following field and include a header.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| return_period | int | 4 | Return period | 250 |

returnperiodtobin

$ returnperiodtobin < returnperiods.csv > returnperiods.bin

returnperiodtocsv

$ returnperiodtocsv < returnperiods.bin > returnperiods.csv

Return to top

periods

The periods binary file is a list of all the periods that are in the model and is optional, for weighting the periods in the calculation. The file is used in the calculation of the loss exceedance curve (leccalc) and aalcalc results.

This must be in the following location with filename format;

File format

The csv file should contain the following fields and include a header.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| period_no | int | 4 | A numbered period in which the event occurs | 4545 |
| weight | float | 4 | Relative weight to P, the maximum period_no | 0.0003 |

All periods must be present in this file (no gaps in period_no from 1 to P).

periodstobin

$ periodstobin < periods.csv > periods.bin

periodstocsv

$ periodstocsv < periods.bin > periods.csv
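
For illustration, a hypothetical 5-period model with equal weights (the weight values are purely illustrative):

$ cat periods.csv
period_no,weight
1,0.2
2,0.2
3,0.2
4,0.2
5,0.2
$ periodstobin < periods.csv > periods.bin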

Return to top


Quantile

The quantile binary file contains a list of user specified quantiles as floats. The data format is as follows. It is optionally used by the Quantile Event/Period Loss tables and must have the following location and filename;

File format

The csv file should contain the following field and include a header row.

| Name | Type | Bytes | Description | Example |
|---|---|---|---|---|
| quantile | float | 4 | Quantile float | 0.1 |

The field must not have null values.

quantiletobin

$ quantiletobin < quantile.csv > quantile.bin

quantiletocsv

$ quantiletocsv < quantile.bin > quantile.csv
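
For illustration, a quantile.csv requesting three quantiles (a hypothetical choice), and its conversion:

$ cat quantile.csv
quantile
0.1
0.5
0.9
$ quantiletobin < quantile.csv > quantile.bin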

Return to top

Go to 4.5 Stream conversion components section

Back to Contents

alt text

5. Financial Module

The Oasis Financial Module is a data-driven process design for calculating the losses on (re)insurance contracts. It has an abstract design in order to cater for the many variations in contract structures and terms. Oasis is designed to be fed data in order to execute calculations, so for the insurance calculations it needs to know the structure, parameters and calculation rules to be used. This data must be provided in the four files used by the Oasis Financial Module: fm_programme, fm_policytc, fm_profile and fm_xref.

This section explains the design of the Financial Module, which has been implemented in the fmcalc component.

In addition, there is a separate github repository, ktest, which is an extended test suite for ktools and contains a library of financial module worked examples provided by Oasis Members, with a full set of input and output files.

Note that other reference tables are referred to below that do not appear explicitly in the kernel, as they are not directly required for calculation. It is expected that a front end system will hold all of the exposure and policy data and generate the above four input files required for the kernel calculation.

Scope

The Financial Module outputs sample-by-sample losses by (re)insurance contract, or by item, which represents the individual coverage subject to economic loss from a particular peril. In the latter case, it is necessary to 'back-allocate' losses when they are calculated at a higher policy level. The Financial Module can output retained loss or ultimate net loss (UNL) perspectives as an option, and at any stage in the calculation.

The output contains anonymous keys representing the (re)insurance policy (agg_id and layer_id) at the chosen output level (output_id) and a loss value. Losses by sample number (idx) and event (event_id) are produced. To make sense of the output, it must be cross-referenced with Oasis dictionaries which contain the meaningful business information.

The Financial Module does not support multi-currency calculations.

Profiles

Profiles are used throughout the Oasis framework and are meta-data definitions with their associated data types and rules. Profiles are used in the Financial Module to perform the elements of the financial calculations used to calculate losses to (re)insurance policies. For anything other than the simplest policy, which might have just a blanket deductible and limit, a profile does not represent a policy structure on its own; rather it is a building block which can be combined with other building blocks to model a particular financial contract. In this way it is possible to model an unlimited range of structures with a limited number of profiles.

The FM Profiles form an extensible library of calculations defined within the fmcalc code that can be invoked by specifying a particular calcrule_id and providing the required data values such as deductible and limit, as described below.

Supported Profiles

See Appendix B FM Profiles for more details.

Design

The Oasis Financial Module is a data-driven process design for calculating the losses on insurance policies. It is an abstract design in order to cater for the many variations and has four basic concepts:

  1. A programme which defines which items are grouped together at which levels for the purpose of providing loss amounts to policy terms and conditions. The programme has a user-definable profile and dictionary called prog which holds optional reference details such as a description and account identifier. The prog table is not required for the calculation and therefore does not appear in the kernel input files.
  2. A policytc profile which provides the parameters of the policy's terms and conditions, such as limit and deductible, and the calculation rules.
  3. A policytc cross-reference file which associates a policy terms and conditions profile to each programme level aggregation.
  4. A xref file which specifies how the output losses are summarized.

The profile not only provides the fields to be used in calculating losses (such as limit and deductible) but also which mathematical calculation (calcrule_id) to apply.

Data requirements

The Financial Module brings together three elements in order to undertake a calculation:

There are many ways an insurance loss can be calculated, with many different terms and conditions. For instance, there may be deductibles applied to each element of coverage (e.g. a buildings damage deductible), some site-specific deductibles or limits, and some overall policy deductibles, limits and share. To undertake the calculation in the correct order and using the correct items (and their values), the structure and sequence of calculations must be defined. This is done in the programme file, which defines a hierarchy of groups across a number of levels. Levels drive the sequence of calculation. A financial calculation is performed at successive levels, depending on the structure of policy terms and conditions. For example, there might be 3 levels representing coverage, site and policy terms and conditions.

Figure 1. Example 3-level programme hierarchy

alt text

Groups are defined within levels and they represent aggregations of losses on which to perform the financial calculations. The grouping fields are called from_agg_id and to_agg_id, which represent a grouping of losses at the previous level and the present level of the hierarchy, respectively.

Each level calculation applies to the to_agg_id groupings in the hierarchy. There is no calculation applied to the from_agg_id groupings at level 1 - these ids directly correspond to the ids in the loss input.

Figure 2. Example level 2 grouping

alt text

Loss values

The initial input is the ground up loss (GUL) table, generally coming from the main Oasis calculation of ground up losses. Here is an example, for two events and one sample (idx=1):

| event_id | item_id | sidx | loss |
|---|---|---|---|
| 1 | 1 | 1 | 100,000 |
| 1 | 2 | 1 | 10,000 |
| 1 | 3 | 1 | 2,500 |
| 1 | 4 | 1 | 400 |
| 2 | 1 | 1 | 90,000 |
| 2 | 2 | 1 | 15,000 |
| 2 | 3 | 1 | 3,000 |
| 2 | 4 | 1 | 500 |

The values represent a single ground up loss sample for items belonging to an account. We use "programme" rather than "account" as it is a more general characteristic of a client's exposure protection needs and allows a client to have multiple programmes active for a given period. The linkage between account and programme can be provided by a user defined prog dictionary, for example;

| prog_id | account_id | prog_name |
|---|---|---|
| 1 | 1 | ABC Insurance Co. 2016 renewal |

Items 1-4 represent Structure, Other Structure, Contents and Time Element coverage ground up losses for a single property, respectively, and this example is a simple residential policy with combined property coverage terms. For this policy type, the Structure, Other Structure and Contents losses are aggregated, and a deductible and limit are applied to the total. A separate set of terms, again a simple deductible and limit, is applied to the "Time Element" coverage which, for residential policies, generally means costs for temporary accommodation. The total insured loss is the sum of the output from the combined property terms and the time element terms.

Programme

The actual items falling into the programme are specified in the programme table, together with the aggregation groupings that go into a given level calculation:

| from_agg_id | level_id | to_agg_id |
|---|---|---|
| 1 | 1 | 1 |
| 2 | 1 | 1 |
| 3 | 1 | 1 |
| 4 | 1 | 2 |
| 1 | 2 | 1 |
| 2 | 2 | 1 |

Note that from_agg_id for level_id=1 is equal to the item_id in the input loss table (but in theory from_agg_id could represent a higher level of grouping, if required).

In level 1, items 1, 2 and 3 all have to_agg_id = 1, so their losses will be summed together before applying the combined deductible and limit, but item 4 (time element) will be treated separately (not aggregated) as it has to_agg_id = 2. For level 2 we have all 4 items' losses (now represented by the two groups from_agg_id = 1 and 2 from the previous level) aggregated together, as they have the same to_agg_id = 1.

Profile

Next we have the profile description table, which lists the profiles representing general policy types. Our example is represented by two general profiles which specify the input fields and mathematical operations to perform. In this example, the profile for the combined coverages and for time is the same (albeit with different values) and requires a limit, a deductible and an associated calculation rule, whereas the profile for the policy requires a limit, attachment and share, and an associated calculation rule.

| Profile description | calcrule_id |
|---|---|
| deductible and limit | 1 |
| deductible and/or attachment, limit and share | 2 |

There is a "profile value" table for each profile containing the applicable policy terms, each identified by a policytc_id. The table below shows the list of policy terms for calcrule_id 1.

| policytc_id | deductible1 | limit1 |
|---|---|---|
| 1 | 1,000 | 1,000,000 |
| 2 | 2,000 | 18,000 |

And next, for calcrule_id 2, the values for the overall policy attachment, limit and share;

| policytc_id | deductible1 | attachment1 | limit1 | share1 |
|---|---|---|---|---|
| 3 | 0 | 1,000 | 1,000,000 | 0.1 |

In practice, all profile values are stored in a single flattened format which contains all supported profile fields (see fm profile in 4.4 Data Conversion Components), but conceptually they belong in separate profile value tables.

The flattened file is;

fm_profile

| policytc_id | calcrule_id | deductible1 | deductible2 | deductible3 | attachment1 | limit1 | share1 | share2 | share3 |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1,000 | 0 | 0 | 0 | 1,000,000 | 0 | 0 | 0 |
| 2 | 1 | 2,000 | 0 | 0 | 0 | 18,000 | 0 | 0 | 0 |
| 3 | 2 | 0 | 0 | 0 | 1,000 | 1,000,000 | 0.1 | 0 | 0 |

For any given profile we have one standard rule, calcrule_id, being the mathematical function used to calculate the losses from the given profile's fields. More information about the functions can be found in FM Profiles.

Policytc

The policytc table specifies the (re)insurance contracts (this is a combination of agg_id and layer_id) and the separate terms and conditions which will be applied to each layer_id/agg_id for a given level. In our example, we have a limit and deductible with the same value applicable to the combination of the first three items, a limit and deductible for the fourth item (time) in level 1, and then a limit, attachment and share applicable at level 2 covering all items. We'd represent this in terms of the distinct agg_ids as follows:

| layer_id | level_id | agg_id | policytc_id |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 1 | 1 | 2 | 2 |
| 1 | 2 | 1 | 3 |

In words, the data in the table mean;

At Level 1;

Apply policytc_id (terms and conditions) 1 to (the sum of losses represented by) agg_id 1

Apply policytc_id 2 to agg_id 2

Then at level 2;

Apply policytc_id 3 to agg_id 1

Levels are processed in ascending order and the calculated losses from a previous level are summed according to the groupings defined in the programme table, which become the input losses to the next level.
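
To make the sequence concrete, here is the level 1 calculation for event 1 of the example above, assuming calcrule_id 1 simply applies the deductible and then caps the result at the limit, i.e. min(max(x - deductible, 0), limit); the exact formula for each calcrule is defined in Appendix B FM Profiles.

agg_id 1 (items 1-3): 100,000 + 10,000 + 2,500 = 112,500; less the 1,000 deductible gives 111,500, which is below the 1,000,000 limit, so the level 1 loss is 111,500.
agg_id 2 (item 4): 400 less the 2,000 deductible gives 0.

These two results are then summed into to_agg_id 1 at level 2, where policytc_id 3 applies. Assuming calcrule_id 2 applies the attachment, limit and share as min(max(x - attachment, 0), limit) x share, the level 2 loss would be min(max(111,500 - 1,000, 0), 1,000,000) x 0.1 = 11,050.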

Layers

Layers can be used to model multiple sets of terms and conditions applied to the same losses, such as excess policies. For the lower level calculations, and in the general case where there is a single contract, layer_id should be set to 1. For a given level_id and agg_id, multiple layers can be defined by setting layer_id = 1, 2, 3, etc., and assigning a different policytc_id to each.

Figure 3. Example of multiple layers

alt text

For this example at level 3, the policytc data might look as follows;

| layer_id | level_id | agg_id | policytc_id |
|---|---|---|---|
| 1 | 3 | 1 | 22 |
| 2 | 3 | 1 | 23 |
22 3 12323

Output and back-allocation

-

Losses are output by event, output_id and sample. The table looks like this;

+

Losses are output by event, output_id and sample. The table looks +like this;

- + - + - + - + - + - +
event_idevent_id output_id sidxlossloss
11 1 1455.24455.24
22 1 1345.6345.6
-

The output_id is specified by the user in the xref table, and is a unique combination of agg_id and layer_id. For instance;

+

The output_id is specified by the user in the xref +table, and is a unique combination of agg_id and layer_id. For +instance;

- + - + - + - + - + - +
output_idoutput_id agg_idlayer_idlayer_id
11 111
22 122
-

The output_id must be specified consistently with the back-allocation rule. Losses can either output at the contract level or back-allocated to the lowest level, which is item_id, using one of three command line options. There are three meaningful values here – don’t allocate (0) used typically for all levels where a breakdown of losses is not required in output, allocate back to items (1) in proportion to the input (ground up) losses, or allocate back to items (2) in proportion to the losses from the prior level calculation.

-
$ fmcalc -a0 # Losses are output at the contract level and not back-allocated +

The output_id must be specified consistently with the back-allocation +rule. Losses can either output at the contract level or back-allocated +to the lowest level, which is item_id, using one of three command line +options. There are three meaningful values here – don’t allocate (0) +used typically for all levels where a breakdown of losses is not +required in output, allocate back to items (1) in proportion to the +input (ground up) losses, or allocate back to items (2) in proportion to +the losses from the prior level calculation.

+
$ fmcalc -a0 # Losses are output at the contract level and not back-allocated
 $ fmcalc -a1 # Losses are back-allocated to items on the basis of the input losses (e.g. ground up loss)
-$ fmcalc -a2 # Losses are back-allocated to items on the basis of the prior level losses
-
-

The rules for specifying the output_ids in the xref table are as follows;

+$ fmcalc -a2 # Losses are back-allocated to items on the basis of the prior level losses
+

The rules for specifying the output_ids in the xref table are as +follows;

-

To make sense of this, if there is more than one output at the contract level, then each one must be back-allocated to all of the items, with each individual loss represented by a unique output_id.

-

To avoid unnecessary computation, it is recommended not to back-allocate unless losses are required to be reported at a more detailed level than the contract level (site or zip code, for example). In this case, losses are re-aggregated up from item level (represented by output_id in fmcalc output) in summarycalc, using the fmsummaryxref table.

+

To make sense of this, if there is more than one output at the +contract level, then each one must be back-allocated to all of the +items, with each individual loss represented by a unique output_id.

+

To avoid unnecessary computation, it is recommended not to +back-allocate unless losses are required to be reported at a more +detailed level than the contract level (site or zip code, for example). +In this case, losses are re-aggregated up from item level (represented +by output_id in fmcalc output) in summarycalc, using the fmsummaryxref +table.

Reinsurance

The first run of fmcalc is designed to calculate the primary or direct insurance losses from the ground up losses of an exposure portfolio. fmcalc has been designed to be recursive, so that the 'gross' losses from the first run can be streamed back in to second and subsequent runs of fmcalc, each time with a different set of input files representing reinsurance contracts, and each run can output either the reinsurance gross loss or the net loss. There are two modes of output;

Net loss is output when the command line parameter -n is used, otherwise output loss is gross by default.

Supported reinsurance types

The types of reinsurance supported by the Financial Module are;

Required files

-

Second and subsequent runs of fmcalc require the same four fm files fm_programme, fm_policytc, fm_profile, and fm_xref.

-

This time, the hierarchy specified in fm_programme must be consistent with the range of output_ids from the incoming stream of losses, as specified in the fm_xref file from the previous iteration. Specifically, this means the range of values in from_agg_id at level 1 must match the range of values in output_id.

+

Second and subsequent runs of fmcalc require the same four fm files +fm_programme, fm_policytc, fm_profile, and fm_xref.

+

This time, the hierarchy specified in fm_programme must be consistent +with the range of output_ids from the incoming stream of losses, as +specified in the fm_xref file from the previous iteration. Specifically, +this means the range of values in from_agg_id at level 1 must match the +range of values in output_id.

For example;

fm_xref (iteration 1)

output_id | agg_id | layer_id
1 | 1 | 1
2 | 1 | 2
@@ -835,84 +837,97 @@

Required files

from_agg_id | level_id | to_agg_id
1 | 1 | 1
2 | 1 | 2
1 | 2 | 1
2 | 2 | 1
-

The abstraction of from_agg_id at level 1 from item_id means that losses needn't be back-allocated to item_id after every iteration of fmcalc. In fact, performance will be improved when back-allocation is minimised.

-

Example - Quota share reinsurance

-

Using the two layer example from above, here's an example of the fm files for a simple quota share treaty with 50% ceded and 90% placed covering both policy layers.

-

The command to run the direct insurance followed by reinsurance might look like this;

-
$ fmcalc -p direct < guls.bin | fmcalc -p ri1 -n > ri1_net.bin -
-

In this command, ground up losses are being streamed into fmcalc to calculate the insurance losses, which are streamed into fmcalc again to calculate the reinsurance net loss. The direct insurance fm files would be located in the folder 'direct' and the reinsurance fm files in the folder 'ri1'. The -n flag in the second call of fmcalc results in net losses being output to the file 'ri1_net.bin'. These are the losses to the insurer net of recoveries from the quota share treaty.

+

The abstraction of from_agg_id at level 1 from item_id means that +losses needn't be back-allocated to item_id after every iteration of +fmcalc. In fact, performance will be improved when back-allocation is +minimised.
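To illustrate this consistency requirement, here is a minimal sketch in Python (not part of ktools; the csv copies of the fm files and the 'direct' and 'ri1' folder names are assumed for illustration) that checks the output_ids written by one iteration against the level 1 from_agg_ids expected by the next;

import csv

def id_set(path, field, keep=None):
    # Collect the distinct integer values of one field, optionally filtered by a predicate.
    with open(path, newline="") as f:
        return {int(row[field]) for row in csv.DictReader(f) if keep is None or keep(row)}

# output_ids produced by the previous (direct) iteration's fm_xref
outputs = id_set("direct/fm_xref.csv", "output_id")
# level 1 from_agg_ids expected by the next (reinsurance) iteration's fm_programme
level1 = id_set("ri1/fm_programme.csv", "from_agg_id", keep=lambda r: int(r["level_id"]) == 1)

if outputs != level1:
    raise ValueError(f"mismatch: output_ids {sorted(outputs)} vs level 1 from_agg_ids {sorted(level1)}")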

+

Example - Quota share +reinsurance

+

Using the two layer example from above, here's an example of the fm +files for a simple quota share treaty with 50% ceded and 90% placed +covering both policy layers.

+

The command to run the direct insurance followed by reinsurance might +look like this;

+
$ fmcalc -p direct < guls.bin | fmcalc -p ri1 -n > ri1_net.bin
+

In this command, ground up losses are being streamed into fmcalc to +calculate the insurance losses, which are streamed into fmcalc again to +calculate the reinsurance net loss. The direct insurance fm files would +be located in the folder 'direct' and the reinsurance fm files in the +folder 'ri1'. The -n flag in the second call of fmcalc results in net +losses being output to the file 'ri1_net.bin'. These are the losses to +the insurer net of recoveries from the quota share treaty.

The fm_xref file from the direct insurance (first) iteration is

fm_xref

output_id | agg_id | layer_id
1 | 1 | 1
2 | 1 | 2
-

The fm files for the reinsurance (second) iteration would be as follows;

+

The fm files for the reinsurance (second) iteration would be as +follows;

fm_programme

from_agg_id | level_id | to_agg_id
1 | 1 | 1
2 | 1 | 1
@@ -920,26 +935,38 @@

Example - Quota share reinsurance

layer_id | level_id | agg_id | policytc_id
1 | 1 | 1 | 1

fm_profile

- +
++++++++++++ - + @@ -948,12 +975,12 @@

Example - Quota share reinsurance

limit1 - + - + @@ -962,7 +989,7 @@

Example - Quota share reinsurance

0 - +
policytc_id | calcrule_id | deductible1 | deductible2 | share1 | share2 | share3
1 | 25 | 0 | 0 | 0.5 | 0.9 | 1
@@ -970,27 +997,32 @@

Example - Quota share reinsurance

output_id | agg_id | layer_id
1 | 1 | 1

Inuring priority

-

The Financial Module can support unlimited inuring priority levels for reinsurance. Each set of contracts with equal inuring priority would be calculated in one iteration. The net losses from the first inuring priority are streamed into the second inuring priority calculation, and so on.

-

Where there are multiple contracts with equal inuring priority, these are implemented as layers with a single iteration.

+

The Financial Module can support unlimited inuring priority levels +for reinsurance. Each set of contracts with equal inuring priority would +be calculated in one iteration. The net losses from the first inuring +priority are streamed into the second inuring priority calculation, and +so on.

+

Where there are multiple contracts with equal inuring priority, these +are implemented as layers with a single iteration.

The net calculation for iterations with multiple layers is;

-

net loss = max(0, input loss - layer1 loss - layer2 loss - ... - layer n loss)

+

net loss = max(0, input loss - layer1 loss - layer2 loss - ... - +layer n loss)
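As a small worked example of this formula (plain Python, with illustrative loss values), an input loss of 1000 and two layer losses of 300 and 450 give a net loss of 250;

# Net loss for an iteration with multiple layers of equal inuring priority:
# net = max(0, input loss - sum of layer losses).
input_loss = 1000.0
layer_losses = [300.0, 450.0]          # losses ceded to layer 1 and layer 2 (assumed values)
net_loss = max(0.0, input_loss - sum(layer_losses))
print(net_loss)                        # 250.0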

Return to top

-

Go to 6. Workflows

-

Back to Contents

- - - +

Go to 6. Workflows

+

Back to Contents

+ + diff --git a/docs/html/Introduction.html b/docs/html/Introduction.html index a7a2d9dd..0682d97f 100644 --- a/docs/html/Introduction.html +++ b/docs/html/Introduction.html @@ -1,400 +1,333 @@ - - - -Introduction.md - - - - - - - - - - - - -

alt text

+ + + + + + + Introduction + + + +

alt text

1. Introduction

-

The in-memory solution for the Oasis Kernel is called the kernel tools or “ktools”. ktools is an independent “specification” of a set of processes which means that it defines the processing architecture and data structures. The framework is implemented as a set of components called the “reference model” which can then be adapted for particular model or business needs.

-

The code can be compiled in Linux, POSIX-compliant Windows and native Windows. The installation instructions can be found in README.md.

+

The in-memory solution for the Oasis Kernel is called the kernel +tools or “ktools”. ktools is an independent “specification” of a set of +processes which means that it defines the processing architecture and +data structures. The framework is implemented as a set of components +called the “reference model” which can then be adapted for particular +model or business needs.

+

The code can be compiled in Linux, POSIX-compliant Windows and native +Windows. The installation instructions can be found in README.html.

Background

-

The Kernel performs the core Oasis calculations of computing effective damageability distributions, Monte-Carlo sampling of ground up loss, the financial module calculations, which apply insurance policy terms and conditions to the sampled losses, and finally some common catastrophe model outputs.

-

The early releases of Oasis used a SQL-compliant database to perform all calculations. Release 1.3 included the first “in-memory” version of the Oasis Kernel written in C++ and C to provide streamed calculation at high computational performance, as an alternative to the database calculation. The scope of the in-memory calculation was for the most intensive Kernel calculations of ground up loss sampling and the financial module. This in-memory variant was first delivered as a stand-alone toolkit "ktools" with R1.4.

-

With R1.5, a Linux-based in-memory calculation back-end was released, using the reference model components of ktools. The range of functionality of ktools was still limited to ground up loss sampling, the financial module and single output workflows, with effective damage distributions and output calculations still being performed in a SQL-compliant database.

-

In 2016 the functionality of ktools was extended to include the full range of Kernel calculations, including effective damageability distribution calculations and a wider range of financial module and output calculations. The data stream structures and input data formats were also substantially revised to handle multi-peril models, user-definable summary levels for outputs, and multiple output workflows.

-

In 2018 the Financial Module was extended to perform net loss calculations for per occurrence forms of reinsurance, including facultative reinsurance, quota share, surplus share, per risk and catastrophe excess of loss treaties.

+

The Kernel performs the core Oasis calculations of computing +effective damageability distributions, Monte-Carlo sampling of ground up +loss, the financial module calculations, which apply insurance policy +terms and conditions to the sampled losses, and finally some common +catastrophe model outputs.

+

The early releases of Oasis used a SQL-compliant database to perform +all calculations. Release 1.3 included the first “in-memory” version of +the Oasis Kernel written in C++ and C to provide streamed calculation at +high computational performance, as an alternative to the database +calculation. The scope of the in-memory calculation was for the most +intensive Kernel calculations of ground up loss sampling and the +financial module. This in-memory variant was first delivered as a +stand-alone toolkit "ktools" with R1.4.

+

With R1.5, a Linux-based in-memory calculation back-end was released, +using the reference model components of ktools. The range of +functionality of ktools was still limited to ground up loss sampling, +the financial module and single output workflows, with effective damage +distributions and output calculations still being performed in a +SQL-compliant database.

+

In 2016 the functionality of ktools was extended to include the full +range of Kernel calculations, including effective damageability +distribution calculations and a wider range of financial module and +output calculations. The data stream structures and input data formats +were also substantially revised to handle multi-peril models, +user-definable summary levels for outputs, and multiple output +workflows.

+

In 2018 the Financial Module was extended to perform net loss +calculations for per occurrence forms of reinsurance, including +facultative reinsurance, quota share, surplus share, per risk and +catastrophe excess of loss treaties.

Architecture

-

The Kernel is provided as a toolkit of components (“ktools”) which can be invoked at the command line. Each component is a separately compiled executable with a binary data stream of inputs and/or outputs.

-

The principle is to stream data through the calculations in memory, starting with generating the damage distributions and ending with calculating the user's required result, before writing the output to disk. This is done on an event-by-event basis, which means at any one time the compute server only has to hold the model data for a single event in its memory, per process. The user can run the calculation across multiple processes in parallel, specifying the analysis workflow and number of processes in a script file appropriate to the operating system.

+

The Kernel is provided as a toolkit of components (“ktools”) which +can be invoked at the command line. Each component is a separately +compiled executable with a binary data stream of inputs and/or +outputs.

+

The principle is to stream data through the calculations in memory, starting with generating the damage distributions and ending with calculating the user's required result, before writing the output to disk. This is done on an event-by-event basis, which means at any one time the compute server only has to hold the model data for a single event in its memory, per process. The user can run the calculation across multiple processes in parallel, specifying the analysis workflow and number of processes in a script file appropriate to the operating system.

Language

-

The components can be written in any language as long as the data structures of the binary streams are adhered to. The current set of components have been written in POSIX-compliant C++. This means that they can be compiled in Linux and Windows using the latest GNU compiler toolchain.

+

The components can be written in any language as long as the data +structures of the binary streams are adhered to. The current set of +components have been written in POSIX-compliant C++. This means that +they can be compiled in Linux and Windows using the latest GNU compiler +toolchain.

Components

-

The components in the Reference Model can be summarized as follows;

+

The components in the Reference Model can be summarized as +follows;

Usage

-

Standard piping syntax can be used to invoke the components at the command line. It is the same syntax in Windows DOS, Linux terminal or Cygwin (a Linux emulator for Windows). For example the following command invokes eve, getmodel, gulcalc, fmcalc, summarycalc and eltcalc, and exports an event loss table output to a csv file.

-
$ eve 1 1 | getmodel | gulcalc -r –S100 -a1 –i - | fmcalc | summarycalc -f -1 - | eltcalc > elt.csv -
-

Example python scripts are provided along with a binary data package in the /examples folder to demonstrate usage of the toolkit. For more guidance on how to use the toolkit, see Workflows.

-

Go to 2. Data streaming architecture overview

-

Back to Contents

- - - +

Standard piping syntax can be used to invoke the components at the +command line. It is the same syntax in Windows DOS, Linux terminal or +Cygwin (a Linux emulator for Windows). For example the following command +invokes eve, getmodel, gulcalc, fmcalc, summarycalc and eltcalc, and +exports an event loss table output to a csv file.

+
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc | summarycalc -f -1 - | eltcalc > elt.csv
+

Example python scripts are provided along with a binary data package +in the /examples folder to demonstrate usage of the toolkit. For more +guidance on how to use the toolkit, see Workflows.

+

Go to 2. Data streaming architecture +overview

+

Back to Contents

+ + diff --git a/docs/html/MultiPeril.html b/docs/html/MultiPeril.html index 1ad7386a..c737f2cb 100644 --- a/docs/html/MultiPeril.html +++ b/docs/html/MultiPeril.html @@ -1,409 +1,254 @@ - - - -MultiPeril.md - - - - - - - - - - - - -

alt text

-

Appendix C: Multi-peril model support

-

ktools now supports multi-peril models through the introduction of the coverage_id in the data structures.

-

Ground up losses apply at the “Item” level in the Kernel which corresponds to “interest coverage” in business terms, which is the element of financial loss that can be associated with a particular asset. In ktools, item_id represents the element of financial loss and coverage_id represents the asset with its associated total insured value. If there is more than one item per coverage (as defined in the items data) then each item represents an element of financial loss from a particular peril contributing to the total loss for the asset. For each item, the identification of the peril is held in the areaperil_id, which is a unique key representing a combination of the location (area) and peril.

-

Multi-peril damage calculation

-

Ground up losses are calculated by multiplying the damage ratio for an item by the total insured value of its associated coverage (defined in the coverages data). The questions are then; how are these losses combined across items, and how are they correlated?

-

There are a few ways in which losses can be combined and the first example in ktools uses a simple rule, which is to sum the losses for each coverage and cap the overall loss to the total insured value. This is what you get when you use the -c parameter in gulcalc to output losses by 'coverage'.

-

In v3.1.0 the method of combining losses became function-driven using the gulcalc command line parameter -a as a few standard approaches have emerged. These are;

+ + + + + + + MultiPeril + + + +

alt text

+

Appendix C: Multi-peril +model support

+

ktools now supports multi-peril models through the introduction of +the coverage_id in the data structures.

+

Ground up losses apply at the “Item” level in the Kernel which +corresponds to “interest coverage” in business terms, which is the +element of financial loss that can be associated with a particular +asset. In ktools, item_id represents the element of financial loss and +coverage_id represents the asset with its associated total insured +value. If there is more than one item per coverage (as defined in the +items data) then each item represents an element of financial loss from +a particular peril contributing to the total loss for the asset. For +each item, the identification of the peril is held in the areaperil_id, +which is a unique key representing a combination of the location (area) +and peril.

+

Multi-peril damage +calculation

+

Ground up losses are calculated by multiplying the damage ratio for +an item by the total insured value of its associated coverage (defined +in the coverages data). The questions are then; how are these losses +combined across items, and how are they correlated?

+

There are a few ways in which losses can be combined and the first +example in ktools uses a simple rule, which is to sum the losses for +each coverage and cap the overall loss to the total insured value. This +is what you get when you use the -c parameter in gulcalc to output +losses by 'coverage'.

+

In v3.1.0 the method of combining losses became function-driven using +the gulcalc command line parameter -a as a few standard approaches have +emerged. These are;

Allocation option | Description
0 | Do nothing (suitable for single sub-peril models with one item per coverage)
1 | Sum damage ratios and cap to 1. Back-allocate in proportion to contributing subperil loss
2 | Total damage = maximum subperil damage. Back-allocate all to the maximum contributing subperil loss
3 | Multiplicative method for combining damage. Back-allocate in proportion to contributing subperil loss
-

Allocation option 1 has been implemented in v3.1.0.

-

Correlation of item damage is generic in ktools, as damage can either be 100% correlated or independent (see Appendix A Random Numbers). This is no different in the multi-peril case when items represent different elements of financial loss to the same asset, rather than different assets. More sophisticated methods of multi-peril correlation have been implemented for particular models, but as yet no standard approach has been implemented in ktools.

-

Note that ground up losses by item can be passed into the financial module unallocated (allocation method 0) using the gulcalc option -i, or allocated using the gulcalc option -a1 -i. If the item ground up losses are passed through unallocated then the limit of total insured value must be applied as part of the financial module calculations, to prevent the ground up loss exceeding the coverage TIV.

+

Allocation options 0,1,2 have been implemented to date.
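The following is an illustrative sketch of allocation option 1 for a single coverage, written in plain Python rather than taken from the ktools source; the damage ratios, sub-peril names and TIV are assumed values.

# Allocation option 1 (illustrative): sum the sub-peril damage ratios, cap the
# combined ratio at 1, then back-allocate the combined ground up loss in
# proportion to each contributing sub-peril loss.
tiv = 100000.0                                  # coverage total insured value (assumed)
damage_ratios = {"wind": 0.7, "surge": 0.5}     # hypothetical sub-peril items for this coverage

combined = min(1.0, sum(damage_ratios.values()))
coverage_loss = combined * tiv                  # capped at the coverage TIV by construction

item_losses = {p: r * tiv for p, r in damage_ratios.items()}
total_item_loss = sum(item_losses.values())
allocated = {p: coverage_loss * l / total_item_loss for p, l in item_losses.items()}
print(coverage_loss, allocated)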

+

Correlation of item damage is generic in ktools, as damage can either +be 100% correlated or independent (see Appendix A Random Numbers). This is no +different in the multi-peril case when items represent different +elements of financial loss to the same asset, rather than different +assets. More sophisticated methods of multi-peril correlation have been +implemented for particular models, but as yet no standard approach has +been implemented in ktools.

+

Note that ground up losses by item can be passed into the financial module unallocated (allocation method 0) using the gulcalc option -i, or allocated using the gulcalc option -a1 -i. If the item ground up losses are passed through unallocated then the limit of total insured value must be applied as part of the financial module calculations, to prevent the ground up loss exceeding the coverage TIV.

Return to top

-

Back to Contents

- - - +

Back to Contents

+ + diff --git a/docs/html/ORDOutputComponents.html b/docs/html/ORDOutputComponents.html new file mode 100644 index 00000000..ba380332 --- /dev/null +++ b/docs/html/ORDOutputComponents.html @@ -0,0 +1,1390 @@ + + + + + + + ORDOutputComponents + + + +

4.3 ORD Output Components +

+

As well as the set of legacy outputs described in +OutputComponents.html, ktools also supports Open Results Data "ORD" output +calculations and reports.

+

Open Results Data is a data standard for catastrophe loss model +results developed as part of Open Data Standards "ODS". ODS is curated +by OasisLMF and governed by the Open Data Standards Steering Committee +(SC), comprised of industry experts representing (re)insurers, brokers, +service providers and catastrophe model vendors. More information about +ODS can be found here.

+

ktools supports a subset of the fields in each of the ORD reports, which are given in more detail below. In most cases, the existing components for legacy outputs are used to generate ORD format outputs when called with extra command line switches, although there is a dedicated component called ordleccalc to generate all of the EPT reports. In overview, here are the mappings from component to ORD report:

+ +

+

summarycalctocsv

+
+

Summarycalctocsv takes the summarycalc loss stream, which contains +the individual loss samples by event and summary_id, and outputs them in +ORD format. Summarycalc is a core component that aggregates the +individual building or coverage loss samples into groups that are of +interest from a reporting perspective. This is covered in Core Components

+
Parameters
+ +
Usage
+
$ [stdin component] | summarycalctocsv [parameters] > selt.csv
+$ summarycalctocsv [parameters] > selt.csv < [stdin].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | summarycalctocsv -o > selt.csv
+$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | summarycalctocsv -p selt.parquet
+$ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | summarycalc -i -1 - | summarycalctocsv -p selt.parquet -o > selt.csv
+$ summarycalctocsv -o > selt.csv < summarycalc.bin
+$ summarycalctocsv -p selt.parquet < summarycalc.bin
+$ summarycalctocsv -p selt.parquet -o > selt.csv < summarycalc.bin
+
Internal data
+

None.

+
Output
+

The Sample ELT output is a csv file with the following fields;

Name | Type | Bytes | Description | Example
EventId | int | 4 | Model event_id | 45567
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
SampleId | int | 4 | The sample number | 2
Loss | float | 4 | The loss sample | 13645.78
ImpactedExposure | float | 4 | Exposure value impacted by the event for the sample | 70000
+

eltcalc

+
+

The program calculates loss by SummaryId and EventId. There are two variants (in addition to the sample variant SELT output by summarycalctocsv, above);

+ +
Parameters
+ +
Usage
+
$ [stdin component] | eltcalc -M [filename.csv] -Q [filename.csv] -m [filename.parquet] -q [filename.parquet]
+$ eltcalc  -M [filename.csv] -Q [filename.csv] -m [filename.parquet] -q [filename.parquet] < [stdin].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc -M MELT.csv -Q QELT.csv
+$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc -m MELT.parquet -q QELT.parquet
+$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc -M MELT.csv -Q QELT.csv -m MELT.parquet -q QELT.parquet
+$ eltcalc  -M MELT.csv -Q QELT.csv < summarycalc.bin
+$ eltcalc  -m MELT.parquet -q QELT.parquet < summarycalc.bin
+$ eltcalc  -M MELT.csv -Q QELT.csv -m MELT.parquet -q QELT.parquet < summarycalc.bin
+
Internal data
+

The Quantile report requires the quantile.bin file

+ +
Calculation
+
MELT
+

For each SummaryId and EventId, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a 'SampleType' field. Variations of the exposure value are also output (see below for details).
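As an illustration of the SampleType 2 record for a single EventId and SummaryId, here is a minimal Python sketch with assumed sample losses; it is illustrative only, not the ktools implementation.

# Sample mean and standard deviation for one (EventId, SummaryId) group,
# computed from the sampled losses (one value per SampleId).
import statistics

sample_losses = [1200.0, 0.0, 980.5, 2310.0, 0.0]   # hypothetical losses by SampleId

mean_loss = statistics.mean(sample_losses)
sd_loss = statistics.stdev(sample_losses)            # sample (n-1) standard deviation
print(mean_loss, sd_loss)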

+
QELT
+

For each SummaryId and EventId, this report provides the probability +and the corresponding loss quantile computed from the samples. The list +of probabilities is provided as input in the quantile.bin file.

+

Quantiles are cut points dividing the range of a probability +distribution into continuous intervals with equal probabilities, or +dividing the observations in a sample set in the same way. In this case +we are computing the quantiles of loss from the sampled losses by event +and summary for a user-provided list of probabilities. For each provided +probability p, the loss quantile is the sampled loss which is bigger +than the proportion p of the observed samples.

+

In practice this is calculated by sorting the samples in ascending +order of loss and using linear interpolation between the ordered +observations to compute the precise loss quantile for the required +probability.

+

The algorithm used for the quantile estimate type and interpolation +scheme from a finite sample set is R-7 referred to in Wikipedia https://en.wikipedia.org/wiki/Quantile

+

If p is the probability, and the sample size is N, then the position +of the ordered samples required for the quantile is computed by;

+

(N-1)p + 1

+

In general, this value will be a fraction rather than an integer, +representing a value in between two ordered samples. Therefore for an +integer value of k between 1 and N-1 with k < (N-1)p + 1 < k+1 , +the loss quantile Q(p) is calculated by a linear interpolation of the +kth ordered sample X(k) and the k+1 th ordered sample X(k+1) as +follows;

+

Q(p) = X(k) * (1-h) + X(k+1) * h

+

where h = (N-1)p + 1 - k
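The interpolation can be written directly from the formulas above. The following Python sketch (illustrative, not the ktools source) computes the R-7 loss quantile for an assumed set of sampled losses;

# R-7 quantile estimate: position = (N-1)p + 1 on the ascending ordered samples,
# with linear interpolation between the two neighbouring order statistics.
def quantile_r7(samples, p):
    x = sorted(samples)                 # X(1) <= X(2) <= ... <= X(N)
    n = len(x)
    pos = (n - 1) * p + 1               # 1-based position in the ordered samples
    k = int(pos)                        # lower order statistic index
    h = pos - k                         # fractional part used for interpolation
    if k >= n:                          # p = 1 falls exactly on the largest sample
        return x[-1]
    return x[k - 1] * (1 - h) + x[k] * h

losses = [100.0, 250.0, 400.0, 900.0]   # hypothetical sampled losses for one event and summary
print(quantile_r7(losses, 0.9))         # loss quantile at p = 0.9 -> 750.0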

+
Output
+

The Moment ELT output is a csv file with the following fields;

Name | Type | Bytes | Description | Example
EventId | int | 4 | Model event_id | 45567
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
SampleType | int | 4 | 1 for analytical mean, 2 for sample mean | 2
EventRate | float | 4 | Annual frequency of event computed by relative frequency of occurrence | 0.01
ChanceOfLoss | float | 4 | Probability of a loss calculated from the effective damage distributions | 0.95
MeanLoss | float | 4 | Mean | 1345.678
SDLoss | float | 4 | Sample standard deviation for SampleType=2 | 945.89
MaxLoss | float | 4 | Maximum possible loss calculated from the effective damage distribution | 75000
FootprintExposure | float | 4 | Exposure value impacted by the model's event footprint | 80000
MeanImpactedExposure | float | 4 | Mean exposure impacted by the event across the samples (where loss > 0) | 65000
MaxImpactedExposure | float | 4 | Maximum exposure impacted by the event across the samples (where loss > 0) | 70000
+

The Quantile ELT output is a csv file with the following fields;

Name | Type | Bytes | Description | Example
EventId | int | 4 | Model event_id | 45567
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
Quantile | float | 4 | The probability associated with the loss quantile | 0.9
Loss | float | 4 | The loss quantile | 1345.678
+

Return to top

+

pltcalc

+
+

The program calculates loss by Period, EventId and SummaryId and +outputs the results in ORD format. There are three variants;

+ +
Parameters
+ +
Usage
+
$ [stdin component] | pltcalc -S [filename.csv] -M [filename.csv] -Q [filename.csv] -s [filename.parquet] -m [filename.parquet] -q [filename.parquet]
+$ pltcalc -S [filename.csv] -M [filename.csv] -Q [filename.csv] -s [filename.parquet] -m [filename.parquet] -q [filename.parquet] < [stdin].bin
+
Example
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv
+$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet
+$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet
+$ pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv < summarycalc.bin
+$ pltcalc -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet < summarycalc.bin
+$ pltcalc -S SPLT.csv -M MPLT.csv -Q QPLT.csv -s SPLT.parquet -m MPLT.parquet -q QPLT.parquet < summarycalc.bin
+
Internal data
+

pltcalc requires the occurrence.bin file

+ +

The Quantile report additionally requires the quantile.bin file

+ +

pltcalc will optionally use the following file if present

+ +
Calculation
+
SPLT
+

For each Period, EventId and SummaryId, the individual loss samples +are output by SampleId. The sampled event losses from the summarycalc +stream are assigned to a Period for each occurrence of the EventId in +the occurrence file.

+
MPLT
+

For each Period, EventId and SummaryId, the sample mean and standard deviation is calculated from the sampled event losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a 'SampleType' field. Variations of the exposure value are also output (see below for more details).

+
QPLT
+

For each Period, EventId and SummaryId, this report provides the +probability and the corresponding loss quantile computed from the +samples. The list of probabilities is provided in the quantile.bin +file.

+

See QELT for the method of computing the loss quantiles.

+
Output
+

The Sample PLT output is a csv file with the following fields;

Name | Type | Bytes | Description | Example
Period | int | 4 | The period in which the event occurs | 500
PeriodWeight | float | 4 | The weight of the period (frequency relative to the total number of periods) | 0.001
EventId | int | 4 | Model event_id | 45567
Year | int | 4 | The year in which the event occurs | 1970
Month | int | 4 | The month number in which the event occurs | 5
Day | int | 4 | The day number in which the event occurs | 22
Hour | int | 4 | The hour in which the event occurs | 11
Minute | int | 4 | The minute in which the event occurs | 45
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
SampleId | int | 4 | The sample number | 2
Loss | float | 4 | The loss sample | 13645.78
ImpactedExposure | float | 4 | Exposure impacted by the event for the sample | 70000
+

The Moment PLT output is a csv file with the following fields;

Name | Type | Bytes | Description | Example
Period | int | 4 | The period in which the event occurs | 500
PeriodWeight | float | 4 | The weight of the period (frequency relative to the total number of periods) | 0.001
EventId | int | 4 | Model event_id | 45567
Year | int | 4 | The year in which the event occurs | 1970
Month | int | 4 | The month number in which the event occurs | 5
Day | int | 4 | The day number in which the event occurs | 22
Hour | int | 4 | The hour in which the event occurs | 11
Minute | int | 4 | The minute in which the event occurs | 45
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
SampleType | int | 4 | 1 for analytical mean, 2 for sample mean | 2
ChanceOfLoss | float | 4 | Probability of a loss calculated from the effective damage distributions | 0.95
MeanLoss | float | 4 | Mean | 1345.678
SDLoss | float | 4 | Sample standard deviation for SampleType=2 | 945.89
MaxLoss | float | 4 | Maximum possible loss calculated from the effective damage distribution | 75000
FootprintExposure | float | 4 | Exposure value impacted by the model's event footprint | 80000
MeanImpactedExposure | float | 4 | Mean exposure impacted by the event across the samples (where loss > 0) | 65000
MaxImpactedExposure | float | 4 | Maximum exposure impacted by the event across the samples (where loss > 0) | 70000
+

The Quantile PLT output is a csv file with the following fields;

Name | Type | Bytes | Description | Example
Period | int | 4 | The period in which the event occurs | 500
PeriodWeight | float | 4 | The weight of the period (frequency relative to the total number of periods) | 0.001
EventId | int | 4 | Model event_id | 45567
Year | int | 4 | The year in which the event occurs | 1970
Month | int | 4 | The month number in which the event occurs | 5
Day | int | 4 | The day number in which the event occurs | 22
Hour | int | 4 | The hour in which the event occurs | 11
Minute | int | 4 | The minute in which the event occurs | 45
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
Quantile | float | 4 | The probability associated with the loss quantile | 0.9
Loss | float | 4 | The loss quantile | 1345.678
+

Return to top

+

ordleccalc

+
+

This component produces several variants of loss exceedance curves, +known as Exceedance Probability Tables "EPT" under ORD.

+

An Exceedance Probability Table is a set of user-specified +percentiles of (typically) annual loss on one of two bases – AEP (sum of +losses from all events in a year) or OEP (maximum of any one event’s +losses in a year). In ORD the percentiles are expressed as Return +Periods, which is the reciprocal of the percentile.
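As a simple illustration of the two bases (plain Python, with assumed event losses for one period and one sample);

# AEP sums all event losses occurring in the period, OEP takes the maximum
# single event loss in the period.
event_losses_in_period = [42000.0, 15500.0, 7300.0]   # hypothetical event losses

aep_loss = sum(event_losses_in_period)   # 64800.0
oep_loss = max(event_losses_in_period)   # 42000.0
print(aep_loss, oep_loss)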

+

How EPTs are derived in general depends on the mathematical +methodology of calculating the underlying ground up and insured +losses.

+

In the Oasis kernel the methodology is Monte Carlo sampling from +damage distributions, which results in several samples (realisations) of +an event loss for every event in the model's catalogue. The event losses +are assigned to a year timeline and the years are rank ordered by loss. +The method of computing the percentiles is by taking the ratio of the +frequency of years with a loss exceeding a given threshold over the +total number of years.

+

The OasisLMF approach gives rise to five variations of calculation of +these statistics:

+ +

Exceedance Probability Tables are further generalised in Oasis to +represent not only annual loss percentiles but loss percentiles over any +period of time. Thus the typical use of 'Year' label in outputs is +replaced by the more general term 'Period', which can be any period of +time as defined in the model data 'occurrence' file (although the normal +period of interest is a year).

+
Parameters
+ +

An optional parameter is;

+ +
Usage
+
$ ordleccalc [parameters] 
+
+
Examples
+
'First generate summarycalc binaries by running the core workflow, for the required summary set
+$ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
+$ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
+
+'Then run ordleccalc, pointing to the specified sub-directory of work containing summarycalc binaries.
+'Write aggregate and occurrence full uncertainty
+$ ordleccalc -Ksummary1 -F -f -O ept.csv
+$ ordleccalc -Ksummary1 -F -f -P ept.parquet
+$ ordleccalc -Ksummary1 -F -f -O ept.csv -P ept.parquet
+
+'Write occurrence per sample (PSEPT)
+$ ordleccalc -Ksummary1 -w -o psept.csv
+$ ordleccalc -Ksummary1 -w -p psept.parquet
+$ ordleccalc -Ksummary1 -w -o psept.csv -p psept.parquet
+
+'Write aggregate and occurrence per sample (written to PSEPT) and per sample mean (written to EPT file)
+$ ordleccalc -Ksummary1 -W -w -M -m -O ept.csv -o psept.csv
+$ ordleccalc -Ksummary1 -W -w -M -m -P ept.parquet -p psept.parquet
+$ ordleccalc -Ksummary1 -W -w -M -m -O ept.csv -o psept.csv -P ept.parquet -p psept.parquet
+
+'Write full output
+$ ordleccalc -Ksummary1 -F -f -W -w -S -s -M -m -O ept.csv -o psept.csv
+$ ordleccalc -Ksummary1 -F -f -W -w -S -s -M -m -P ept.parquet -p psept.parquet
+$ ordleccalc -Ksummary1 -F -f -W -w -S -s -M -m -O ept.csv -o psept.csv -P ept.parquet -p psept.parquet
+
Internal data
+

ordleccalc requires the occurrence.bin file

+ +

and will optionally use the following additional files if present

+ +

ordleccalc does not have a standard input that can be streamed in. +Instead, it reads in summarycalc binary data from a file in a fixed +location. The format of the binaries must match summarycalc standard +output. The location is in the 'work' subdirectory of the present +working directory. For example;

+ +

The user must ensure the work subdirectory exists. The user may also +specify a subdirectory of /work to store these files. e.g.

+ +

The reason for ordleccalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and ranking losses by period. The summarycalc losses for all events (all processes) must be written to the /work folder before running ordleccalc.

+
Calculation
+

All files with extension .bin from the specified subdirectory are +read into memory, as well as the occurrence.bin. The summarycalc losses +are grouped together and sampled losses are assigned to period according +to which period the events are assigned to in the occurrence file.

+

If multiple events occur within a period;

+ +

The 'EPType' field in the output identifies the basis of loss +exceedance curve.

+

The 'EPTypes' are;

+
1. OEP
2. OEP TVAR
3. AEP
4. AEP TVAR

TVAR results are generated automatically if the OEP or AEP report is +selected in the analysis options. TVAR, or Tail Conditional Expectation +(TCE), is computed by averaging the rank ordered losses exceeding a +given return period loss from the respective OEP or AEP result.
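The following Python sketch illustrates that averaging on a toy set of ranked period losses, assuming the return period of the loss at rank r is the number of periods divided by r; it is illustrative only, not the ktools implementation.

# TVAR at a given return period: average of the ranked period losses at or
# above the rank corresponding to that return period.
n_periods = 10
period_losses = sorted([5000, 3200, 2500, 1800, 900, 400, 0, 0, 0, 0], reverse=True)

for rank, loss in enumerate(period_losses, start=1):
    if loss <= 0:
        continue                         # only non-zero loss periods are reported
    return_period = n_periods / rank
    tvar = sum(period_losses[:rank]) / rank
    print(return_period, loss, tvar)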

+

Then the calculation differs by EPCalc type, as follows;

+
1. The mean damage loss (sidx = -1) is output as a standard exceedance probability table. If the calculation is run with 0 samples, then ordleccalc will still return the mean damage loss exceedance curve.
2. Full uncertainty - all losses by period are rank ordered to produce a single loss exceedance curve.
3. Per Sample mean - the return period losses from the Per Sample EPT are averaged, which produces a single loss exceedance curve.
4. Sample mean - the losses by period are first averaged across the samples, and then a single loss exceedance table is created from the period sample mean losses.

All four of the above variants are output into the same file when +selected.

+

Finally, the fifth variant, the Per Sample EPT is output to a +separate file. In this case, for each sample, losses by period are rank +ordered to produce a loss exceedance curve for each sample.

+
Output
+

Exceedance Probability Tables (EPT)

+

csv files with the following fields;

+

Exceedance Probability Table (EPT)

Name | Type | Bytes | Description | Example
SummaryId | int | 4 | identifier representing a summary level grouping of losses | 10
EPCalc | int | 4 | 1, 2, 3 or 4 with meanings as given above | 2
EPType | int | 4 | 1, 2, 3 or 4 with meanings as given above | 1
ReturnPeriod | float | 4 | return period interval | 250
loss | float | 4 | loss exceedance threshold or TVAR for return period | 546577.8
+

Per Sample Exceedance Probability Tables (PSEPT)

Name | Type | Bytes | Description | Example
SummaryId | int | 4 | identifier representing a summary level grouping of losses | 10
SampleID | int | 4 | Sample number | 20
EPType | int | 4 | 1, 2, 3 or 4 | 3
ReturnPeriod | float | 4 | return period interval | 250
loss | float | 4 | loss exceedance threshold or TVAR for return period | 546577.8
+
Period weightings
+

An additional feature of ordleccalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than the neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period's loss reoccurrence rate would double. Assuming no other period losses, the return period of the loss of period 1 in this example would be halved.

+

All period_nos must appear in the file from 1 to P (no gaps). There +is no constraint on the sum of weights. Periods with zero weight will +not contribute any losses to the loss exceedance curve.

+

This feature will be invoked automatically if the periods.bin file is +present in the input directory.
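The arithmetic behind the example above can be sketched as follows (illustrative Python, assuming a single loss-bearing period so that its exceedance probability equals its weight);

# With only period 1 bearing a loss, its exceedance probability is its weight,
# and the return period is the reciprocal of that weight.
def return_period(weight_of_loss_period):
    return 1.0 / weight_of_loss_period

print(return_period(0.0001))   # neutral weight for 10,000 periods -> 10000.0
print(return_period(0.0002))   # doubled weight -> 5000.0 (return period halved)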

+

Return to top

+

aalcalc

+
+

aalcalc outputs the Average Loss Table (ALT) which contains the +average annual loss and standard deviation of annual loss by +SummaryId.

+

Two types of average and standard deviation of loss are calculated; +analytical (SampleType 1) and sample (SampleType 2). If the analysis is +run with zero samples, then only SampleType 1 statistics are +returned.

+
Internal data
+

aalcalc requires the occurrence.bin file

+ +

aalcalc does not have a standard input that can be streamed in. +Instead, it reads in summarycalc binary data from a file in a fixed +location. The format of the binaries must match summarycalc standard +output. The location is in the 'work' subdirectory of the present +working directory. For example;

+ +

The user must ensure the work subdirectory exists. The user may also +specify a subdirectory of /work to store these files. e.g.

+ +

The reason for aalcalc not having an input stream is that the +calculation is not valid on a subset of events, i.e. within a single +process when the calculation has been distributed across multiple +processes. It must bring together all event losses before assigning +event losses to periods and finally computing the final statistics.

+
Parameters
+ +
Usage
+
$ aalcalc [parameters] > alt.csv
+
Examples
+
'First generate summarycalc binaries by running the core workflow, for the required summary set
+$ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
+$ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
+
+'Then run aalcalc, pointing to the specified sub-directory of work containing summarycalc binaries.
+$ aalcalc -o -Ksummary1 > alt.csv
+$ aalcalc -p alt.parquet -Ksummary1
+$ aalcalc -o -p alt.parquet -Ksummary1 > alt.csv
+
Output
+

csv file containing the following fields;

Name | Type | Bytes | Description | Example
SummaryId | int | 4 | SummaryId representing a grouping of losses | 10
SampleType | int | 4 | 1 for analytical statistics, 2 for sample statistics | 1
MeanLoss | float | 8 | average annual loss | 6785.9
SDLoss | float | 8 | standard deviation of loss | 54657.8
+
Calculation
+

The occurrence file and summarycalc files from the specified +subdirectory are read into memory. Event losses are assigned to period +according to which period the events occur in and summed by period and +by sample.

+

For type 1, the mean and standard deviation of numerically integrated +mean period losses are calculated across the periods. For type 2 the +mean and standard deviation of the sampled period losses are calculated +across all samples (sidx > 1) and periods.
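As an illustration of the SampleType 2 statistics, here is a minimal Python sketch with assumed period losses; the treatment of zero-loss periods as zero and the use of the sample standard deviation are assumptions for illustration rather than a statement of the exact ktools algorithm.

# Mean and standard deviation of the period losses, summed by period and
# sample, taken across all periods and samples.
import statistics

n_periods = 5
n_samples = 2
# period -> {sample_id: summed loss in that period}, hypothetical values
period_sample_losses = {1: {1: 1000.0, 2: 0.0}, 3: {1: 250.0, 2: 400.0}}

flat = [period_sample_losses.get(p, {}).get(s, 0.0)
        for p in range(1, n_periods + 1)
        for s in range(1, n_samples + 1)]

aal = statistics.mean(flat)             # average annual (period) loss
sd = statistics.stdev(flat)             # standard deviation across periods and samples
print(aal, sd)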

+
Period weightings
+

An additional feature of aalcalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than the neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period's loss reoccurrence rate would double and the loss contribution to the average annual loss would double.

+

All period_nos must appear in the file from 1 to P (no gaps). There +is no constraint on the sum of weights. Periods with zero weight will +not contribute any losses to the AAL.

+

This feature will be invoked automatically if the periods.bin file is +present in the input directory.

+

Return to top

+

Go to 4.4 Data conversion +components section

+

Back to Contents

+ + diff --git a/docs/html/OutputComponents.html b/docs/html/OutputComponents.html index 8dcf4dc8..b03e91b2 100644 --- a/docs/html/OutputComponents.html +++ b/docs/html/OutputComponents.html @@ -1,489 +1,338 @@ - - - -OutputComponents.md - - - - - - - - - - - - -

4.2 Output Components

-

eltcalc

-
-

The program calculates mean and standard deviation of loss by summary_id and by event_id.

+ + + + + + + OutputComponents + + + +

4.2 Output Components +

+

eltcalc

+
+

The program calculates mean and standard deviation of loss by +summary_id and by event_id.

Parameters

None

Usage
-
$ [stdin component] | eltcalc > elt.csv -$ eltcalc < [stdin].bin > elt.csv -
+
$ [stdin component] | eltcalc > elt.csv
+$ eltcalc < [stdin].bin > elt.csv
Example
-
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc > elt.csv -$ eltcalc < summarycalc.bin > elt.csv -
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - | eltcalc > elt.csv
+$ eltcalc < summarycalc.bin > elt.csv 
Internal data
-

No additional data is required, all the information is contained within the input stream.

+

No additional data is required, all the information is contained +within the input stream.

Calculation
-

For each summary_id and event_id, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a 'type' field. The exposure_value, which is carried in the event_id, summary_id header of the stream is also output.

+

For each summary_id and event_id, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The analytical mean is also output as a separate record, differentiated by a 'type' field. The exposure_value, which is carried in the event_id, summary_id header of the stream is also output.

Output

csv file with the following fields;

Name | Type | Bytes | Description | Example
summary_id | int | 4 | summary_id representing a grouping of losses | 10
type | int | 4 | 1 for analytical mean, 2 for sample mean | 2
event_id | int | 4 | Oasis event_id | 45567
mean | float | 4 | mean | 1345.678
standard_deviation | float | 4 | sample standard deviation | 945.89
exposure_value | float | 4 | exposure value for summary_id affected by the event | 70000

Return to top

-

leccalc

-
-

Loss exceedance curves, also known as exceedance probability curves, are computed by a rank ordering a set of losses by period and computing the probability of exceedance for each level of loss based on relative frequency. Losses are first assigned to periods of time (typically years) by reference to the occurrence file which contains the event occurrences in each period over a timeline of, say, 10,000 periods. Event losses are summed within each period for an aggregate loss exceedance curve, or the maximum of the event losses in each period is taken for an occurrence loss exceedance curve. From this point, there are a few variants available as follows;

+

leccalc

+
+

Loss exceedance curves, also known as exceedance probability curves, are computed by rank ordering a set of losses by period and computing the probability of exceedance for each level of loss based on relative frequency. Losses are first assigned to periods of time (typically years) by reference to the occurrence file which contains the event occurrences in each period over a timeline of, say, 10,000 periods. Event losses are summed within each period for an aggregate loss exceedance curve, or the maximum of the event losses in each period is taken for an occurrence loss exceedance curve. From this point, there are a few variants available as follows;

-

The ranked losses represent the first, second, third, etc.. largest loss periods within the total number of periods of say 10,000 years. The relative frequency of these periods of loss is interpreted as the probability of loss exceedance, that is to say that the top ranked loss has an exceedance probability of 1 in 10000, or 0.01%, the second largest loss has an exceedance probability of 0.02%, and so on. In the output file, the exceedance probability is expressed as a return period, which is the reciprocal of the exceedance probability multiplied by the total number of periods. Only non-zero loss periods are returned.

-
Parameters
+

The ranked losses represent the first, second, third, etc. largest loss periods within the total number of periods of say 10,000 years. The relative frequency of these periods of loss is interpreted as the probability of loss exceedance, that is to say that the top ranked loss has an exceedance probability of 1 in 10000, or 0.01%, the second largest loss has an exceedance probability of 0.02%, and so on. In the output file, the exceedance probability is expressed as a return period, which is the reciprocal of the exceedance probability (equivalently, the total number of periods divided by the rank of the loss). Only non-zero loss periods are returned.
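As a small illustration (plain Python, toy values) of how ranked period losses translate into exceedance probabilities and return periods on a 10,000 period timeline;

# The loss at rank r out of n_periods has exceedance probability r / n_periods
# and return period n_periods / r. Only non-zero loss periods are reported.
n_periods = 10000
period_losses = [125000.0, 98000.0, 64000.0]   # hypothetical largest aggregate period losses

for rank, loss in enumerate(sorted(period_losses, reverse=True), start=1):
    exceedance_probability = rank / n_periods
    return_period = n_periods / rank
    print(return_period, exceedance_probability, loss)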

+
Parameters

An optional parameter is;

-
Usage
-
$ leccalc [parameters] > lec.csv - -
+
Usage
+
$ leccalc [parameters] > lec.csv
+
Examples
-
'First generate summarycalc binaries by running the core workflow, for the required summary set +
'First generate summarycalc binaries by running the core workflow, for the required summary set
 $ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
 $ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
-'Then run leccalc, pointing to the specified sub-directory of work containing summarycalc binaries.
+'Then run leccalc, pointing to the specified sub-directory of work containing summarycalc binaries.
 $ leccalc -Ksummary1 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv 
-' With return period file
-$  leccalc -r -Ksummary1 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv 
-
-
Internal data
+' With return period file +$ leccalc -r -Ksummary1 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv
+
Internal data

leccalc requires the occurrence.bin file

-

leccalc does not have a standard input that can be streamed in. Instead, it reads in summarycalc binary data from a file in a fixed location. The format of the binaries must match summarycalc standard output. The location is in the 'work' subdirectory of the present working directory. For example;

+

leccalc does not have a standard input that can be streamed in. +Instead, it reads in summarycalc binary data from a file in a fixed +location. The format of the binaries must match summarycalc standard +output. The location is in the 'work' subdirectory of the present +working directory. For example;

-

The user must ensure the work subdirectory exists. The user may also specify a subdirectory of /work to store these files. e.g.

+

The user must ensure the work subdirectory exists. The user may also +specify a subdirectory of /work to store these files. e.g.

-

The reason for leccalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and ranking losses by period. The summarycalc losses for all events (all processes) must be written to the /work folder before running leccalc.

-
Calculation
-

All files with extension .bin from the specified subdirectory are read into memory, as well as the occurrence.bin. The summarycalc losses are grouped together and sampled losses are assigned to period according to which period the events occur in.

+

The reason for leccalc not having an input stream is that the +calculation is not valid on a subset of events, i.e. within a single +process when the calculation has been distributed across multiple +processes. It must bring together all event losses before assigning +event losses to periods and ranking losses by period. The summarycalc +losses for all events (all processes) must be written to the /work +folder before running leccalc.

+
Calculation
+

All files with extension .bin from the specified subdirectory are +read into memory, as well as the occurrence.bin. The summarycalc losses +are grouped together and sampled losses are assigned to period according +to which period the events occur in.

If multiple events occur within a period;

Then the calculation differs by lec type, as follows;

-

For all curves, the analytical mean loss (sidx = -1) is output as a separate exceedance probability curve. If the calculation is run with 0 samples, then leccalc will still return the analytical mean loss exceedance curve. The 'type' field in the output identifies the type of loss exceedance curve, which is 1 for analytical mean, and 2 for curves calculated from the samples.

-
Output
+

For all curves, the analytical mean loss (sidx = -1) is output as a +separate exceedance probability curve. If the calculation is run with 0 +samples, then leccalc will still return the analytical mean loss +exceedance curve. The 'type' field in the output identifies the type of +loss exceedance curve, which is 1 for analytical mean, and 2 for curves +calculated from the samples.

+
Output

csv file with the following fields;

-

Full uncertainty, Sample mean and Wheatsheaf mean loss exceedance curve

+

Full uncertainty, Sample mean and Wheatsheaf mean loss +exceedance curve

Name | Type | Bytes | Description | Example
summary_id | int | 4 | summary_id representing a grouping of losses | 10
type | int | 4 | 1 for analytical mean, 2 for sample mean | 2
return_period | float | 4 | return period interval | 250
loss | float | 4 | loss exceedance threshold for return period | 546577.8

Wheatsheaf loss exceedance curve

Name | Type | Bytes | Description | Example
summary_id | int | 4 | summary_id representing a grouping of losses | 10
sidx | int | 4 | Oasis sample index | 50
return_period | float | 4 | return period interval | 250
loss | float | 4 | loss exceedance threshold for return period | 546577.8
Period weightings
-

An additional feature of leccalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than the neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period's loss reoccurrence rate would double. Assuming no other period losses, the return period of the loss of period 1 in this example would be halved.

-

All period_nos must appear in the file from 1 to P (no gaps). There is no constraint on the sum of weights. Periods with zero weight will not contribute any losses to the loss exceedance curve.

-

This feature will be invoked automatically if the periods.bin file is present in the input directory.

+

An additional feature of leccalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than the neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 means that period's loss recurrence rate doubles. Assuming no other period losses, the return period of the loss of period 1 in this example would be halved.

+

All period_nos must appear in the file from 1 to P (no gaps). There +is no constraint on the sum of weights. Periods with zero weight will +not contribute any losses to the loss exceedance curve.

+

This feature will be invoked automatically if the periods.bin file is +present in the input directory.
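To make the effect of period weights concrete, the sketch below is a minimal illustration in Python (not the leccalc implementation; the function name and inputs are invented for the example) of how return periods follow from cumulative period weights.

def return_periods(period_losses, period_weights):
    """period_losses: {period_no: loss}; period_weights: {period_no: annual weight}."""
    ranked = sorted(period_losses.items(), key=lambda kv: kv[1], reverse=True)
    cumulative, curve = 0.0, []
    for period_no, loss in ranked:
        cumulative += period_weights[period_no]     # cumulative rate of exceedance
        curve.append((loss, 1.0 / cumulative))      # return period = 1 / cumulative rate
    return curve

P = 10000
neutral = {p: 1.0 / P for p in range(1, P + 1)}     # neutral weighting: 1 / number of periods
losses = {1: 1000000.0}                             # only period 1 has a loss
print(return_periods(losses, neutral))              # [(1000000.0, 10000.0)]
doubled = {**neutral, 1: 2.0 / P}                   # double the weight of period 1
print(return_periods(losses, doubled))              # [(1000000.0, 5000.0)] -> return period halved
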

Return to top

-

pltcalc

-
-

The program outputs sample mean and standard deviation by summary_id, event_id and period_no. The analytical mean is also output as a seperate record, differentiated by a 'type' field. It also outputs an event occurrence date.

-
Parameters
+

pltcalc

+
+

The program outputs the sample mean and standard deviation by summary_id, event_id and period_no. The analytical mean is also output as a separate record, differentiated by a 'type' field. It also outputs an event occurrence date.

+
Parameters

None

-
Usage
-
$ [stdin component] | pltcalc > plt.csv -$ pltcalc < [stdin].bin > plt.csv -
+
Usage
+
$ [stdin component] | pltcalc > plt.csv
+$ pltcalc < [stdin].bin > plt.csv
Example
-
$ eve 1 1 | getmodel | gulcalc -r -S100 -C1 | summarycalc -1 - | pltcalc > plt.csv -$ pltcalc < summarycalc.bin > plt.csv -
-
Internal data
+
$ eve 1 1 | getmodel | gulcalc -r -S100 -C1 | summarycalc -1 - | pltcalc > plt.csv
+$ pltcalc < summarycalc.bin > plt.csv 
+
Internal data

pltcalc requires the occurrence.bin file

-
Calculation
-

The occurrence.bin file is read into memory. For each summary_id, event_id and period_no, the sample mean and standard deviation is calculated from the sampled losses in the summarycalc stream and output to file. The exposure_value, which is carried in the event_id, summary_id header of the stream is also output, as well as the date field(s) from the occurrence file.

-
Output
-

There are two output formats, depending on whether an event occurrence date is an integer offset to some base date that most external programs can interpret as a real date, or a calendar day in a numbered scenario year. The output format will depend on the format of the date fields in the occurrence.bin file.

+
Calculation
+

The occurrence.bin file is read into memory. For each summary_id, +event_id and period_no, the sample mean and standard deviation is +calculated from the sampled losses in the summarycalc stream and output +to file. The exposure_value, which is carried in the event_id, +summary_id header of the stream is also output, as well as the date +field(s) from the occurrence file.
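As a rough illustration of this grouping (a hedged sketch, not the pltcalc source; the helper name and the flat input layout are invented for the example), the per-key mean and sample standard deviation could be computed as follows.

from collections import defaultdict
from math import sqrt

def plt_stats(sampled_losses):
    """sampled_losses: iterable of (summary_id, event_id, period_no, loss) tuples."""
    groups = defaultdict(list)
    for summary_id, event_id, period_no, loss in sampled_losses:
        groups[(summary_id, event_id, period_no)].append(loss)
    out = {}
    for key, losses in groups.items():
        n = len(losses)
        mean = sum(losses) / n
        # sample standard deviation, using the n-1 convention
        sd = sqrt(sum((x - mean) ** 2 for x in losses) / (n - 1)) if n > 1 else 0.0
        out[key] = (mean, sd)
    return out

print(plt_stats([(10, 45567, 56876, 100.0), (10, 45567, 56876, 140.0)]))
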

+
Output
+

There are two output formats, depending on whether an event +occurrence date is an integer offset to some base date that most +external programs can interpret as a real date, or a calendar day in a +numbered scenario year. The output format will depend on the format of +the date fields in the occurrence.bin file.

In the former case, the output format is;

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
typetype int 41 for analytical mean, 2 for sample mean11 for analytical mean, 2 for sample +mean1
summary_idsummary_id int 4summary_id representing a grouping of losses10summary_id representing a grouping of +losses10
event_idevent_id int 4Oasis event_id45567Oasis event_id45567
period_noperiod_no int 4identifying an abstract period of time, such as a year56876identifying an abstract period of time, +such as a year56876
meanmean float 4mean1345.678mean1345.678
standard_deviationstandard_deviation float 4sample standard deviation945.89sample standard deviation945.89
exposure_valueexposure_value float 4exposure value for summary_id affected by the event70000exposure value for summary_id affected by +the event70000
date_iddate_id int 4the date_id of the event occurrence28616the date_id of the event occurrence28616
-

Using a base date of 1/1/1900 the integer 28616 is interpreted as 16/5/1978.

+

Using a base date of 1/1/1900 the integer 28616 is interpreted as +16/5/1978.
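A minimal sketch of recovering a calendar date from date_id, assuming date_id is a plain day offset; since the precise base-date convention is defined by the occurrence file, the sketch simply calibrates the offset to the documented example above.

from datetime import date, timedelta

# Hedged sketch: treat date_id as a day offset and calibrate the implied base
# date to the documented example (28616 -> 16/5/1978), rather than assuming
# the exact epoch used when the occurrence file was built.
BASE_DATE = date(1978, 5, 16) - timedelta(days=28616)

def date_from_id(date_id):
    return BASE_DATE + timedelta(days=date_id)

print(date_from_id(28616))   # 1978-05-16
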

In the latter case, the output format is;

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
typetype int 41 for analytical mean, 2 for sample mean11 for analytical mean, 2 for sample +mean1
summary_idsummary_id int 4summary_id representing a grouping of losses10summary_id representing a grouping of +losses10
event_idevent_id int 4Oasis event_id45567Oasis event_id45567
period_noperiod_no int 4identifying an abstract period of time, such as a year56876identifying an abstract period of time, +such as a year56876
meanmean float 4mean1345.678mean1345.678
standard_deviationstandard_deviation float 4sample standard deviation945.89sample standard deviation945.89
exposure_valueexposure_value float 4exposure value for summary_id affected by the event70000exposure value for summary_id affected by +the event70000
occ_yearocc_year int 4the year number of the event occurrence56876the year number of the event +occurrence56876
occ_monthocc_month int 4the month of the event occurrence5the month of the event occurrence5
occ_dayocc_day int 4the day of the event occurrence16the day of the event occurrence16

Return to top

-

aalcalc

-
-

aalcalc computes the overall average annual loss and standard deviation of annual loss.

-

Two types of aal and standard deviation of loss are calculated; analytical (type 1) and sample (type 2). If the analysis is run with zero samples, then only type 1 statistics are returned by aalcalc.

-

The Average Loss Converence Table 'ALCT' is a second optional output which can be generated from aalcalc. This provides extra statistical output which can be used to estimate the amount of simulation error in the average annual loss estimate from samples (type 2).

-
Internal data
+

aalcalc

+
+

aalcalc computes the overall average annual loss and standard +deviation of annual loss.

+

Two types of aal and standard deviation of loss are calculated; +analytical (type 1) and sample (type 2). If the analysis is run with +zero samples, then only type 1 statistics are returned by aalcalc.

+

The Average Loss Convergence Table 'ALCT' is a second optional output which can be generated from aalcalc. This provides extra statistical output which can be used to estimate the amount of simulation error in the average annual loss estimate from samples (type 2).

+
Internal data

aalcalc requires the occurrence.bin file

-

aalcalc does not have a standard input that can be streamed in. Instead, it reads in summarycalc binary data from a file in a fixed location. The format of the binaries must match summarycalc standard output. The location is in the 'work' subdirectory of the present working directory. For example;

+

aalcalc does not have a standard input that can be streamed in. +Instead, it reads in summarycalc binary data from a file in a fixed +location. The format of the binaries must match summarycalc standard +output. The location is in the 'work' subdirectory of the present +working directory. For example;

-

The user must ensure the work subdirectory exists. The user may also specify a subdirectory of /work to store these files. e.g.

+

The user must ensure the work subdirectory exists. The user may also +specify a subdirectory of /work to store these files. e.g.

-

The reason for aalcalc not having an input stream is that the calculation is not valid on a subset of events, i.e. within a single process when the calculation has been distributed across multiple processes. It must bring together all event losses before assigning event losses to periods and finally computing the final statistics.

-
Parameters
+

The reason for aalcalc not having an input stream is that the +calculation is not valid on a subset of events, i.e. within a single +process when the calculation has been distributed across multiple +processes. It must bring together all event losses before assigning +event losses to periods and finally computing the final statistics.

+
Parameters
-
Usage
-
$ aalcalc [parameters] > aal.csv -
-
Examples
-
First generate summarycalc binaries by running the core workflow, for the required summary set +
Usage
+
$ aalcalc [parameters] > aal.csv
+
Examples
+
First generate summarycalc binaries by running the core workflow, for the required summary set
 $ eve 1 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc1.bin
 $ eve 2 2 | getmodel | gulcalc -r -S100 -c - | summarycalc -g -1 - > work/summary1/summarycalc2.bin
 Then run aalcalc, pointing to the specified sub-directory of work containing summarycalc binaries.
 $ aalcalc -Ksummary1 > aal.csv  
 Add alct output at 95% confidence level
-$ aalcalc -Ksummary1 -o -l 0.95 -c alct.csv > aal.csv  
-
-
Output
+$ aalcalc -Ksummary1 -o -l 0.95 -c alct.csv > aal.csv
+
Output

AAL:

csv file containing the following fields;

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
summary_idsummary_id int 4summary_id representing a grouping of losses1summary_id representing a grouping of +losses1
typetype int 41 for analytical statistics, 2 for sample statistics21 for analytical statistics, 2 for sample +statistics2
meanmean float 8average annual loss1014.23average annual loss1014.23
standard_deviationstandard_deviation float 8standard deviation of annual loss11039.78standard deviation of annual loss11039.78

ALCT:

csv file containing the following fields;

+++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
NameName Type BytesDescriptionExampleDescriptionExample
SummaryIdSummaryId int 4summary_id representing a grouping of losses1summary_id representing a grouping of +losses1
MeanLossMeanLoss float 4the average annual loss estimate from samples1014.23the average annual loss estimate from +samples1014.23
SDLossSDLoss float 8the standard deviation of annual loss from samples11039.78the standard deviation of annual loss from +samples11039.78
SampleSizeSampleSize int 8the number of samples used to produce the statistics100the number of samples used to produce the +statistics100
LowerCILowerCI float 8the lower threshold of the confidence interval for the mean estimate1004.52the lower threshold of the confidence +interval for the mean estimate1004.52
UpperCIUpperCI float 8the upper threshold of the confidence interval for the mean estimate1023.94the upper threshold of the confidence +interval for the mean estimate1023.94
StandardErrorStandardError float 8the total standard error of the mean estimate5.90the total standard error of the mean +estimate5.90
RelativeErrorRelativeError float 8the StandardError divided by the mean estimate0.005the StandardError divided by the mean +estimate0.005
VarElementHazVarElementHaz float 8the contribution to variance of the estimate from the hazard8707.40the contribution to variance of the +estimate from the hazard8707.40
StandardErrorHazStandardErrorHaz float 8the square root of VarElementHaz93.31the square root of VarElementHaz93.31
RelativeErrorHazRelativeErrorHaz float 8the StandardErrorHaz divided by the mean estimate0.092the StandardErrorHaz divided by the mean +estimate0.092
VarElementVulnVarElementVuln float 8the contribution to variance of the estimate from the vulnerability34.81the contribution to variance of the +estimate from the vulnerability34.81
StandardErrorVulnStandardErrorVuln float 8the square root of VarElementVuln5.90the square root of VarElementVuln5.90
RelativeErrorVulnRelativeErrorVuln float 8the StandardErrorVuln divided by the mean estimate0.005the StandardErrorVuln divided by the mean +estimate0.005
-
Calculation
-

The occurrence file and summarycalc files from the specified subdirectory are read into memory. Event losses are assigned to periods based on when the events occur and summed by period and by sample. These are referred to as 'annual loss samples'.

+
Calculation
+

The occurrence file and summarycalc files from the specified +subdirectory are read into memory. Event losses are assigned to periods +based on when the events occur and summed by period and by sample. These +are referred to as 'annual loss samples'.

AAL calculation:

-

For type 1, calculations are performed on the type 1 (numerically integrated) mean annual losses by period. The AAL is the mean annual losses summed across the periods and divided by the number of periods. The standard deviation is the square root of the sum of squared errors between each annual mean loss and the AAL mean divided by the degrees of freedom (periods - 1).

-

For type 2 the mean and standard deviation of the annual loss samples are calculated across all samples and periods. The mean estimates the average annual loss, calculated as the sum of all annual loss samples divided by the total number of periods times the number of samples. The standard deviation is the square root of the sum of squared errors between each annual loss sample and the type 2 mean, divided by the degrees of freedom (periods × samples - 1).

+

For type 1, calculations are performed on the type 1 (numerically +integrated) mean annual losses by period. The AAL is the mean annual +losses summed across the periods and divided by the number of periods. +The standard deviation is the square root of the sum of squared errors +between each annual mean loss and the AAL mean divided by the degrees of +freedom (periods - 1).

+

For type 2 the mean and standard deviation of the annual loss samples +are calculated across all samples and periods. The mean estimates the +average annual loss, calculated as the sum of all annual loss samples +divided by the total number of periods times the number of samples. The +standard deviation is the square root of the sum of squared errors +between each annual loss sample and the type 2 mean, divided by the +degrees of freedom (periods × samples - 1).
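The type 2 statistics can be pictured with a short sketch (a hypothetical helper, not the aalcalc source), where annual loss samples are keyed by period and sample index and zero-loss years still count in the denominators.

from math import sqrt

def type2_aal(annual_loss, num_periods, num_samples):
    """annual_loss: {(period_no, sidx): loss}; missing keys are zero-loss years."""
    n = num_periods * num_samples
    mean = sum(annual_loss.values()) / n                    # average annual loss
    sse = sum((annual_loss.get((p, s), 0.0) - mean) ** 2
              for p in range(1, num_periods + 1)
              for s in range(1, num_samples + 1))
    sd = sqrt(sse / (n - 1))                                # degrees of freedom = periods x samples - 1
    return mean, sd

samples = {(1, 1): 120.0, (1, 2): 80.0, (3, 1): 500.0}
print(type2_aal(samples, num_periods=10, num_samples=2))
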

ALCT calculation:

-

In ALCT, MeanLoss and SDLoss are the same as the type 2 mean and standard deviation from the AAL report. StandardError indicates how much the average annual loss estimate might vary if the simulation were rerun with different random numbers, reflecting the simulation error in the estimate. RelativeError, the StandardError as a percentage of the mean, is convenient for assessing simulation error and acceptable levels are typically expressed in percentage terms. StandardError is derived from the ANOVA metrics described below.

-

LowerCI and UpperCI represent the absolute lower and upper thresholds for the confidence interval for the AAL estimate, indicating the range of losses within a specified confidence level. A higher confidence level results in a wider confidence interval.

+

In ALCT, MeanLoss and SDLoss are the same as the type 2 mean and +standard deviation from the AAL report. StandardError indicates how much +the average annual loss estimate might vary if the simulation were rerun +with different random numbers, reflecting the simulation error in the +estimate. RelativeError, the StandardError as a percentage of the mean, +is convenient for assessing simulation error and acceptable levels are +typically expressed in percentage terms. StandardError is derived from +the ANOVA metrics described below.

+

LowerCI and UpperCI represent the absolute lower and upper thresholds +for the confidence interval for the AAL estimate, indicating the range +of losses within a specified confidence level. A higher confidence level +results in a wider confidence interval.

Variance components:

-

VarElementHaz and VarElementVuln arise from attributing variance in the annual loss to hazard effects (variation due to event intensity across years) and vulnerability effects (variation due to sampling from exposure's damage uncertainty distributions). This is done using a one-factor effects model and standard analysis of variance 'ANOVA' on the annual loss samples.

-

In the one-factor model, annual loss in year i and sample m, denoted L(i,m), is expressed as:

+

VarElementHaz and VarElementVuln arise from attributing variance in +the annual loss to hazard effects (variation due to event intensity +across years) and vulnerability effects (variation due to sampling from +exposure's damage uncertainty distributions). This is done using a +one-factor effects model and standard analysis of variance 'ANOVA' on +the annual loss samples.

+

In the one-factor model, annual loss in year i and sample m, denoted +L(i,m), is expressed as:

L(i,m) = AAL + h(i) + v(i,m)

where;

-

Total variance in annual loss is partitioned into independent hazard and vulnerability effects:

+

Total variance in annual loss is partitioned into independent hazard +and vulnerability effects:

Var(L) = Var(h) + Var(v)

-

ANOVA is used to estimate the variance components Var(h) and Var(v). For standard Oasis models, since the events are fixed across years, the simulation error in the AAL estimate arises only from the vulnerability component.

-

The StandardError of the AAL estimate in ALCT follows from the calculation of the Variance of the AAL estimate as follows;

+

ANOVA is used to estimate the variance components Var(h) and Var(v). +For standard Oasis models, since the events are fixed across years, the +simulation error in the AAL estimate arises only from the vulnerability +component.

+

The StandardError of the AAL estimate in ALCT follows from the +calculation of the Variance of the AAL estimate as follows;

Var(AAL estimate) = VarElementVuln = Var(v) / (I * M)

StandardErrorVuln = sqrt(VarElementVuln)

StandardError = StandardErrorVuln
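A hedged sketch of this decomposition is shown below, following a standard one-way random-effects ANOVA on annual loss samples L[i][m] (I years by M samples per year); the exact estimators used by aalcalc may differ in detail.

from math import sqrt

def anova_alct(L):
    I, M = len(L), len(L[0])
    grand = sum(sum(row) for row in L) / (I * M)
    year_means = [sum(row) / M for row in L]
    msb = M * sum((ym - grand) ** 2 for ym in year_means) / (I - 1)        # between-year mean square
    msw = sum((x - ym) ** 2 for row, ym in zip(L, year_means) for x in row) / (I * (M - 1))
    var_v = msw                               # vulnerability (sampling) component
    var_h = max((msb - msw) / M, 0.0)         # hazard component (floored at zero in this sketch)
    var_element_vuln = var_v / (I * M)        # variance of the AAL estimate
    return var_h, var_v, sqrt(var_element_vuln)   # (VarElementHaz-like, VarElementVuln numerator, StandardError)

L = [[100.0, 120.0], [0.0, 10.0], [300.0, 280.0]]    # 3 years, 2 samples each
print(anova_alct(L))
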

-

Finally, ALCT provides statistics for multiple increasing sample subsets, showing the convergence of the AAL estimate with increasing sample sizes. These subsets are non-overlapping and fixed, starting with SampleSize=1 (m=1), SampleSize=2 (m=2 to 3), SampleSize=4 (m=4 to 7), up to the maximum subset size. The final row gives statistics for the total samples M, using all available samples.

-
Period weightings
-

An additional feature of aalcalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period make it more or less important than neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 will mean that period's loss reoccurrence rate would double and the loss contribution to the average annual loss would double.

-

All period_nos must appear in the file from 1 to P (no gaps). There is no constraint on the sum of weights. Periods with zero weight will not contribute any losses to the AAL.

-

This feature will be invoked automatically if the periods.bin file is present in the input directory.

+

Finally, ALCT provides statistics for multiple increasing sample +subsets, showing the convergence of the AAL estimate with increasing +sample sizes. These subsets are non-overlapping and fixed, starting with +SampleSize=1 (m=1), SampleSize=2 (m=2 to 3), SampleSize=4 (m=4 to 7), up +to the maximum subset size. The final row gives statistics for the total +samples M, using all available samples.
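For illustration, the subset boundaries described here can be generated as below (a sketch; the handling of any final partial subset is an assumption).

def alct_subsets(M):
    subsets, start, size = [], 1, 1
    while start + size - 1 <= M:
        subsets.append((size, start, start + size - 1))   # (SampleSize, first sidx, last sidx)
        start += size
        size *= 2
    subsets.append((M, 1, M))                             # final row: all M samples
    return subsets

print(alct_subsets(100))
# [(1, 1, 1), (2, 2, 3), (4, 4, 7), (8, 8, 15), (16, 16, 31), (32, 32, 63), (100, 1, 100)]
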

+
Period weightings
+

An additional feature of aalcalc is available to vary the relative importance of the period losses by providing a period weightings file to the calculation. In this file, a weight can be assigned to each period to make it more or less important than the neutral weighting (1 divided by the total number of periods). For example, if the neutral weight for period 1 is 1 in 10000 years, or 0.0001, then doubling the weighting to 0.0002 means that period's loss recurrence rate doubles and its loss contribution to the average annual loss doubles.

+

All period_nos must appear in the file from 1 to P (no gaps). There +is no constraint on the sum of weights. Periods with zero weight will +not contribute any losses to the AAL.

+

This feature will be invoked automatically if the periods.bin file is +present in the input directory.

Return to top

-

Parameters

+

Parameters

Optional parameter for aalcalc;

-

kat

-
-

In cases where events have been distributed to multiple processes, the output files can be concatenated to standard output.

-

Parameters

+

kat

+
+

In cases where events have been distributed to multiple processes, +the output files can be concatenated to standard output.

+

Parameters

Optional parameters are:

-

The sort by event ID option assumes that events have not been distributed to processes randomly and the list of event IDs in events.bin is sequential and contiguous. Should either of these conditions be false, the output will still contain all events but sorting cannot be guaranteed.

-

Usage

-
$ kat [parameters] [file]... > [stdout component] -
-

Examples

-
$ kat -d pltcalc_output/ > pltcalc.csv +

The sort by event ID option assumes that events have not been +distributed to processes randomly and the list of event IDs in +events.bin is sequential and contiguous. Should either of these +conditions be false, the output will still contain all events but +sorting cannot be guaranteed.

+

Usage

+
$ kat [parameters] [file]... > [stdout component]
+

Examples

+
$ kat -d pltcalc_output/ > pltcalc.csv
 $ kat eltcalc_P1 eltcalc_P2 eltcalc_P3 > eltcalc.csv
 $ kat -s eltcalc_P1 eltcalc_P2 eltcalc_P3 > eltcalc.csv
-$ kat -s -d eltcalc_output/ > eltcalc.csv
-
-

Files are concatenated in the order in which they are presented on the command line. Should a file path be specified, files are concatenated in alphabetical order. When asked to sort by event ID, the order of input files is irrelevant.

+$ kat -s -d eltcalc_output/ > eltcalc.csv
+

Files are concatenated in the order in which they are presented on the command line. If a directory is specified with -d, its files are concatenated in alphabetical order. When sorting by event ID, the order of input files is irrelevant.

Return to top

-

katparquet

-
-

The output parquet files from multiple processes can be concatenated to a single parquet file. The results are automatically sorted by event ID. Unlike kat, the ORD table name for the input files must be specified on the command line.

-

Parameters

+

katparquet

+
+

The output parquet files from multiple processes can be concatenated into a single parquet file. The results are automatically sorted by event ID. Unlike kat, the ORD table name for the input files must be specified on the command line.

+

Parameters

-

Usage

-
$ katparquet [parameters] -o [filename.parquet] [file]... -
-

Examples

-
$ katparquet -d mplt_files/ -M -o MPLT.parquet -$ katparquet -q -o QPLT.parquet qplt_P1.parquet qplt_P2.parquet qplt_P3.parquet -
+

Usage

+
$ katparquet [parameters] -o [filename.parquet] [file]...
+

Examples

+
$ katparquet -d mplt_files/ -M -o MPLT.parquet
+$ katparquet -q -o QPLT.parquet qplt_P1.parquet qplt_P2.parquet qplt_P3.parquet

Return to top

-

Go to 4.3 Data conversion components section

-

Back to Contents

- - - +

Go to 4.3 Data conversion +components section

+

Back to Contents

+ + diff --git a/docs/html/Overview.html b/docs/html/Overview.html index f75ae0c6..7b1f879c 100644 --- a/docs/html/Overview.html +++ b/docs/html/Overview.html @@ -1,395 +1,234 @@ - - - -Overview.md - - - - - - - - - - - - -

alt text

-

2. Data Streaming Framework Overview

-

This is the general data streaming framework showing the core components of the toolkit.

-
Figure 1. Data streaming framework
-

alt text

+ + + + + + + Overview + + + +

alt text

+

2. Data Streaming Framework +Overview

+

This is the general data streaming framework showing the core +components of the toolkit.

+
Figure 1. Data streaming +framework
+

The architecture consists of;

-

The conversion of input data to binary format is shown in the diagram as occurring outside of the compute server, but this could be performed within the compute server. ktools provides a full set of binary conversion tools from csv input files which can be deployed elsewhere.

-

The in-memory data streams are initiated by the process 'eve' (meaning 'event emitter') and shown by solid arrows. The read/write data flows are shown as dashed arrows.

-

The calculation components are getmodel, gulcalc, fmcalc, summarycalc and outputcalc. The streamed data passes through the components in memory one event at a time and are written out to a results file on the compute server. The user can then retrieve the results (csvs) and consume them in their BI system.

-

The reference model demonstrates an implementation of the core calculation components, along with the data conversion components which convert binary files to csv files.

-

The analysis workflows are controlled by the user, not the toolkit, and they can be as simple or as complex as required.

-

The simplest workflow is single or parallel processing to produce a single result. This minimises the amount of disk I/O at each stage in the calculation, which performs better than saving intermediate results to disk. This workflow is shown in Figure 2.

-
Figure 2. Single output processing
-

alt text

-

However it is possible to stream data from one process into to several processes, allowing the calculation of multiple outputs simultaneously, as shown in Figure 3.

-
Figure 3. Multiple output processing
-

alt text

-

For multi-output, multi-process workflows, Linux operating systems provide 'named pipes' which in-memory data streams can be diverted to and manipulated as if they were files, and 'tee' which sends a stream from one process into multiple processes. This means the core calculation is not repeated for each output, as it would be if several single-output workflows were run.

-

Go to 3. Specification

-

Back to Contents

- - - +

The conversion of input data to binary format is shown in the diagram +as occurring outside of the compute server, but this could be performed +within the compute server. ktools provides a full set of binary +conversion tools from csv input files which can be deployed +elsewhere.

+

The in-memory data streams are initiated by the process 'eve' +(meaning 'event emitter') and shown by solid arrows. The read/write data +flows are shown as dashed arrows.

+

The calculation components are getmodel, gulcalc, +fmcalc, summarycalc and outputcalc. The +streamed data passes through the components in memory one event at a +time and are written out to a results file on the compute server. The +user can then retrieve the results (csvs) and consume them in their BI +system.

+

The reference model demonstrates an implementation of the core +calculation components, along with the data conversion components which +convert binary files to csv files.

+

The analysis workflows are controlled by the user, not the toolkit, +and they can be as simple or as complex as required.

+

The simplest workflow is single or parallel processing to produce a +single result. This minimises the amount of disk I/O at each stage in +the calculation, which performs better than saving intermediate results +to disk. This workflow is shown in Figure 2.

+
Figure 2. Single output +processing
+

+

However, it is possible to stream data from one process into several processes, allowing the calculation of multiple outputs simultaneously, as shown in Figure 3.

+
Figure 3. Multiple output +processing
+

+

For multi-output, multi-process workflows, Linux operating systems +provide 'named pipes' which in-memory data streams can be diverted to +and manipulated as if they were files, and 'tee' which sends a stream +from one process into multiple processes. This means the core +calculation is not repeated for each output, as it would be if several +single-output workflows were run.
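A hedged sketch of this fan-out idea in Python is shown below: the core stream is produced once and copied to two downstream pipelines, which is what named pipes and tee achieve natively on Linux. The command lines are adapted from examples elsewhere in this documentation and may need adjusting to a particular workflow.

import subprocess

# Produce the core loss stream once...
core = subprocess.Popen("eve 1 1 | getmodel | gulcalc -r -S100 -i -",
                        shell=True, stdout=subprocess.PIPE)
# ...and fan it out to two output pipelines (illustrative commands).
elt = subprocess.Popen("summarycalc -i -1 - | eltcalc > elt.csv",
                       shell=True, stdin=subprocess.PIPE)
plt = subprocess.Popen("summarycalc -i -1 - | pltcalc > plt.csv",
                       shell=True, stdin=subprocess.PIPE)

while True:
    chunk = core.stdout.read(65536)     # copy the binary stream to both consumers, like 'tee'
    if not chunk:
        break
    elt.stdin.write(chunk)
    plt.stdin.write(chunk)

for proc in (elt, plt):
    proc.stdin.close()
for proc in (core, elt, plt):
    proc.wait()
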

+

Go to 3. Specification

+

Back to Contents

+ + diff --git a/docs/html/README.html b/docs/html/README.html index 19bfee58..6ea91dc5 100644 --- a/docs/html/README.html +++ b/docs/html/README.html @@ -1,447 +1,335 @@ - - - -README.md - - - - - - - - - - - - -Oasis LMF logo -

ktools version -Travis (.com) branch -Build Status

+ + + + + + + README + + + +

Oasis LMF logo

+

ktools

This is the POSIX-compliant Oasis LMF In-Memory Kernel toolkit.

Release

-

Please click here to download the latest release.

-

The source code will change on a regular basis but only the releases are supported. Support enquiries should be sent to support@oasislmf.org.

+

Please click here to download +the latest release.

+

The source code will change on a regular basis but only the releases +are supported. Support enquiries should be sent to support@oasislmf.org.

There are build instructions for Windows 64-bit executables.

-

Note that the dynamic random number option in the Windows build uses a deterministic seed due to a bug in the mingw compiler. We recommend the random number file option (gulcalc -r) should be used in Windows.

-

This issue will be handled in future releases by implementing the rdrand random number generator in all environments.

+

Note that the dynamic random number option in the Windows build uses +a deterministic seed due to a bug in the mingw compiler. We recommend +the random number file option (gulcalc -r) should be used in +Windows.

+

This issue will be handled in future releases by implementing the +rdrand random number generator in all environments.

Linux Installation

Pre-requisites

-

The g++ compiler build-essential, libtool, zlib1g-dev, autoconf, pkg-config on debian distros or 'Development Tools' and zlib-devel on red hat needs to be installed in Linux.

-

To enable Parquet format outputs (optional), version 7.0.0 of the Arrow Apache library is required. The recommended method is to build the library from source as follows;

-
$ mkdir build +

The g++ compiler plus build-essential, libtool, zlib1g-dev, autoconf and pkg-config (on Debian distros), or 'Development Tools' and zlib-devel (on Red Hat), need to be installed in Linux.

+

To enable Parquet format outputs (optional), version 7.0.0 of the +Arrow Apache library is required. The recommended method is to build the +library from source as follows;

+
$ mkdir build
 $ cd build
 $ git clone https://github.com/apache/arrow.git -b release-7.0.0
 $ mkdir -p arrow/cpp/build-release
 $ cd build/arrow/cpp/build-release
 $ cmake -DARROW_PARQUET=ON -DARROW_BUILD_STATIC=ON -DARROW_OPTIONAL_INSTALL=ON ..
 $ make -j$(nproc)
-$ make install
-
-

More information on Arrow Apache.

+$ make install
+

More +information on Arrow Apache.

Instructions

Copy ktools-[version].tar.gz onto your machine and untar.

-
$ tar -xvf ktools-[version].tar.gz -
-

Go into the ktools folder and autogen using the following command;

-
$ cd ktools-[version] -$ ./autogen.sh -
+
$ tar -xvf ktools-[version].tar.gz
+

Go into the ktools folder and autogen using the following +command;

+
$ cd ktools-[version]
+$ ./autogen.sh

Configure using the following command;

-
$ ./configure -
-

The configure script will attempt to find and link the appropriate Apache libraries to enable Parquet format output. This search for these libraries can be disabled manually with an extra flag:

-
$ ./configure --disable-parquet -
+
$ ./configure
+

The configure script will attempt to find and link the appropriate Apache Arrow libraries to enable Parquet format output. The search for these libraries can be disabled with an extra flag:

+
$ ./configure --disable-parquet

On OS X add an extra flag:

-
$ ./configure --enable-osx -
+
$ ./configure --enable-osx

Make using the following command;

-
$ make -
-

Next run the automated test to check the build and numerical results;

-
$ make check -
+
$ make
+

Next run the automated test to check the build and numerical +results;

+
$ make check

Finally, install the executables using the following command;

-
$ [sudo] make install -
-

The installation is complete. The executables are located in /usr/local/bin.

-

If installing the latest code from the git repository, clone the ktools repository onto your machine.

-

Go into the ktools folder and autogen using the following command;

-
$ cd ktools -$ ./autogen.sh -
+
$ [sudo] make install
+

The installation is complete. The executables are located in +/usr/local/bin.

+

If installing the latest code from the git repository, clone the +ktools repository onto your machine.

+

Go into the ktools folder and autogen using the following +command;

+
$ cd ktools
+$ ./autogen.sh

Follow the rest of the process as described above.

Cmake build - Experimental

-

Install Cmake from either system packages or cmake.org.

-
# create the build directory within ktools directory -$ mkdir build && cd build -$ ktools_source_dir=~/ktools -# Generate files and specify destination (here is in ./local/bin) -$ cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=~/.local $ktools_source_dir - -# Build -$ make all test - -# If all is OK, install to bin subdir of the specified install prefix -$ make install -
+

Install Cmake from either system packages or cmake.org.

+
# create the build directory within ktools directory
+$ mkdir build && cd build
+$ ktools_source_dir=~/ktools
+# Generate files and specify destination (here is in ./local/bin)
+$ cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=~/.local $ktools_source_dir
+
+# Build
+$ make all test
+
+# If all is OK, install to bin subdir of the specified install prefix
+$ make install

Windows Installation

-

Pre-requisites

-

MSYS2 64-bit is required for the Windows native build. MSYS2 is a Unix/Linux like development environment for building and distributing windows applications. -https://www.msys2.org/

+

Pre-requisites

+

MSYS2 64-bit is required for the Windows native build. MSYS2 is a Unix/Linux-like development environment for building and distributing Windows applications. https://www.msys2.org/

Download and run the set-up program for MSYS2 64-bit.

Open a MSYS2 terminal and perform the updates before continuing.

The following add-in packages are required;

@@ -454,44 +342,52 @@

Pre-requisites

  • mingw-w64-x86_64-toolchain
  • python
  • -

    These packages can be installed at the MSYS2 terminal command line.

    -
    $ pacman -S autoconf automake git libtool make mingw-w64-x86_64-toolchain python -
    -

    alt text

    -

    Instructions

    +

    These packages can be installed at the MSYS2 terminal command +line.

    +
    $ pacman -S autoconf automake git libtool make mingw-w64-x86_64-toolchain python
    +

    +

    Instructions

    Clone the github repository at the MSYS2 terminal command line

    -
    $ git clone https://github.com/OasisLMF/ktools.git -
    -

    Go into the ktools folder and run autogen using the following command;

    -
    $ cd ktools -$ ./autogen.sh -
    +
    $ git clone https://github.com/OasisLMF/ktools.git
    +

    Go into the ktools folder and run autogen using the following +command;

    +
    $ cd ktools
    +$ ./autogen.sh

    Configure using the following command;

    -
    $ ./configure -
    +
    $ ./configure

    Make using the following command;

    -
    $ make -
    -

    Next run the automated test to check the build and numerical results;

    -
    $ make check -
    +
    $ make
    +

    Next run the automated test to check the build and numerical +results;

    +
    $ make check

    Finally, install the executables using the following command;

    -
    $ make install -
    -

    The installation is complete. The executables are located in /usr/local/bin.

    +
    $ make install
    +

    The installation is complete. The executables are located in +/usr/local/bin.

    Usage

    -

    There is sample data and six example scripts which demonstrate how to invoke ktools in the /examples folder. These are written in python v2.

    -

    For example, to run the eltcalc_example script, go into the examples folder and run the following command (you must have python installed):

    -
    $ cd examples -$ python eltcalc_example.py -
    +

There are sample data and six example scripts in the /examples folder which demonstrate how to invoke ktools. These are written in Python v2.

    +

    For example, to run the eltcalc_example script, go into the examples +folder and run the following command (you must have python +installed):

    +
    $ cd examples
    +$ python eltcalc_example.py

To build the Linux Docker image, run the following command:

    -
    docker build --file Dockerfile.ktools.alpine -t alpine-ktools . -
    +
    docker build --file Dockerfile.ktools.alpine -t alpine-ktools .

    Questions/problems?

    -

    Email support@oasislmf.org

    +

    Email support@oasislmf.org

    License

    The code in this project is licensed under BSD 3-clause license.

    - - - + + diff --git a/docs/html/RandomNumbers.html b/docs/html/RandomNumbers.html index 1cff1686..829fee60 100644 --- a/docs/html/RandomNumbers.html +++ b/docs/html/RandomNumbers.html @@ -1,431 +1,276 @@ - - - -RandomNumbers.md - - - - - - - - - - - - -

    alt text

    -

    Appendix A: Random numbers

    -

    Simple uniform random numbers are assigned to each event, group and sample number to sample ground up loss in the gulcalc process. A group is a collection of items which share the same group_id, and is the method of supporting spatial correlation in ground up loss sampling in Oasis and ktools.

    + + + + + + + RandomNumbers + + + +

    alt text

    +

    Appendix A: Random numbers +

    +

    Simple uniform random numbers are assigned to each event, group and +sample number to sample ground up loss in the gulcalc process. A group +is a collection of items which share the same group_id, and is the +method of supporting spatial correlation in ground up loss sampling in +Oasis and ktools.

    Correlation

    -

    Items (typically representing, in insurance terms, the underlying risk coverages) that are assigned the same group_id will use the same random number to sample damage for a given event and sample number. Items with different group_ids will be assigned independent random numbers. Therefore sampled damage is fully correlated within groups and fully independent between groups, where group is an abstract collection of items defined by the user.

    -

    The item_id, group_id data is provided by the user in the items input file (items.bin).

    +

    Items (typically representing, in insurance terms, the underlying +risk coverages) that are assigned the same group_id will use the same +random number to sample damage for a given event and sample number. +Items with different group_ids will be assigned independent random +numbers. Therefore sampled damage is fully correlated within groups and +fully independent between groups, where group is an abstract collection +of items defined by the user.

    +

    The item_id, group_id data is provided by the user in the items input +file (items.bin).

    Methodology

    -

    The method of assigning random numbers in gulcalc uses an random number index (ridx), an integer which is used as a position reference into a list of random numbers. S random numbers corresponding to the runtime number of samples are drawn from the list starting at the ridx position.

    -

    There are three options in ktools for choosing random numbers to apply in the sampling process.

    -

    1. Generate dynamically during the calculation

    +

The method of assigning random numbers in gulcalc uses a random number index (ridx), an integer which is used as a position reference into a list of random numbers. S random numbers, corresponding to the runtime number of samples, are drawn from the list starting at the ridx position.

    +

    There are three options in ktools for choosing random numbers to +apply in the sampling process.

    +

    1. Generate +dynamically during the calculation

    Usage
    -

    Use -R{number of random numbers} as a parameter. Optionally you may use -s{seed} to make the random numbers repeatable.

    +

    Use -R{number of random numbers} as a parameter. Optionally you may +use -s{seed} to make the random numbers repeatable.

    Example
    -
    $ gulcalc -S00 -R1000000 -i - -
    -

    This will run 100 samples drawing from 1 million dynamically generated random numbers. They are simple uniform random numbers.

    -
    $ gulcalc -S00 -s123 -R1000000 -i - -
    -

    This will run 100 samples drawing from 1 million seeded random numbers (repeatable)

    +
$ gulcalc -S100 -R1000000 -i -
    +

    This will run 100 samples drawing from 1 million dynamically +generated random numbers. They are simple uniform random numbers.

    +
$ gulcalc -S100 -s123 -R1000000 -i -
    +

    This will run 100 samples drawing from 1 million seeded random +numbers (repeatable)

    Method
    -

    Random numbers are sampled dynamically using the Mersenne twister psuedo random number generator (the default RNG of the C++ v11 compiler). -A sparse array capable of holding R random numbers is allocated to each event. The ridx is generated from the group_id and number of samples S using the following modulus function;

    +

Random numbers are sampled dynamically using the Mersenne Twister pseudo-random number generator (the default RNG in C++11). A sparse array capable of holding R random numbers is allocated to each event. The ridx is generated from the group_id and number of samples S using the following modulus function;

    ridx= mod(group_id x P1, R)

    -

    This formula pseudo-randomly assigns ridx indexes to each group_id between 0 and 999,999.

    -

    As a ridx is sampled, the section in the array starting at the ridx position of length S is populated with random numbers unless they have already been populated, in which case the existing random numbers are re-used.

    -

    The array is cleared for the next event and a new set of random numbers is generated.

    -

    2. Use numbers from random number file

    -
    Usage
    +

    This formula pseudo-randomly assigns ridx indexes to each group_id +between 0 and 999,999.

    +

    As a ridx is sampled, the section in the array starting at the ridx +position of length S is populated with random numbers unless they have +already been populated, in which case the existing random numbers are +re-used.

    +

    The array is cleared for the next event and a new set of random +numbers is generated.
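A hedged sketch of this buffering scheme is given below; P1 is a placeholder for the (unspecified) multiplier used by gulcalc, and Python's random module is used simply because it also implements the Mersenne Twister.

import random

P1 = 2654435761            # placeholder constant, not the value used by gulcalc
R, S = 1000000, 100        # buffer size and number of samples

def samples_for_group(group_id, buffer, rng):
    ridx = (group_id * P1) % R
    for i in range(ridx, ridx + S):
        if buffer.get(i) is None:              # populate only if not already drawn
            buffer[i] = rng.random()
    return [buffer[i] for i in range(ridx, ridx + S)]

rng = random.Random(123)
event_buffer = {}                              # cleared for each new event
u1 = samples_for_group(4001, event_buffer, rng)
u2 = samples_for_group(4001, event_buffer, rng)
assert u1 == u2                                # the same group re-uses the same draws
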

    +

    2. Use numbers from +random number file

    +
    Usage

    Use -r as a parameter

    -
    Example
    -
    $ gulcalc -S100 -r -i - -
    -

    This will run 100 samples using random numbers from file random.bin in the static sub-directory.

    -
    Method
    -

    The random number file(s) is read into memory at the start of the gulcalc process.

    -

    The ridx is generated from the sample index (sidx), event_id and group_id using the following modulus function;

    +
    Example
    +
    $ gulcalc -S100 -r -i -
    +

    This will run 100 samples using random numbers from file random.bin +in the static sub-directory.

    +
    Method
    +

    The random number file(s) is read into memory at the start of the +gulcalc process.

    +

    The ridx is generated from the sample index (sidx), event_id and +group_id using the following modulus function;

    ridx= sidx + mod(group_id x P1 x P3 + event_id x P2, R)

    -

    This formula pseudo-randomly assigns a starting position index to each event_id and group_id combo between 0 and R-1, and then S random numbers are drawn by incrementing the starting position by the sidx.

    -

    3. Generate automatically seeded random numbers (no buffer)

    -
    Usage
    +

    This formula pseudo-randomly assigns a starting position index to +each event_id and group_id combo between 0 and R-1, and then S random +numbers are drawn by incrementing the starting position by the sidx.

    +

    3. +Generate automatically seeded random numbers (no buffer)

    +
    Usage

    Default option

    -
    Example
    -
    $ gulcalc -S100 -i - -
    -

    This option will produce repeatable random numbers seeded from a combination of the event_id and group_id. The difference between this option and method 1 with the fixed seed is that there is no limit on the number of random numbers generated, and you do not need to make a decision on the buffer size. This will impact performance for large analyses.

    -
    Method
    -

    For each event_id and group_id, the seed is calculated as follows;

    -

    s1 = mod(group_id * 1543270363, 2147483648);
    -s2 = mod(event_id * 1943272559, 2147483648); -seed = mod(s1 + s2 , 2147483648)

    +
    Example
    +
    $ gulcalc -S100 -i -
    +

This option will produce repeatable random numbers seeded from a combination of the event_id and group_id. The difference between this option and method 1 with the fixed seed is that there is no limit on the number of random numbers generated, and you do not need to make a decision on the buffer size. Note that this option can be slower for large analyses.

    +
    Method
    +

    For each event_id and group_id, the seed is calculated as +follows;

    +

s1 = mod(group_id * 1543270363, 2147483648);
s2 = mod(event_id * 1943272559, 2147483648);
seed = mod(s1 + s2, 2147483648)

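A minimal sketch of this seeding rule follows; the constants are taken from the formula above, while the RNG seeded here is illustrative rather than the one gulcalc uses internally.

import random

def auto_seed(event_id, group_id):
    s1 = (group_id * 1543270363) % 2147483648
    s2 = (event_id * 1943272559) % 2147483648
    return (s1 + s2) % 2147483648

def samples(event_id, group_id, S):
    rng = random.Random(auto_seed(event_id, group_id))   # repeatable per event/group
    return [rng.random() for _ in range(S)]

assert samples(4545, 1001, 100) == samples(4545, 1001, 100)   # same seed, same draws
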
    Return to top

    -

    Go to Appendix B FM Profiles

    -

    Back to Contents

    - - - +

    Go to Appendix B FM Profiles

    +

    Back to Contents

    + + diff --git a/docs/html/ReferenceModelOverview.html b/docs/html/ReferenceModelOverview.html index 6e179f2f..18d655c3 100644 --- a/docs/html/ReferenceModelOverview.html +++ b/docs/html/ReferenceModelOverview.html @@ -1,415 +1,298 @@ - - - -ReferenceModelOverview.md - - - - - - - - - - - - -

    alt text

    -

    4. Reference Model Overview

    -

    This section provides an overview of the reference model, which is an implementation of each of the components in the framework.

    -

    There are five sub-sections which cover the usage and internal processes of each of the reference components;

    + + + + + + + ReferenceModelOverview + + + +

    alt text

    +

    4. Reference Model Overview +

    +

    This section provides an overview of the reference model, which is an +implementation of each of the components in the framework.

    +

    There are five sub-sections which cover the usage and internal +processes of each of the reference components;

    -

    The set of core components provided in this release is as follows;

    +

    The set of core +components provided in this release is as follows;

    -

    The standard input and standard output data streams for the core components are covered in the Specification.

    -

    Figure 1 shows the core components workflow and the required data input files.

    -
    Figure 1. Core components workflow and required data
    -

    alt text

    -

    The model static data for the core workflow, shown as red source files, are the event footprint, vulnerability, damage bin dictionary and random number file. These are stored in the 'static' sub-directory of the working folder.

    -

    The user / analysis input data for the core workflow, shown as blue source files, are the events, items, coverages, fm programme, fm policytc, fm profile, fm xref, fm summary xref and gul summary xref files. These are stored in the 'input' sub-directory of the working folder.

    -

    These are all Oasis kernel format data objects with prescribed formats. Note that the events are a user input rather than a static input because the user could choose to run a subset of the full list of events, or even just one event. Usually, though, the whole event set will be run.

    -

    The output components are various implementations of outputcalc, as described in general terms in the Specification. The results are written directly into csv file as there is no downstream processing.

    +

    The standard input and standard output data streams for the core +components are covered in the Specification.

    +

    Figure 1 shows the core components workflow and the required data +input files.

    +
    Figure 1. +Core components workflow and required data
    +

    alt text

    +

    The model static data for the core workflow, shown +as red source files, are the event footprint, vulnerability, damage bin +dictionary and random number file. These are stored in the +'static' sub-directory of the working folder.

    +

    The user / analysis input data for the core +workflow, shown as blue source files, are the events, items, coverages, +fm programme, fm policytc, fm profile, fm xref, fm summary xref and gul +summary xref files. These are stored in the 'input' +sub-directory of the working folder.

    +

    These are all Oasis kernel format data objects with prescribed +formats. Note that the events are a user input rather than a static +input because the user could choose to run a subset of the full list of +events, or even just one event. Usually, though, the whole event set +will be run.

    +

The output components are various implementations of outputcalc, as described in general terms in the Specification. The results are written directly to a csv file as there is no downstream processing.

    -

    The files required for the output components are shown in Figure 2.

    -
    Figure 2. Output workflows and required data
    -

    alt text

    -

    The data conversion components section covers the formats of all of the required data files and explains how to convert data in csv format into binary format, and vice versa.

    -

    The stream conversion components section explains how to convert the binary data stream output to csv, plus how to convert gulcalc data in csv format into binary format. These components are useful when working with individual components at a more detailed level.

    -

    The validation components section explains how to use the validation components to check the validity of the static and input files in csv format, before they are converted to binary format. There are both validation checks on individual files and cross checks for consistency across files.

    -

    The version of the installed components can be found by using the command line parameter -v. For example;

    -
    $ gulcalc -v -gulcalc : version: 3.0.7 -
    +

    The files required for the output components are shown in Figure +2.

    +
    Figure 2. Output +workflows and required data
    +

    alt text

    +

    The data conversion +components section covers the formats of all of the +required data files and explains how to convert data in csv format into +binary format, and vice versa.

    +

    The stream conversion +components section explains how to convert the binary data +stream output to csv, plus how to convert gulcalc data in csv format +into binary format. These components are useful when working with +individual components at a more detailed level.

    +

    The validation +components section explains how to use the validation +components to check the validity of the static and input files in csv +format, before they are converted to binary format. There are both +validation checks on individual files and cross checks for consistency +across files.

    +

    The version of the installed components can be found by using the +command line parameter -v. For example;

    +
    $ gulcalc -v
    +gulcalc : version: 3.0.7

    Component usage guidance is available using the parameter -h

    -
    $ fmcalc -h +
    $ fmcalc -h
     -a set allocrule (default none)
     -M max level (optional)
     -p inputpath (relative or full path)
    @@ -417,12 +300,12 @@ 
    Figure 2. Output workflows -O Alloc rule2 optimization off -d debug -v version --h help -
    -

    The components have additional command line parameters depending on their particular function. These are described in detail in the following pages.

    +-h help
    +

    The components have additional command line parameters depending on +their particular function. These are described in detail in the +following pages.

    Return to top

    -

    Go to 4.1 Core Components section

    -

    Back to Contents

    - - - +

    Go to 4.1 Core Components section

    +

    Back to Contents

    + + diff --git a/docs/html/Specification.html b/docs/html/Specification.html index 8b312e52..688daebe 100644 --- a/docs/html/Specification.html +++ b/docs/html/Specification.html @@ -1,374 +1,180 @@ - - - -Specification.md - - - - - - - - - - - - -

    alt text

    -

    3. Specification

    + + + + + + + Specification + + + +

    alt text

    +

    3. Specification +

    Introduction

    -

    This section specifies the data stream structures and core components in the in-memory kernel.

    +

    This section specifies the data stream structures and core components +in the in-memory kernel.

    The data stream structures are;

    -

    The stream data structures have been designed to minimise the volume flowing through the pipeline, using data packet 'headers' to remove redundant data. For example, indexes which are common to a block of data are defined as a header record and then only the variable data records that are relevant to the header key are part of the data stream. The names of the data fields given below are unimportant, only their position in the data stream is important in order to perform the calculations defined in the program.

    +

    The stream data structures have been designed to minimise the volume +flowing through the pipeline, using data packet 'headers' to remove +redundant data. For example, indexes which are common to a block of data +are defined as a header record and then only the variable data records +that are relevant to the header key are part of the data stream. The +names of the data fields given below are unimportant, only their +position in the data stream is important in order to perform the +calculations defined in the program.

    The components are;

    -

    The components have a standard input (stdin) and/or output (stdout) data stream structure. eve is the stream-initiating component which only has a standard output stream, whereas "outputcalc" (a generic name representing an extendible family of output calculation components) is a stream-terminating component with only a standard input stream.

    -

    An implementation of each of the above components is provided in the Reference Model, where usage instructions and command line parameters are provided. A functional overview is given below.

    +

    The components have a standard input (stdin) and/or output (stdout) +data stream structure. eve is the stream-initiating component which only +has a standard output stream, whereas "outputcalc" (a generic name +representing an extendible family of output calculation components) is a +stream-terminating component with only a standard input stream.

    +

    An implementation of each of the above components is provided in the +Reference Model, where usage +instructions and command line parameters are provided. A functional +overview is given below.

    Stream types

    -

    The architecture supports multiple stream types. Therefore a developer can define a new type of data stream within the framework by specifying a unique stream_id of the stdout of one or more of the components, or even write a new component which performs an intermediate calculation between the existing components.

    -

    The stream_id is the first 4 byte header of the stdout streams. The higher byte is reserved to identify the type of stream, and the 2nd to 4th bytes hold the identifier of the stream. This is used for validation of pipeline commands to report errors if the components are not being used in the correct order.

    +

    The architecture supports multiple stream types. Therefore a +developer can define a new type of data stream within the framework by +specifying a unique stream_id of the stdout of one or more of the +components, or even write a new component which performs an intermediate +calculation between the existing components.

    +

    The stream_id is the first 4 byte header of the stdout streams. The +higher byte is reserved to identify the type of stream, and the 2nd to +4th bytes hold the identifier of the stream. This is used for validation +of pipeline commands to report errors if the components are not being +used in the correct order.
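As an illustration of this header encoding, the following Python sketch (an illustration only, not part of the reference model; little-endian byte order is assumed) reads the first 4 bytes of a stream from standard input and splits them into the stream type and identifier:

import struct
import sys

# Read the 4-byte stream_id header and split it into the stream type
# (higher byte) and the stream identifier (lower three bytes).
(stream_id,) = struct.unpack("<i", sys.stdin.buffer.read(4))
stream_type = (stream_id >> 24) & 0xff   # e.g. 2 for a loss stream
stream_no = stream_id & 0xffffff         # e.g. 1
print(stream_type, stream_no)

Saved as, say, read_header.py (a hypothetical name), piping a getmodel output stream into it would print 0 1, matching the reserved values listed below.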

    The current reserved values are as follows;

    Higher byte;

    - - + + - - + + - - + + - - + + - - + +
    Byte 1Stream nameByte 1Stream name
    0cdf0cdf
    1gulcalc (deprecated)1gulcalc (deprecated)
    2loss2loss
    3summary3summary

    Reserved stream_ids;

    +++++ - + - + - + - + - + - + - + - + - + - + - + - +
    Byte 1Byte 1 Bytes 2-4DescriptionDescription
    00 1cdf - Oasis format effective damageability CDF outputcdf - Oasis format effective damageability +CDF output
    11 1gulcalc - Oasis format item level ground up loss sample output (deprecated)gulcalc - Oasis format item level ground +up loss sample output (deprecated)
    11 2gulcalc - Oasis format coverage level ground up loss sample output (deprecated)gulcalc - Oasis format coverage level +ground up loss sample output (deprecated)
    22 1loss - Oasis format loss sample output (any loss perspective)loss - Oasis format loss sample output +(any loss perspective)
    33 1summary - Oasis format summary level loss sample outputsummary - Oasis format summary level loss +sample output
    -

    The supported standard input and output streams of the reference model components are summarized here;

    +

    The supported standard input and output streams of the reference +model components are summarized here;

    ++++++ - - - - + + + + - - - - + + + + - - - - + + + + - - - - + + + + - - - - + + + + - - - - + + + +
    ComponentStandard inputStandard outputStream option parametersComponentStandard inputStandard outputStream option parameters
    getmodelnone0/1 cdfnonegetmodelnone0/1 cdfnone
    gulcalc0/1 cdf2/1 loss-i -a{}gulcalc0/1 cdf2/1 loss-i -a{}
    fmcalc2/1 loss2/1 lossnonefmcalc2/1 loss2/1 lossnone
    summarycalc2/1 loss3/1 summary-i input from gulcalc, -f input from fmcalcsummarycalc2/1 loss3/1 summary-i input from gulcalc, -f input from +fmcalc
    outputcalc3/1 summarynonenoneoutputcalc3/1 summarynonenone
    @@ -504,92 +350,116 @@

    Stream structure

    cdf stream

    Stream header packet structure

    +++++++ - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    stream_idstream_id int 1/3Identifier of the data stream type.0/1Identifier of the data stream type.0/1

    Data header packet structure

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    areaperil_idareaperil_id int 4Oasis areaperil_id345456Oasis areaperil_id345456
    vulnerability_idvulnerability_id int 4Oasis vulnerability_id345Oasis vulnerability_id345
    no_of_binsno_of_bins int 4Number of records (bins) in the data package20Number of records (bins) in the data +package20

    Data packet structure (record repeated no_of_bin times)

    +++++++ - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    prob_toprob_to float 4The cumulative probability at the upper damage bin threshold0.765The cumulative probability at the upper +damage bin threshold0.765
    bin_meanbin_mean float 4The conditional mean of the damage bin0.45The conditional mean of the damage +bin0.45
    @@ -598,241 +468,299 @@
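Putting the three packet types above together, a minimal Python sketch of a cdf stream reader might look as follows (the file name getmodel.bin and little-endian byte order are assumptions; this is an illustration of the layout, not part of the toolkit):

import struct

# Walk a cdf stream that has been written to file, e.g.
#   eve 1 1 | getmodel > getmodel.bin
with open("getmodel.bin", "rb") as f:
    (stream_id,) = struct.unpack("<i", f.read(4))            # stream header packet
    while True:
        header = f.read(16)                                   # data header packet
        if len(header) < 16:
            break
        event_id, areaperil_id, vulnerability_id, no_of_bins = struct.unpack("<4i", header)
        for _ in range(no_of_bins):                           # data packet records
            prob_to, bin_mean = struct.unpack("<2f", f.read(8))
            print(event_id, areaperil_id, vulnerability_id, prob_to, bin_mean)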

    cdf stream

    loss stream

    Stream header packet structure

    +++++++ - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    stream_idstream_id int 1/3Identifier of the data stream type.2/1Identifier of the data stream type.2/1
    no_of_samplesno_of_samples int 4Number of samples100Number of samples100

    Data header packet structure

    - +
    +++++++ - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    item_id /output_iditem_id /output_id int 4Oasis item_id (gulcalc) or output_id (fmcalc)300Oasis item_id (gulcalc) or output_id +(fmcalc)300

    Data packet structure

    +++++++ - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The loss for the sample5625.675The loss for the sample5625.675
    -

    The data packet may be a variable length and so a sidx of 0 identifies the end of the data packet.

    +

    The data packet may be a variable length and so a sidx of 0 +identifies the end of the data packet.

    There are five values of sidx with special meaning as follows;

    +++++ - - + + - - + + - - + + - - + + - - + + - - + +
    sidxMeaningsidxMeaning Required / optional
    -5maximum loss-5maximum loss optional
    -4chance of loss-4chance of loss optional
    -3impacted exposure-3impacted exposure required
    -2numerical integration standard deviation loss-2numerical integration standard deviation +loss optional
    -1numerical integration mean loss-1numerical integration mean loss required
    -

    sidx -5 to -1 must come at the beginning of the data packet before the other samples in ascending order (-5 to -1).

    +

    sidx -5 to -1 must come at the beginning of the data packet before +the other samples in ascending order (-5 to -1).
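To make the loss stream layout and the special sample indices concrete, here is a minimal Python sketch of a reader (the file name gulcalci.bin and little-endian byte order are assumptions; this is an illustration, not part of the toolkit):

import struct

SPECIAL_SIDX = {-1: "mean", -2: "standard deviation", -3: "impacted exposure",
                -4: "chance of loss", -5: "maximum loss"}

# Walk a loss stream that has been written to file, e.g.
#   eve 1 1 | getmodel | gulcalc -r -S100 -i gulcalci.bin
with open("gulcalci.bin", "rb") as f:
    stream_id, no_of_samples = struct.unpack("<2i", f.read(8))   # stream header packet
    while True:
        header = f.read(8)                                        # data header packet
        if len(header) < 8:
            break
        event_id, item_id = struct.unpack("<2i", header)
        while True:
            sidx, loss = struct.unpack("<if", f.read(8))          # data packet record
            if sidx == 0:                                         # sidx 0 terminates the packet
                break
            label = SPECIAL_SIDX.get(sidx, "sample")
            print(event_id, item_id, sidx, label, loss)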

    Return to top

    -

    summary stream

    +

    summary stream

    Stream header packet structure

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    stream_idstream_id int 1/3Identifier of the data stream type.3/1Identifier of the data stream type.3/1
    no_of_samplesno_of_samples int 4Number of samples100Number of samples100
    summary_setsummary_set int 4Identifier of the summary set2Identifier of the summary set2

    Data header packet structure

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    summary_idsummary_id int 4Oasis summary_id300Oasis summary_id300
    exposure_valueexposure_value float 4Impacted exposure (sum of sidx -3 losses for summary_id)987878Impacted exposure (sum of sidx -3 losses +for summary_id)987878

    Data packet structure

    +++++++ - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The loss for the sample5625.675The loss for the sample5625.675
    -

    The data packet may be a variable length and so a sidx of 0 identifies the end of the data packet.

    +

    The data packet may be a variable length and so a sidx of 0 +identifies the end of the data packet.

    The sidx -1 mean loss may be present (if non-zero)

    +++++ - - + + - - + + @@ -841,24 +769,52 @@

    summary stream Components

    eve

    -

    eve is an 'event emitter' and its job is to read a list of events from file and send out a subset of events as a binary data stream. It has no standard input and emits a list of event_ids, which are 4 byte integers.

    -

    eve is used to partition lists of events such that a workflow can be distributed across multiple processes.

    +

    eve is an 'event emitter' and its job is to read a list of events +from file and send out a subset of events as a binary data stream. It +has no standard input and emits a list of event_ids, which are 4 byte +integers.

    +

    eve is used to partition lists of events such that a workflow can be +distributed across multiple processes.

    getmodel

    -

    getmodel is the component which generates a stream of effective damageability cdfs for a given set of event_ids and the impacted exposed items on the basis of their areaperil_ids (location) and vulnerability_ids (damage function).

    +

    getmodel is the component which generates a stream of effective +damageability cdfs for a given set of event_ids and the impacted exposed +items on the basis of their areaperil_ids (location) and +vulnerability_ids (damage function).

    gulcalc

    -

    gulcalc is the component which calculates ground up loss. It takes the getmodel output as standard input and based on the sampling parameters specified, performs Monte Carlo sampling and numerical integration. The output is a stream of ground up loss samples in Oasis kernel format with random samples identified by positive sample indexes (sidx 1 and greater), and special meaning samples assigned to negative sample indexes.

    -

    gulcalc also supports the combining and back-allocation of losses arising from multiple subperils impacting the same coverage with some options.

    +

    gulcalc is the component which calculates ground up loss. It takes +the getmodel output as standard input and based on the sampling +parameters specified, performs Monte Carlo sampling and numerical +integration. The output is a stream of ground up loss samples in Oasis +kernel format with random samples identified by positive sample indexes +(sidx 1 and greater), and special meaning samples assigned to negative +sample indexes.

    +

    gulcalc also supports the combining and back-allocation of losses +arising from multiple subperils impacting the same coverage with some +options.

    fmcalc

    -

    fmcalc is the component which takes the loss stream as standard input and output and applies the policy terms and conditions to produce insured loss samples. fmcalc can be called recursively to perform multiple sequential applications of financial terms (e.g for inuring reinsurance following direct insurance). The output is a table of loss samples in Oasis kernel format, including the (re)insured loss for the numerical integration mean (sidx=-1), and the impacted exposure (sidx=-3).

    +

    fmcalc is the component which takes the loss stream as standard input +and output and applies the policy terms and conditions to produce +insured loss samples. fmcalc can be called recursively to perform +multiple sequential applications of financial terms (e.g for inuring +reinsurance following direct insurance). The output is a table of loss +samples in Oasis kernel format, including the (re)insured loss for the +numerical integration mean (sidx=-1), and the impacted exposure +(sidx=-3).

    summarycalc

    -

    summarycalc is a component which sums the sampled losses from either gulcalc or fmcalc to the users required level(s) for reporting results. This is a simple sum of the loss value by event_id, sidx and summary_id, where summary_id is a grouping of coverage_id or item_id for gulcalc or output_id for fmcalc defined in the user's input files.

    +

summarycalc is a component which sums the sampled losses from either gulcalc or fmcalc to the user's required level(s) for reporting results. This is a simple sum of the loss value by event_id, sidx and summary_id, where summary_id is a grouping of coverage_id or item_id for gulcalc or output_id for fmcalc defined in the user's input files.
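As a sketch of that aggregation (Python; the summary_id mapping below is hypothetical and would normally come from the user's input files):

from collections import defaultdict

# Hypothetical mapping of item_id (gulcalc) or output_id (fmcalc) to summary_id.
summary_id_of = {1: 1, 2: 1, 3: 2}

# Loss records as (event_id, item_id, sidx, loss), e.g. parsed from a loss stream.
records = [(4545, 1, 10, 100.0), (4545, 2, 10, 50.0), (4545, 3, 10, 25.0)]

summed = defaultdict(float)
for event_id, item_id, sidx, loss in records:
    summed[(event_id, summary_id_of[item_id], sidx)] += loss

for (event_id, summary_id, sidx), loss in sorted(summed.items()):
    print(event_id, summary_id, sidx, loss)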

    outputcalc

    -

    Outputcalc is a general term for an end-of-pipeline component which represents one of a potentially unlimited set of output components. Some examples are provided in the Reference Model. These are;

    +

    Outputcalc is a general term for an end-of-pipeline component which +represents one of a potentially unlimited set of output components. Some +examples are provided in the Reference Model. These are;

    -

    The output components generate results such as an event loss table or loss exceedance curve from the sampled output from summarycalc. The output is a results table in csv format or parquet format.

    +

    The output components generate results such as an event loss table or +loss exceedance curve from the sampled output from summarycalc. The +output is a results table in csv format or parquet format.

    Return to top

    -

    Go to 4. Reference model

    -

    Back to Contents

    - - - +

    Go to 4. Reference model

    +

    Back to Contents

    + + diff --git a/docs/html/StreamConversionComponents.html b/docs/html/StreamConversionComponents.html index b4ce07fe..ff675c32 100644 --- a/docs/html/StreamConversionComponents.html +++ b/docs/html/StreamConversionComponents.html @@ -1,835 +1,707 @@ - - - -StreamConversionComponents.md - - - - - - - - - - - - -

    alt text

    -

    4.4 Stream conversion components

    -

    The following components convert the binary output of each calculation component to csv format;

    + + + + + + + StreamConversionComponents + + + +

    alt text

    +

    4.5 Stream conversion +components

    +

    The following components convert the binary output of each +calculation component to csv format;

    -

    Additionally, the following component is provided to convert csv data into binary format;

    +

    Additionally, the following component is provided to convert csv data +into binary format;

    -

    Figure 1 shows the workflows for the binary stream to csv conversions.

    -
    Figure 1. Stream conversion workflows
    -

    alt text

    +

    Figure 1 shows the workflows for the binary stream to csv +conversions.

    +
    Figure 1. Stream +conversion workflows
    +

    Figure 2 shows the workflows for the gultobin component.

    Figure 2. gultobin workflows
    -

    alt text

    +

    cdftocsv

    -
    -

    A component which converts the getmodel output stream, or binary file with the same structure, to a csv file.

    +
    +

    A component which converts the getmodel output stream, or binary file +with the same structure, to a csv file.

    Standard input stream
    sidxMeaningsidxMeaning Required / optional
    -1numerical integration mean loss-1numerical integration mean loss optional
    - + - + - + - +
    Byte 1Byte 1 Bytes 2-4DescriptionDescription
    00 1cdfcdf

    A binary file of the same format can be piped into cdftocsv.

    Usage
    -
    $ [stdin component] | cdftocsv > [output].csv -$ cdftocsv < [stdin].bin > [output].csv -
    +
    $ [stdin component] | cdftocsv > [output].csv
    +$ cdftocsv < [stdin].bin > [output].csv
    Example
    -
    $ eve 1 1 | getmodel | cdftocsv > cdf.csv -$ cdftocsv < getmodel.bin > cdf.csv -
    +
    $ eve 1 1 | getmodel | cdftocsv > cdf.csv
    +$ cdftocsv < getmodel.bin > cdf.csv 
    Output

    Csv file with the following fields;

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    areaperil_idareaperil_id int 4Oasis areaperil_id345456Oasis areaperil_id345456
    vulnerability_idvulnerability_id int 4Oasis vulnerability_id345Oasis vulnerability_id345
    bin_indexbin_index int 4Damage bin index20Damage bin index20
    prob_toprob_to float 4The cumulative probability at the upper damage bin threshold0.765The cumulative probability at the upper +damage bin threshold0.765
    bin_meanbin_mean float 4The conditional mean of the damage bin0.45The conditional mean of the damage +bin0.45

    Return to top

    gultocsv

    -
    -

    A component which converts the gulcalc item or coverage stream, or binary file with the same structure, to a csv file.

    -
    Standard input stream
    +
    +

    A component which converts the gulcalc item or coverage stream, or +binary file with the same structure, to a csv file.

    +
    Standard input stream
    - + - + - + - + - + - +
    Byte 1Byte 1 Bytes 2-4DescriptionDescription
    11 1gulcalc itemgulcalc item
    11 2gulcalc coveragegulcalc coverage

    A binary file of the same format can be piped into gultocsv.

    -
    Usage
    -
    $ [stdin component] | gultocsv > [output].csv -$ gultocsv < [stdin].bin > [output].csv -
    -
    Example
    -
    $ eve 1 1 | getmodel | gulcalc -r -S100 -c - | gultocsv > gulcalcc.csv -$ gultocsv < gulcalci.bin > gulcalci.csv -
    -
    Output
    +
    Usage
    +
    $ [stdin component] | gultocsv > [output].csv
    +$ gultocsv < [stdin].bin > [output].csv
    +
    Example
    +
    $ eve 1 1 | getmodel | gulcalc -r -S100 -c - | gultocsv > gulcalcc.csv
    +$ gultocsv < gulcalci.bin > gulcalci.csv 
    +
    Output

    Csv file with the following fields;

    gulcalc item stream 1/1

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    item_iditem_id int 4Oasis item_id300Oasis item_id300
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The ground up loss value5675.675The ground up loss value5675.675

    gulcalc coverage stream 1/2

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    coverage_idcoverage_id int 4Oasis coverage_id150Oasis coverage_id150
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The ground up loss value5675.675The ground up loss value5675.675

    Return to top

    fmtocsv

    -
    -

    A component which converts the fmcalc output stream, or binary file with the same structure, to a csv file.

    -
    Standard input stream
    +
    +

    A component which converts the fmcalc output stream, or binary file +with the same structure, to a csv file.

    +
    Standard input stream
    - + - + - + - +
    Byte 1Byte 1 Bytes 2-4DescriptionDescription
    22 1lossloss

    A binary file of the same format can be piped into fmtocsv.

    -
    Usage
    -
    $ [stdin component] | fmtocsv > [output].csv -$ fmtocsv < [stdin].bin > [output].csv -
    -
    Example
    -
    $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc | fmtocsv > fmcalc.csv -$ fmtocsv < fmcalc.bin > fmcalc.csv -
    -
    Output
    +
    Usage
    +
    $ [stdin component] | fmtocsv > [output].csv
    +$ fmtocsv < [stdin].bin > [output].csv
    +
    Example
    +
    $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc | fmtocsv > fmcalc.csv
    +$ fmtocsv < fmcalc.bin > fmcalc.csv 
    +
    Output

    Csv file with the following fields;

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    output_idoutput_id int 4Oasis output_id5Oasis output_id5
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The insured loss value5375.675The insured loss value5375.675

    Return to top

    summarycalctocsv

    -
    -

    A component which converts the summarycalc output stream, or binary file with the same structure, to a csv file.

    -
    Standard input stream
    +
    +

    A component which converts the summarycalc output stream, or binary +file with the same structure, to a csv file.

    +
    Standard input stream
    - + - + - + - +
    Byte 1Byte 1 Bytes 2-4DescriptionDescription
    33 1summarysummary
    -

    A binary file of the same format can be piped into summarycalctocsv.

    -
    Usage
    -
    $ [stdin component] | summarycalctocsv > [output].csv -$ summarycalctocsv < [stdin].bin > [output].csv -
    -
    Example
    -
    $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc | summarycalc -f -1 - | summarycalctocsv > summarycalc.csv -$ summarycalctocsv < summarycalc.bin > summarycalc.csv -
    -
    Output
    +

    A binary file of the same format can be piped into +summarycalctocsv.

    +
    Usage
    +
    $ [stdin component] | summarycalctocsv > [output].csv
    +$ summarycalctocsv < [stdin].bin > [output].csv
    +
    Example
    +
    $ eve 1 1 | getmodel | gulcalc -r -S100 -a1 -i - | fmcalc | summarycalc -f -1 - | summarycalctocsv > summarycalc.csv
    +$ summarycalctocsv < summarycalc.bin > summarycalc.csv 
    +
    Output

    Csv file with the following fields;

    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    summary_idsummary_id int 4Oasis summary_id3Oasis summary_id3
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The insured loss value5375.675The insured loss value5375.675

    Return to top

    gultobin

    -
    -

    A component which converts gulcalc data in csv format into gulcalc binary item stream (1/1).

    +
    +

    A component which converts gulcalc data in csv format into gulcalc +binary item stream (1/1).

    Input file format
    +++++++ - + - - + + - + - - + + - + - - + + - + - - + + - + - - + +
    NameName Type BytesDescriptionExampleDescriptionExample
    event_idevent_id int 4Oasis event_id4545Oasis event_id4545
    item_iditem_id int 4Oasis item_id300Oasis item_id300
    sidxsidx int 4Sample index10Sample index10
    lossloss float 4The ground up loss value5675.675The ground up loss value5675.675
    Parameters
    -

    -S, the number of samples must be provided. This can be equal to or greater than maximum sample index value that appears in the csv data. --t, the stream type of either 1 for the deprecated item stream or 2 for the loss stream. This is an optional parameter with default value 2.

    -
    Usage
    -
    $ gultobin [parameters] < [input].csv | [stdin component] -$ gultobin [parameters] < [input].csv > [output].bin -
    -
    Example
    -
    $ gultobin -S100 < gulcalci.csv | fmcalc > fmcalc.bin +

-S, the number of samples, must be provided. This can be equal to or greater than the maximum sample index value that appears in the csv data. -t, the stream type, is either 1 for the deprecated item stream or 2 for the loss stream. This is an optional parameter with default value 2.

    +
    Usage
    +
    $ gultobin [parameters] < [input].csv | [stdin component]
    +$ gultobin [parameters] < [input].csv > [output].bin
    +
    Example
    +
    $ gultobin -S100 < gulcalci.csv | fmcalc > fmcalc.bin
     $ gultobin -S100 < gulcalci.csv > gulcalci.bin
     $ gultobin -S100 -t1 < gulcalci.csv > gulcalci.bin
    -$ gultobin -S100 -t2 < gulcalci.csv > gulcalci.bin
    -
    +$ gultobin -S100 -t2 < gulcalci.csv > gulcalci.bin
    Standard output stream
    - + - + - + - + - + - +
    Byte 1Byte 1 Bytes 2-4DescriptionDescription
    11 1gulcalc itemgulcalc item
    22 1gulcalc lossgulcalc loss

    Return to top

    -

    Go to 4.5. Validation Components

    -

    Back to Contents

    - - - +

    Go to 4.6. Validation +Components

    +

    Back to Contents

    + + diff --git a/docs/html/ValidationComponents.html b/docs/html/ValidationComponents.html index 92a7f91b..86e80c07 100644 --- a/docs/html/ValidationComponents.html +++ b/docs/html/ValidationComponents.html @@ -1,383 +1,197 @@ - - - -ValidationComponents.md - - - - - - - - - - - - -

    alt text

    -

    Validation components

    + + + + + + + ValidationComponents + + + +

    alt text

    +

    4.6 Validation components +

    The following components run validity checks on csv format files:

    Model data files

    Oasis input files

    Model data files

    @@ -396,15 +210,15 @@

    validatedamagebin

  • Deprecated interval_type column included.
  • Interpolation lies within range but not in the bin centre.
  • -

    The checks can be performed on damage_bin_dict.csv from the command line:

    -
    $ validatedamagebin < damage_bin_dict.csv -
    -

    The checks are also performed by default when converting damage bin dictionary files from csv to binary format:

    -
    $ damagebintobin < damage_bin_dict.csv > damage_bin_dict.bin +

    The checks can be performed on damage_bin_dict.csv from +the command line:

    +
    $ validatedamagebin < damage_bin_dict.csv
    +

    The checks are also performed by default when converting damage bin +dictionary files from csv to binary format:

    +
    $ damagebintobin < damage_bin_dict.csv > damage_bin_dict.bin
     
    -# Suppress validation checks with -N argument
    -$ damagebintobin -N < damage_bin_dict.csv > damage_bin_dict.bin
    -
+# Suppress validation checks with -N argument +$ damagebintobin -N < damage_bin_dict.csv > damage_bin_dict.bin

    validatefootprint

    The following checks are performed on the event footprint:

    @@ -413,68 +227,91 @@

    validatefootprint

  • Total probability for each event-areaperil combination is 1.
  • Event IDs listed in ascending order.
  • For each event ID, areaperils IDs listed in ascending order.
  • -
  • No duplicate intensity bin IDs for each event-areaperil combination.
  • +
  • No duplicate intensity bin IDs for each event-areaperil +combination.
  • -

    Should all checks pass, the maximum value of intensity_bin_index is given, which is a required input for footprinttobin.

    -

    The checks can be performed on footprint.csv from the command line:

    -
    $ validatefootprint < footprint.csv -
    -

    The checks are also performed by default when converting footprint files from csv to binary format:

    -
    $ footprinttobin -i {number of intensity bins} < footprint.csv +

    Should all checks pass, the maximum value of +intensity_bin_index is given, which is a required input for +footprinttobin.

    +

    The checks can be performed on footprint.csv from the +command line:

    +
    $ validatefootprint < footprint.csv
    +

    The checks are also performed by default when converting footprint +files from csv to binary format:

    +
    $ footprinttobin -i {number of intensity bins} < footprint.csv
     
     # Suppress validation checks with -N argument
    -$ footprinttobin -i {number of intensity bins} -N < footprint.csv
    -
    +$ footprinttobin -i {number of intensity bins} -N < footprint.csv

    validatevulnerability

    The following checks are performed on the vulnerability data:

    -

    Should all checks pass, the maximum value of damage_bin_id is given, which is a required input for vulnerabilitytobin.

    -

    The checks can be performed on vulnerability.csv from the command line:

    -
    $ validatevulnerability < vulnerability.csv -
    -

    The checks are also performed by default when converting vulnerability files from csv to binary format:

    -
    $ vulnerabilitytobin -d {number of damage bins} < vulnerability.csv > vulnerability.bin +

    Should all checks pass, the maximum value of +damage_bin_id is given, which is a required input for +vulnerabilitytobin.

    +

    The checks can be performed on vulnerability.csv from +the command line:

    +
    $ validatevulnerability < vulnerability.csv
    +

    The checks are also performed by default when converting +vulnerability files from csv to binary format:

    +
    $ vulnerabilitytobin -d {number of damage bins} < vulnerability.csv > vulnerability.bin
     
     # Suppress validation checks with -N argument
    -$ vulnerabilitytobin -d {number of damage bins} -N < vulnerability.csv > vulnerability.bin
    -
    +$ vulnerabilitytobin -d {number of damage bins} -N < vulnerability.csv > vulnerability.bin

    crossvalidation

    -

    The following checks are performed across the damage bin dictionary, event footprint and vulnerability data:

    +

    The following checks are performed across the damage bin dictionary, +event footprint and vulnerability data:

    -

    The checks can be performed on damage_bin_dict.csv, footprint.csv and vulnerability.csv from the command line:

    -
    $ crossvalidation -d damage_bin_dict.csv -f footprint.csv -s vulnerability.csv -
    +

    The checks can be performed on damage_bin_dict.csv, +footprint.csv and vulnerability.csv from the +command line:

    +
    $ crossvalidation -d damage_bin_dict.csv -f footprint.csv -s vulnerability.csv

    Input oasis files

    validateoasisfiles

    -

    The following checks are performed across the coverages, items, fm policytc, fm programme and fm profile data:

    +

    The following checks are performed across the coverages, items, fm +policytc, fm programme and fm profile data:

    -

    The checks can be performed on coverages.csv, items.csv, fm_policytc.csv, fm_programme.csv and fm_profile.csv from the command line, specifying the directory these files are located in:

    -
    $ validateoasisfiles -d path/to/output/directory -
    -

    The Ground Up Losses (GUL) flag g can be specified to only perform checks on items.csv and coverages.csv:

    -
    $ validateoasisfiles -g -d /path/to/output/directory -
    +

    The checks can be performed on coverages.csv, +items.csv, fm_policytc.csv, +fm_programme.csv and fm_profile.csv from the +command line, specifying the directory these files are located in:

    +
    $ validateoasisfiles -d path/to/output/directory
    +

    The Ground Up Losses (GUL) flag g can be specified to +only perform checks on items.csv and +coverages.csv:

    +
    $ validateoasisfiles -g -d /path/to/output/directory

    Return to top

    -

    Go to 5. Financial Module

    -

    Back to Contents

    - - - +

    Go to 5. Financial Module

    +

    Back to Contents

    + + diff --git a/docs/html/Workflows.html b/docs/html/Workflows.html index 4520683f..bfd055cc 100644 --- a/docs/html/Workflows.html +++ b/docs/html/Workflows.html @@ -1,457 +1,319 @@ - - - -Workflows.md - - - - - - - - - - - - -

    alt text

    -

    6. Workflows

    -

    ktools is capable of multiple output workflows. This brings much greater flexibility, but also more complexity for users of the toolkit.

    -

    This section presents some example workflows, starting with single output workflows and then moving onto more complex multi-output workflows. There are some python scripts provided which execute some of the illustrative workflows using the example data in the repository. It is assumed that workflows will generally be run across multiple processes, with the number of processes being specified by the user.

    -

    1. Portfolio summary level insured loss event loss table

    -

    In this example, the core workflow is run through fmcalc into summarycalc and then the losses are summarized by summary set 2, which is "portfolio" summary level. -This produces multiple output files when run with multiple processes, each containing a subset of the event set. The output files can be concatinated together at the end.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | eltcalc > elt_p1.csv -eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | eltcalc > elt_p2.csv -
    + + + + + + + Workflows + + + +

    alt text

    +

    6. Workflows

    +

    ktools is capable of multiple output workflows. This brings much +greater flexibility, but also more complexity for users of the +toolkit.

    +

This section presents some example workflows, starting with single output workflows and then moving on to more complex multi-output workflows. Some Python scripts are provided which execute some of the illustrative workflows using the example data in the repository. It is assumed that workflows will generally be run across multiple processes, with the number of processes being specified by the user.

    +

    1. +Portfolio summary level insured loss event loss table

    +

In this example, the core workflow is run through fmcalc into +summarycalc and then the losses are summarized by summary set 2, which +is "portfolio" summary level. This produces multiple output files when +run with multiple processes, each containing a subset of the event set. +The output files can be concatenated together at the end.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | eltcalc > elt_p1.csv
    +eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | eltcalc > elt_p2.csv
    Figure 1. eltcalc workflow
    -

    alt text

    -

    See example script eltcalc_example.py

    -
    -

    2. Portfolio summary level insured loss period loss table

    -

    This is very similar to the first example, except the summary samples are run through pltcalc instead. The output files can be concatinated together at the end.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | pltcalc > plt_p1.csv -eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | pltcalc > plt_p2.csv -
    +

    +

See example script eltcalc_example.py

    +

    2. +Portfolio summary level insured loss period loss table

    +

This is very similar to the first example, except the summary samples +are run through pltcalc instead. The output files can be concatenated +together at the end.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | pltcalc > plt_p1.csv
    +eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - | pltcalc > plt_p2.csv
    Figure 2. pltcalc workflow
    -

    alt text

    -

    See example script pltcalc_example.py

    -
    -

    3. Portfolio summary level full uncertainty aggregate and occurrence loss exceedance curves

    -

    In this example, the summary samples are calculated as in the first two examples, but the results are output to the work folder. Until this stage the calculation is run over multiple processes. Then, in a single process, leccalc reads the summarycalc binaries from the work folder and computes two loss exceedance curves in a single process. Note that you can output all eight loss exceedance curve variants in a single leccalc command.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - > work/summary2/p1.bin +

    +

See example script pltcalc_example.py

    +

    3. +Portfolio summary level full uncertainty aggregate and occurrence loss +exceedance curves

    +

In this example, the summary samples are calculated as in the first two examples, but the results are output to the work folder. Until this stage the calculation is run over multiple processes. Then, in a single process, leccalc reads the summarycalc binaries from the work folder and computes two loss exceedance curves. Note that you can output all eight loss exceedance curve variants in a single leccalc command.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - > work/summary2/p1.bin
eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 - > work/summary2/p2.bin
    -leccalc -Ksummary2 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv
    -
    +leccalc -Ksummary2 -F lec_full_uncertainty_agg.csv -f lec_full_uncertainty_occ.csv
    Figure 3. leccalc workflow
    -

    alt text

    -

    See example script leccalc_example.py

    -
    -

    4. Portfolio summary level average annual loss

    -

    Similarly to lec curves, the samples are run through to summarycalc, and the summarycalc binaries are output to the work folder. Until this stage the calculation is run over multiple processes. Then, in a single process, aalcalc reads the summarycalc binaries from the work folder and computes the aal output.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 work/summary2/p1.bin +

    +

See example script leccalc_example.py

    +

    4. Portfolio +summary level average annual loss

    +

    Similarly to lec curves, the samples are run through to summarycalc, +and the summarycalc binaries are output to the work folder. Until this +stage the calculation is run over multiple processes. Then, in a single +process, aalcalc reads the summarycalc binaries from the work folder and +computes the aal output.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 work/summary2/p1.bin
     eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -2 work/summary2/p2.bin
    -aalcalc -Ksummary2 > aal.csv
    -
    +aalcalc -Ksummary2 > aal.csv
    Figure 4. aalcalc workflow
    -

    alt text

    -

    See example script aalcalc_example.py

    -
    +

    +

See example script aalcalc_example.py

    Multiple output workflows

    -

    5. Ground up and insured loss workflows

    -

    gulcalc can generate two output streams at once: item level samples to pipe into fmcalc, and coverage level samples to pipe into summarycalc. This means that outputs for both ground up loss and insured loss can be generated in one workflow.
    -This is done by writing one stream to a file or named pipe, while streaming the other to standard output down the pipeline.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i gulcalci1.bin -c - | summarycalc -g -2 - | eltcalc > gul_elt_p1.csv +

    5. Ground up and insured +loss workflows

    +

    gulcalc can generate two output streams at once: item level samples +to pipe into fmcalc, and coverage level samples to pipe into +summarycalc. This means that outputs for both ground up loss and insured +loss can be generated in one workflow.
    +This is done by writing one stream to a file or named pipe, while +streaming the other to standard output down the pipeline.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i gulcalci1.bin -c - | summarycalc -g -2 - | eltcalc > gul_elt_p1.csv
     eve 2 2 | getmodel | gulcalc -r -S100 -i gulcalci2.bin -c - | summarycalc -g -2 - | eltcalc > gul_elt_p2.csv
     fmcalc < gulcalci1.bin | summarycalc -f -2 - | eltcalc > fm_elt_p1.csv
    -fmcalc < gulcalci2.bin | summarycalc -f -2 - | eltcalc > fm_elt_p2.csv
    -
    -

    Note that the gulcalc item stream does not need to be written off to disk, as it can be sent to a 'named pipe', which keeps the data in-memory and kicks off a new process. This is easy to do in Linux (but harder in Windows).

    +fmcalc < gulcalci2.bin | summarycalc -f -2 - | eltcalc > fm_elt_p2.csv
    +

    Note that the gulcalc item stream does not need to be written off to +disk, as it can be sent to a 'named pipe', which keeps the data +in-memory and kicks off a new process. This is easy to do in Linux (but +harder in Windows).
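As a sketch of the named pipe approach, driven from Python purely for illustration (the pipe and output file names are assumptions; on Linux the same can be done directly in the shell with mkfifo):

import os
import subprocess

# Run the ground up and insured loss pipelines of example 5, passing the
# gulcalc item stream through a named pipe instead of a file on disk.
fifo = "gulcalci1.pipe"
if not os.path.exists(fifo):
    os.mkfifo(fifo)

gul_cmd = ("eve 1 2 | getmodel | gulcalc -r -S100 -i " + fifo +
           " -c - | summarycalc -g -2 - | eltcalc > gul_elt_p1.csv")
fm_cmd = "fmcalc < " + fifo + " | summarycalc -f -2 - | eltcalc > fm_elt_p1.csv"

# Both pipelines must run concurrently: the writer blocks until a reader
# opens the pipe, and vice versa.
gul = subprocess.Popen(gul_cmd, shell=True)
fm = subprocess.Popen(fm_cmd, shell=True)
gul.wait()
fm.wait()
os.remove(fifo)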

    Figure 5 illustrates an example workflow.

    -
    Figure 5. Ground up and insured loss example workflow
    -

    alt text

    -

    See example script gulandfm_example.py

    -
    -

    6. Multiple summary level workflows

    -

    Summarycalc is capable of summarizing samples to up to 10 different user-defined levels for ground up loss and insured loss. This means that different outputs can be run on different summary levels. In this example, event loss tables for two different summary levels are generated.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -1 s1/p1.bin -2 s2/p1.bin +
    Figure 5. +Ground up and insured loss example workflow
    +

    alt text

    +

See example script gulandfm_example.py

    +

    6. Multiple summary level +workflows

    +

    Summarycalc is capable of summarizing samples to up to 10 different +user-defined levels for ground up loss and insured loss. This means that +different outputs can be run on different summary levels. In this +example, event loss tables for two different summary levels are +generated.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -1 s1/p1.bin -2 s2/p1.bin
     eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc | summarycalc -f -1 s1/p2.bin -2 s2/p2.bin
     eltcalc < s1/p1.bin > elt_s1_p1.csv
     eltcalc < s1/p2.bin > elt_s1_p2.csv
     eltcalc < s2/p1.bin > elt_s2_p1.csv
    -eltcalc < s2/p2.bin > elt_s2_p2.csv
    -
    -

    Again, the summarycalc streams can be sent to named pipes rather than written off to disk.

    -

    Figure 6 illustrates multiple summary level streams, each of which can go to different output calculations.

    -
    Figure 6. Multiple summary level workflows
    -

    alt text

    +eltcalc < s2/p2.bin > elt_s2_p2.csv
    +

    Again, the summarycalc streams can be sent to named pipes rather than +written off to disk.

    +

    Figure 6 illustrates multiple summary level streams, each of which +can go to different output calculations.

    +
    Figure 6. Multiple +summary level workflows
    +

    alt text

    Financial Module workflows

    -

    The fmcalc component can be used recursively in order to apply multiple sets of policy terms and conditions, in order to support reinsurance. Figure 7 shows a simple example workflow of a direct insurance calculation followed by a reinsurance calculation.

    -
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc -p direct | fmcalc -p ri1 -n > fmcalc_1.bin -eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc -p direct | fmcalc -p ri1 -n > fmcalc_2.bin -
    -
    Figure 7. Multiple fmcalc workflow
    -

    alt text

    -

    Each call of fmcalc requires the same input files, so it is necessary to specify the location of the files for each call using the command line parameter -p and the relative folder path. Figure 8 demonstrates the required files for three consecutive calls of fmcalc.

    -
    Figure 8. Multiple fmcalc workflow
    -

    alt text

    -

    It is possible to generate all of the outputs for each call of fmcalc in the same workflow, enabling multiple financial perspective reports, as shown in Figure 9.

    -
    Figure 9. Multiple fmcalc outputs workflow
    -

    alt text

    +

The fmcalc component can be used recursively to apply multiple sets of policy terms and conditions, in order to support reinsurance. Figure 7 shows a simple example workflow of a direct insurance calculation followed by a reinsurance calculation.

    +
    eve 1 2 | getmodel | gulcalc -r -S100 -i - | fmcalc -p direct | fmcalc -p ri1 -n > fmcalc_1.bin
    +eve 2 2 | getmodel | gulcalc -r -S100 -i - | fmcalc -p direct | fmcalc -p ri1 -n > fmcalc_2.bin
    +
    Figure 7. Multiple fmcalc +workflow
    +

    +

    Each call of fmcalc requires the same input files, so it is necessary +to specify the location of the files for each call using the command +line parameter -p and the relative folder path. Figure 8 demonstrates +the required files for three consecutive calls of fmcalc.

    +
    Figure 8. Multiple fmcalc +workflow
    +

    alt text

    +

    It is possible to generate all of the outputs for each call of fmcalc +in the same workflow, enabling multiple financial perspective reports, +as shown in Figure 9.

    +
    Figure 9. Multiple +fmcalc outputs workflow
    +

    alt text

    Return to top

    -

    Go to Appendix A Random numbers

    -

    Back to Contents

    - - - +

    Go to Appendix A Random numbers

    +

    Back to Contents

    + + diff --git a/docs/html/fmprofiles.html b/docs/html/fmprofiles.html index 75051f42..1d57b537 100644 --- a/docs/html/fmprofiles.html +++ b/docs/html/fmprofiles.html @@ -1,535 +1,435 @@ - - - -fmprofiles.md - - - - - - - - - - - - -

    alt text

    -

    Appendix B: FM Profiles

    -

    This section specifies the attributes and rules for the following list of Financial module profiles.

    + + + + + + + fmprofiles + + + +

    alt text

    +

    Appendix B: FM Profiles +

    +

    This section specifies the attributes and rules for the following +list of Financial module profiles.

    ++++ - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + + + + + + + + +
    Profile descriptioncalcrule_idProfile descriptioncalcrule_id
    Do nothing (pass losses through)100Do nothing (pass losses through)100
    deductible and limit1deductible and limit1
    deductible with attachment, limit and share2deductible with attachment, limit and +share2
    franchise deductible and limit3franchise deductible and limit3
    deductible % TIV and limit4deductible % TIV and limit4
    deductible and limit % loss5deductible and limit % loss5
    deductible % TIV6deductible % TIV6
    deductible, minimum and maximum deductible, with limit7deductible, minimum and maximum +deductible, with limit7
    deductible and minimum deductible, with limit8deductible and minimum deductible, with +limit8
    limit with deductible % limit9limit with deductible % limit9
    deductible and maximum deductible10deductible and maximum deductible10
    deductible and minimum deductible11deductible and minimum deductible11
    deductible12deductible12
    deductible, minimum and maximum deductible13deductible, minimum and maximum +deductible13
    limit only14limit only14
    deductible and limit % loss15deductible and limit % loss15
    deductible % loss16deductible % loss16
    deductible % loss with attachment, limit and share17deductible % loss with attachment, limit +and share17
    deductible % tiv with attachment, limit and share18deductible % tiv with attachment, limit +and share18
    deductible % loss with min and/or max deductible19deductible % loss with min and/or max +deductible19
    reverse franchise deductible20reverse franchise deductible20
    deductible % tiv with min and max deductible21deductible % tiv with min and max +deductible21
    reinsurance % ceded, limit and % placed22reinsurance % ceded, limit and % +placed22
    reinsurance limit and % placed23reinsurance limit and % placed23
    reinsurance excess terms24reinsurance excess terms24
    reinsurance proportional terms25reinsurance proportional terms25
    deductible % loss with min and/or max deductible and limit26deductible % loss with min and/or max +deductible and limit26
    % tiv trigger and % tiv step payout with limit27% tiv trigger and % tiv step payout with +limit27
    % tiv trigger and % loss step payout28% tiv trigger and % loss step payout28
    % tiv trigger and % tiv step payout29% tiv trigger and % tiv step payout29
    % tiv trigger and % limit step payout30% tiv trigger and % limit step payout30
    % tiv trigger and monetary amount step payout31% tiv trigger and monetary amount step +payout31
    monetary amount trigger and % loss step payout with limit32monetary amount trigger and % loss step +payout with limit32
    deductible % loss with limit33deductible % loss with limit33
    deductible with attachment and share34deductible with attachment and share34
    deductible % loss with min and/or max deductible and limit % loss35deductible % loss with min and/or max +deductible and limit % loss35
    deductible with min and/or max deductible and limit % loss36deductible with min and/or max deductible +and limit % loss36
    % tiv trigger and % loss step payout with +limit37
    conditional coverage payouts based on +prior step payouts38
    - + @@ -545,12 +445,12 @@

    Appendix B: FM Profiles pe

    - + - + @@ -566,10 +466,10 @@

    Appendix B: FM Profiles

    - + - + @@ -585,10 +485,10 @@

    Appendix B: FM Profiles

    - + - + @@ -604,10 +504,10 @@

    Appendix B: FM Profiles

    - + - + @@ -623,10 +523,10 @@

    Appendix B: FM Profiles

    - + - + @@ -642,10 +542,10 @@

    Appendix B: FM Profiles

    - + - + @@ -661,10 +561,10 @@

    Appendix B: FM Profiles

    - + - + @@ -680,10 +580,10 @@

    Appendix B: FM Profiles

    - + - + @@ -699,10 +599,10 @@

    Appendix B: FM Profiles

    - + - + @@ -718,10 +618,10 @@

    Appendix B: FM Profiles

    - + - + @@ -737,10 +637,10 @@

    Appendix B: FM Profiles

    - + - + @@ -756,10 +656,10 @@

    Appendix B: FM Profiles

    - + - + @@ -775,10 +675,10 @@

    Appendix B: FM Profiles

    - + - + @@ -794,10 +694,10 @@

    Appendix B: FM Profiles

    - + - + @@ -813,10 +713,10 @@

    Appendix B: FM Profiles

    - + - + @@ -832,10 +732,10 @@

    Appendix B: FM Profiles

    - + - + @@ -851,10 +751,10 @@

    Appendix B: FM Profiles

    - + - + @@ -870,10 +770,10 @@

    Appendix B: FM Profiles

    - + - + @@ -889,10 +789,10 @@

    Appendix B: FM Profiles

    - + - + @@ -908,10 +808,10 @@

    Appendix B: FM Profiles

    - + - + @@ -927,10 +827,10 @@

    Appendix B: FM Profiles

    - + - + @@ -946,10 +846,10 @@

    Appendix B: FM Profiles

    - + - + @@ -965,10 +865,10 @@

    Appendix B: FM Profiles

    - + - + @@ -984,10 +884,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1003,10 +903,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1022,10 +922,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1041,10 +941,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1060,10 +960,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1079,10 +979,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1098,10 +998,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1117,10 +1017,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1136,10 +1036,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1155,10 +1055,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1174,10 +1074,10 @@

    Appendix B: FM Profiles

    - + - + @@ -1193,13 +1093,31 @@

    Appendix B: FM Profiles

    - + - + + + + + + + + + + + + + + + + + + + @@ -1212,10 +1130,11 @@

    Appendix B: FM Profiles

    - + + - + @@ -1231,15 +1150,29 @@

    Appendix B: FM Profiles

    - + - + + + + + + + + + + + + + + + @@ -1248,531 +1181,608 @@

    Appendix B: FM Profiles

    + + + - + + +
    calcrule_idcalcrule_id d1 d2 d3l2 sc1sc2sc2
    100100
    11 x
    22 x
    33 x
    44 x
    55 x
    66 x
    77 x x x
    88 x x
    99 x
    1010 x x
    1111 x x
    1212 x
    1313 x x x
    1414
    1515 x
    1616 x
    1717 x
    1818 x
    1919 x x x
    2020 x
    2121 x x x
    2222
    2323
    2424
    2525
    2626 x x x
    2727 x x xxx
    2828 x x xxx
    2929 x x xxx
    3030 x x xxx
    3131 x x xxx
    3232 x xxx
    3333 x
    3434xx x
    35xx x x
    3536 x x x
    3637xxx x x x xxx
    38 xxx xxx
    -

    The fields with an x are those which are required by the profile. The full names of the fields are as follows;

    +

    The fields with an x are those which are required by the profile. The +full names of the fields are as follows;

    - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + + - - + +
    Short nameProfile field nameShort nameProfile field name
    d1deductible_1d1deductible_1
    d2deductible_2d2deductible_2
    d3deductible_3d3deductible_3
    a1attachment_1a1attachment_1
    l1limit_1l1limit_1
    sh1share_1sh1share_1
    sh2share_2sh2share_2
    sh3share_3sh3share_3
    ststep_idststep_id
    tstrigger_starttstrigger_start
    tetrigger_endtetrigger_end
    pspayout_startpspayout_start
    pepayout_endpepayout_end
    l2limit_2l2limit_2
    sc1scale_1sc1scale_1
    sc2scale_2sc2scale_2
    -

    An allocation rule can be assigned to each call of fmcalc, which determines whether calculated losses should be back-allocated to the contributing items, and if so how. This is specified via the command line parameter -a.

    +

    An allocation rule can be assigned to each call of fmcalc, which +determines whether calculated losses should be back-allocated to the +contributing items, and if so how. This is specified via the command +line parameter -a.

    The choices are as follows;

    ++++ - - + + - - + + - - + + - - + + - - + +
    Allocrule descriptionallocrule_idAllocrule descriptionallocrule_id
    Don't back-allocate losses (default if no parameter supplied)0Don't back-allocate losses (default if no +parameter supplied)0
    Back allocate losses to items in proportion to input loss1Back allocate losses to items in +proportion to input loss1
    Back-allocate losses to items in proportion to prior level loss2Back-allocate losses to items in +proportion to prior level loss2
    Back-allocate losses to items in proportion to prior level loss (reinsurance)3Back-allocate losses to items in +proportion to prior level loss (reinsurance)3

    Effective deductibles

    -

    Often there are more than one hierarchal levels with deductibles, and there is a choice of methods of accumulation of deductibles through the hierarchy. Whenever a rule with a deductible is used in the loss calculation then it is accumulated through the calculation in an effective_deductible variable. The effective deductible is the smaller of the deductible amount and the loss.

    -

    All deductibles amounts calculated from the deductible_1 field are simply additive through the hierarchy.

    -

    Ay any level, the user can specify a calcrule using a minimum and/or maximum deductible which changes the way that effective deductibles are accumulated.

    -

    For a minimum deductible specified in calcrules using the deductible_2 field, the calculation increases the effective_deductible carried forward from the previous levels up to the minimum deductible if it is smaller.

    -

    For a maximum deductible specified in calcrules using the deductible_3 field, the calculation decreases the effective_deductible carried forward from the previous levels down to the maximum deductible if it is larger.

    -

    Adjustment of loss for prior level limits

    -

    Loss adjustments due to minimum and maximum deductibles may lead to breaching or falling short of prior level limits. For instance, an increase in loss due a policy maximum deductible being applied can lead to a breach of site limit that applied at the prior calculation level. Conversely, a decrease in loss due to policy minimum deductible can leave the loss falling short of a site limit applied at the prior calculation level. In these situations the prior level limits are carried through and reapplied in all calcrules that have minimum and/or maximum deductibles.

    +

Often there is more than one hierarchical level with deductibles, and there is a choice of methods for accumulating deductibles through the hierarchy. Whenever a rule with a deductible is used in the loss calculation, the deductible is accumulated through the calculation in an effective_deductible variable. The effective deductible is the smaller of the deductible amount and the loss.

    +

All deductible amounts calculated from the deductible_1 field are simply additive through the hierarchy.

    +

At any level, the user can specify a calcrule using a minimum and/or maximum deductible, which changes the way that effective deductibles are accumulated.

    +

    For a minimum deductible specified in calcrules using the +deductible_2 field, the calculation increases the effective_deductible +carried forward from the previous levels up to the minimum deductible if +it is smaller.

    +

    For a maximum deductible specified in calcrules using the +deductible_3 field, the calculation decreases the effective_deductible +carried forward from the previous levels down to the maximum deductible +if it is larger.
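One way to read these two rules as code is sketched below (Python; this is an interpretation of the description above rather than the fmcalc source, and it ignores the prior level limit adjustments discussed next):

def apply_min_max_deductible(loss, effective_deductible, deductible_2=0.0, deductible_3=0.0):
    # Minimum deductible: raise the carried-forward effective deductible
    # up to deductible_2, taking the difference out of the loss.
    if deductible_2 and effective_deductible < deductible_2:
        loss -= deductible_2 - effective_deductible
        effective_deductible = deductible_2
    # Maximum deductible: lower the carried-forward effective deductible
    # down to deductible_3, giving the difference back to the loss.
    if deductible_3 and effective_deductible > deductible_3:
        loss += effective_deductible - deductible_3
        effective_deductible = deductible_3
    return max(loss, 0.0), effective_deductible

For example, a carried-forward effective deductible of 1,000 against a minimum deductible of 5,000 reduces the loss by a further 4,000 and carries an effective deductible of 5,000 forward.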

    +

    Adjustment of loss +for prior level limits

    +

Loss adjustments due to minimum and maximum deductibles may lead to breaching or falling short of prior level limits. For instance, an increase in loss due to a policy maximum deductible being applied can lead to a breach of a site limit applied at the prior calculation level. Conversely, a decrease in loss due to a policy minimum deductible can leave the loss falling short of a site limit applied at the prior calculation level. In these situations the prior level limits are carried through and reapplied in all calcrules that have minimum and/or maximum deductibles.

    We introduce the following variables;

    -

    Scenarios for under and over limit

    -

    The over and under limit variables are initialised when there exist prior level limits. The possible cases are;

    +

    Scenarios for under and over +limit

    +

The over and under limit variables are initialised when prior level limits exist. The possible cases are;

    ++++++ - + - + - + - + - + - + - + - + - + - +
| Case | Under limit | Over limit | Meaning |
|:----:|:-----------:|:----------:|:--------|
| 1    | 0           | 0          | All prior level losses are exactly at their limits |
| 2    | 0           | >0         | Some prior level losses are over limit and none are under limit |
| 3    | >0          | 0          | Some prior level losses are under limit and none are over limit |
| 4    | >0          | >0         | Some prior level losses are over limit and some are under limit |

Different adjustments are then applied depending on whether the loss delta is positive (the loss has been increased, for example by a maximum deductible) or negative (the loss has been decreased, for example by a minimum deductible).

    Current calculation level limits may also apply and these are used to update the over limit and under limit measures to carry through to the next level.
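
To make the bookkeeping concrete, the following Python sketch (not the fmcalc implementation; the helper name and all of the loss and limit figures are invented) initialises the two measures from prior level losses, taken before their limits were applied, and lands on case 4 of the table above.

```
# Illustrative sketch only, not the fmcalc implementation. Losses are taken
# before the prior level limits were applied; all figures are hypothetical.
def limit_tracking(prior_losses_before_limit, prior_limits):
    under_limit = sum(max(lim - loss, 0) for loss, lim in zip(prior_losses_before_limit, prior_limits))
    over_limit = sum(max(loss - lim, 0) for loss, lim in zip(prior_losses_before_limit, prior_limits))
    return under_limit, over_limit

# One item was cut back by its limit, the other sits below its limit:
print(limit_tracking([100000, 40000], [80000, 60000]))  # (20000, 20000) -> case 4 in the table above
```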


    Calcrules

In the following calculation logic, x refers to the input loss record: x.loss is the loss coming into the calcrule and x.effective_deductible is the effective deductible carried forward from the previous levels.

    1. Deductible and limit

| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 1       |
| deductible_1 | 50000   |
| limit_1      | 900000  |

    Calculation logic
```
loss = x.loss - deductible_1;
if (loss < 0) loss = 0;
if (loss > limit_1) loss = limit_1;
```

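As a quick worked example, the following Python sketch (illustrative only, not the fmcalc code; the input losses are invented) applies the example attributes above:

```
# Illustrative sketch of calcrule 1 using the example attributes above;
# the input losses are invented for the example.
def calcrule_1(x_loss, deductible_1=50000, limit_1=900000):
    loss = max(x_loss - deductible_1, 0)
    return min(loss, limit_1)

print(calcrule_1(30000))    # 0: fully eroded by the deductible
print(calcrule_1(250000))   # 200000: deductible applied, below the limit
print(calcrule_1(2000000))  # 900000: capped at limit_1
```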

    2. Deductible, attachment, limit and share


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 2       |
| deductible_1 | 70000   |
| attachment_1 | 0       |
| limit_1      | 1000000 |
| share_1      | 0.1     |

    Calculation logic
```
loss = x.loss - deductible_1;
if (loss < 0) loss = 0;
if (loss > (attachment_1 + limit_1)) loss = limit_1;
    else loss = loss - attachment_1;
if (loss < 0) loss = 0;
loss = loss * share_1;
```

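A worked example with the example attributes above (an illustrative Python sketch, not the fmcalc code; the input losses are invented):

```
# Illustrative sketch of calcrule 2 using the example attributes above;
# the input losses are invented. attachment_1 is zero in the example.
def calcrule_2(x_loss, deductible_1=70000, attachment_1=0, limit_1=1000000, share_1=0.1):
    loss = max(x_loss - deductible_1, 0)
    if loss > attachment_1 + limit_1:
        loss = limit_1
    else:
        loss = max(loss - attachment_1, 0)
    return loss * share_1

print(calcrule_2(570000))   # ~50000: (570000 - 70000) * 0.1, attachment is zero here
print(calcrule_2(2000000))  # ~100000: capped at limit_1 before the 10% share
```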

    3. Franchise deductible and limit


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 3       |
| deductible_1 | 100000  |
| limit_1      | 1000000 |

    Calculation logic
```
if (x.loss < deductible_1) loss = 0;
    else loss = x.loss;
if (loss > limit_1) loss = limit_1;
```

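The franchise behaviour is easiest to see with a few invented input losses (an illustrative Python sketch, not the fmcalc code):

```
# Illustrative sketch of calcrule 3 (franchise deductible) using the example
# attributes above; input losses are invented.
def calcrule_3(x_loss, deductible_1=100000, limit_1=1000000):
    loss = 0 if x_loss < deductible_1 else x_loss
    return min(loss, limit_1)

print(calcrule_3(90000))    # 0: below the franchise threshold, nothing is paid
print(calcrule_3(150000))   # 150000: at or above the threshold the full loss is paid
print(calcrule_3(3000000))  # 1000000: capped at limit_1
```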

    5. Deductible and limit as a proportion of loss


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 5       |
| deductible_1 | 0.05    |
| limit_1      | 0.3     |

    Calculation logic
```
loss = x.loss - (x.loss * deductible_1);
if (loss > (x.loss * limit_1)) loss = x.loss * limit_1;
```

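With the proportional attributes above, an invented input loss of 100,000 gives (illustrative Python sketch, not the fmcalc code):

```
# Illustrative sketch of calcrule 5 (deductible and limit as proportions of
# the loss) using the example attributes above; the input loss is invented.
def calcrule_5(x_loss, deductible_1=0.05, limit_1=0.3):
    loss = x_loss - x_loss * deductible_1
    return min(loss, x_loss * limit_1)

print(calcrule_5(100000))  # ~30000: 95% of the loss exceeds the 30% cap, so the cap applies
```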

    9. Limit with deductible as a proportion of limit


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 9       |
| deductible_1 | 0.05    |
| limit_1      | 100000  |

    Calculation logic
```
loss = x.loss - (deductible_1 * limit_1);
if (loss < 0) loss = 0;
if (loss > limit_1) loss = limit_1;
```

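Here the monetary deductible works out as deductible_1 * limit_1 = 5,000. An illustrative Python sketch with invented input losses (not the fmcalc code):

```
# Illustrative sketch of calcrule 9 using the example attributes above;
# the deductible is 5% of the limit = 5000. Input losses are invented.
def calcrule_9(x_loss, deductible_1=0.05, limit_1=100000):
    loss = max(x_loss - deductible_1 * limit_1, 0)
    return min(loss, limit_1)

print(calcrule_9(4000))    # 0: below the 5000 deductible
print(calcrule_9(50000))   # ~45000: deductible applied, below the limit
print(calcrule_9(500000))  # 100000: capped at limit_1
```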

    10. Maximum deductible

If the effective deductible carried forward from the previous level exceeds the maximum deductible, the effective deductible is decreased to the maximum deductible value and the difference is added back to the loss.


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 10      |
| deductible_3 | 40000   |

    Calculation logic
```
if (x.effective_deductible > deductible_3) {
    loss = x.loss + x.effective_deductible - deductible_3;
    if (loss < 0) loss = 0;
}
else {
    loss = x.loss;
}
```

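A worked example with the example attributes above (illustrative Python sketch, not the fmcalc code; the input loss and the carried-forward effective deductible are invented):

```
# Illustrative sketch of calcrule 10 (maximum deductible) using the example
# attributes above; x_loss and the effective deductible are invented inputs.
def calcrule_10(x_loss, effective_deductible, deductible_3=40000):
    if effective_deductible > deductible_3:
        return max(x_loss + effective_deductible - deductible_3, 0)
    return x_loss

print(calcrule_10(x_loss=60000, effective_deductible=100000))  # 120000: 60000 of deductible handed back to the loss
print(calcrule_10(x_loss=60000, effective_deductible=30000))   # 60000: already within the maximum, loss unchanged
```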

    11. Minimum deductible

If the effective deductible carried forward from the previous level is less than the minimum deductible, the deductible is increased to the minimum deductible value or the total loss, whichever is less, and the loss is reduced by the difference.


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 11      |
| deductible_2 | 70000   |

    Calculation logic
```
if (x.effective_deductible < deductible_2) {
    loss = x.loss + x.effective_deductible - deductible_2;
    if (loss < 0) loss = 0;
}
else {
    loss = x.loss;
}
```

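Similarly, for the minimum deductible (illustrative Python sketch, not the fmcalc code; the input loss and the carried-forward effective deductible are invented):

```
# Illustrative sketch of calcrule 11 (minimum deductible) using the example
# attributes above; x_loss and the effective deductible are invented inputs.
def calcrule_11(x_loss, effective_deductible, deductible_2=70000):
    if effective_deductible < deductible_2:
        return max(x_loss + effective_deductible - deductible_2, 0)
    return x_loss

print(calcrule_11(x_loss=100000, effective_deductible=20000))  # 50000: a further 50000 is deducted
print(calcrule_11(x_loss=30000, effective_deductible=20000))   # 0: the minimum exceeds the total loss
print(calcrule_11(x_loss=100000, effective_deductible=80000))  # 100000: minimum already met, loss unchanged
```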

    12. Deductible only

| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 12      |
| deductible_1 | 100000  |

    Calculation logic
```
loss = x.loss - deductible_1;
if (loss < 0) loss = 0;
```

    14. Limit only

| Attributes  | Example |
|:------------|--------:|
| policytc_id | 1       |
| calcrule_id | 14      |
| limit_1     | 100000  |

    Calculation logic
```
loss = x.loss;
if (loss > limit_1) loss = limit_1;
```

    15. Limit as a proportion of loss


| Attributes  | Example |
|:------------|--------:|
| policytc_id | 1       |
| calcrule_id | 15      |
| limit_1     | 0.3     |

    Calculation logic
```
loss = x.loss * limit_1;
```

    16. Deductible as a proportion of loss


| Attributes   | Example |
|:-------------|--------:|
| policytc_id  | 1       |
| calcrule_id  | 16      |
| deductible_1 | 0.05    |

    Calculation logic
```
loss = x.loss - (x.loss * deductible_1);
if (loss < 0) loss = 0;
```

    Return to top


    Go to Appendix C Multi-peril model support


    Back to Contents


    + + diff --git a/docs/md/DataConversionComponents.md b/docs/md/DataConversionComponents.md index 28e179e6..eeb29d6b 100644 --- a/docs/md/DataConversionComponents.md +++ b/docs/md/DataConversionComponents.md @@ -4,15 +4,20 @@ The following components convert input data in csv format to the binary format required by the calculation components in the reference model; **Static data** +* **[aggregatevulnerabilitytobin](#aggregatevulnerability)** converts the aggregate vulnerability data. * **[damagebintobin](#damagebins)** converts the damage bin dictionary. * **[footprinttobin](#footprint)** converts the event footprint. +* **[lossfactorstobin](#lossfactors)** converts the lossfactors data. * **[randtobin](#rand)** converts a list of random numbers. * **[vulnerabilitytobin](#vulnerability)** converts the vulnerability data. +* **[weightstobin](#weights)** converts the weights data. A reference [intensity bin dictionary](#intensitybins) csv should also exist, although there is no conversion component for this file because it is not needed for calculation purposes. **Input data** +* **[amplificationtobin](#amplifications)** converts the amplifications data. * **[coveragetobin](#coverages)** converts the coverages data. +* **[ensembletobin](#ensemble)** converts the ensemble data. * **[evetobin](#events)** converts a list of event_ids. * **[itemtobin](#items)** converts the items data. * **[gulsummaryxreftobin](#gulsummaryxref)** converts the gul summary xref data. @@ -24,19 +29,25 @@ A reference [intensity bin dictionary](#intensitybins) csv should also exist, al * **[occurrencetobin](#occurrence)** converts the event occurrence data. * **[returnperiodtobin](#returnperiod)** converts a list of return periods. * **[periodstobin](#periods)** converts a list of weighted periods (optional). +* **[quantiletobin](#quantile)** converts a list of quantiles (optional). These components are intended to allow users to generate the required input binaries from csv independently of the original data store and technical environment. All that needs to be done is first generate the csv files from the data store (SQL Server database, etc). The following components convert the binary input data required by the calculation components in the reference model into csv format; **Static data** +* **[aggregatevulnerabilitytocsv](#aggregatevulnerability)** converts the aggregate vulnerability data. * **[damagebintocsv](#damagebins)** converts the damage bin dictionary. * **[footprinttocsv](#footprint)** converts the event footprint. +* **[lossfactorstocsv](#lossfactors)** converts the lossfactors data. * **[randtocsv](#rand)** converts a list of random numbers. * **[vulnerabilitytocsv](#vulnerability)** converts the vulnerability data. +* **[weightstocsv](#weights)** converts the weights data. **Input data** +* **[amplificationtocsv](#amplifications)** converts the amplifications data. * **[coveragetocsv](#coverages)** converts the coverages data. +* **[ensembletocsv](#ensemble)** converts the ensemble data. * **[evetocsv](#events)** converts a list of event_ids. * **[itemtocsv](#items)** converts the items data. * **[gulsummaryxreftocsv](#gulsummaryxref)** converts the gul summary xref data. @@ -48,11 +59,43 @@ The following components convert the binary input data required by the calculati * **[occurrencetocsv](#occurrence)** converts the event occurrence data. * **[returnperiodtocsv](#returnperiod)** converts a list of return periods. 
* **[periodstocsv](#returnperiod)** converts a list of weighted periods (optional).
+* **[quantiletocsv](#quantile)** converts a list of quantiles (optional).

These components are provided for the convenience of viewing the data and debugging.

## Static data
+
+### aggregate vulnerability
+***
+The aggregate vulnerability file is required for the gulmc component. It maps each aggregate_vulnerability_id to the vulnerability_ids it is composed of. This file must have the following location and filename;
+
+* static/aggregate_vulnerability.bin
+
+##### File format
+
+The csv file should contain the following fields and include a header row.
+
+| Name                       | Type | Bytes | Description                      | Example |
+|:---------------------------|------|-------|:---------------------------------|--------:|
+| aggregate_vulnerability_id | int  | 4     | Oasis aggregate vulnerability ID | 45      |
+| vulnerability_id           | int  | 4     | Oasis vulnerability ID           | 45      |
+
+If this file is present, the weights.bin or weights.csv file must also be present. The data should not contain nulls.
+
+##### aggregatevulnerabilitytobin
+```
+$ aggregatevulnerabilitytobin < aggregate_vulnerability.csv > aggregate_vulnerability.bin
+```
+
+##### aggregatevulnerabilitytocsv
+```
+$ aggregatevulnerabilitytocsv < aggregate_vulnerability.bin > aggregate_vulnerability.csv
+```
+
+[Return to top](#dataconversioncomponents)
+
### damage bin dictionary
***
@@ -197,6 +240,37 @@ $ footprinttocsv -z > footprint.csv
[Return to top](#dataconversioncomponents)

+
+### Loss Factors
+***
+The lossfactors binary maps event_id/amplification_id pairs to post loss amplification factors, and is supplied by the model providers. The first 4 bytes are reserved for future use and the data format is as follows. It is required by the Post Loss Amplification (PLA) workflow and must have the following location and filename;
+
+* static/lossfactors.bin
+
+#### File format
+The csv file should contain the following fields and include a header row.
+
+| Name             | Type  | Bytes | Description                                              | Example |
+|:-----------------|-------|-------|:---------------------------------------------------------|--------:|
+| event_id         | int   | 4     | Event ID                                                  | 1       |
+| count            | int   | 4     | Number of amplification IDs associated with the event ID | 1       |
+| amplification_id | int   | 4     | Amplification ID                                          | 1       |
+| factor           | float | 4     | The uplift factor                                         | 1.01    |
+
+All fields must not have null values. The csv file does not contain the count field; the conversion tools add and remove it.
+
+##### lossfactorstobin
+```
+$ lossfactorstobin < lossfactors.csv > lossfactors.bin
+```
+
+##### lossfactorstocsv
+```
+$ lossfactorstocsv < lossfactors.bin > lossfactors.csv
+```
+
+[Return to top](#dataconversioncomponents)
+
### Random numbers
***
@@ -294,8 +368,67 @@ $ vulnerabilitytocsv -z > vulnerability.csv
```
[Return to top](#dataconversioncomponents)
+
+### Weights
+***
+The vulnerability weights binary contains the weighting of each vulnerability function within each areaperil ID. The data format is as follows. It is required by gulmc together with the aggregate_vulnerability file and must have the following location and filename;
+
+* static/weights.bin
+
+#### File format
+The csv file should contain the following fields and include a header row.
+
+| Name             | Type  | Bytes | Description          | Example |
+|:-----------------|-------|-------|:---------------------|--------:|
+| areaperil_id     | int   | 4     | Areaperil ID         | 1       |
+| vulnerability_id | int   | 4     | Vulnerability ID     | 1       |
+| weight           | float | 4     | The weighting factor | 1.0     |
+
+All fields must not have null values.
+
+##### weightstobin
+```
+$ weightstobin < weights.csv > weights.bin
+```
+
+##### weightstocsv
+```
+$ weightstocsv < weights.bin > weights.csv
+```
+
+[Return to top](#dataconversioncomponents)
+
## Input data
+
+### Amplifications
+***
+The amplifications binary contains the list of item IDs mapped to amplification IDs. The data format is as follows. It is required by the Post Loss Amplification (PLA) workflow and must have the following location and filename;
+
+* input/amplifications.bin
+
+#### File format
+The csv file should contain the following fields and include a header row.
+
+| Name             | Type | Bytes | Description      | Example |
+|:-----------------|------|-------|:-----------------|--------:|
+| item_id          | int  | 4     | Item ID          | 1       |
+| amplification_id | int  | 4     | Amplification ID | 1       |
+
+The item_id must start from 1, must be contiguous and must not have null values. The binary file only contains the amplification IDs and assumes that the item_ids start from 1 and are contiguous.
+
+##### amplificationtobin
+```
+$ amplificationtobin < amplifications.csv > amplifications.bin
+```
+
+##### amplificationtocsv
+```
+$ amplificationtocsv < amplifications.bin > amplifications.csv
+```
+
+[Return to top](#dataconversioncomponents)
+
### Coverages
***
@@ -325,6 +458,31 @@ $ coveragetocsv < coverages.bin > coverages.csv
[Return to top](#dataconversioncomponents)

+
+### ensemble
+***
+The ensemble file is used for ensemble modelling (multiple views), mapping sample IDs to particular ensemble ID groups. It is an optional file for use with AAL and LEC. It must have the following location and filename;
+* input/ensemble.bin
+
+##### File format
+The csv file should contain the following fields and include a header row.
+
+| Name        | Type | Bytes | Description | Example |
+|:------------|------|-------|:------------|--------:|
+| sidx        | int  | 4     | Sample ID   | 1       |
+| ensemble_id | int  | 4     | Ensemble ID | 1       |
+
+##### ensembletobin
+```
+$ ensembletobin < ensemble.csv > ensemble.bin
+```
+
+##### ensembletocsv
+```
+$ ensembletocsv < ensemble.bin > ensemble.csv
+```
+[Return to top](#dataconversioncomponents)
+
### events
***
@@ -818,6 +976,34 @@ $ periodstocsv < periods.bin > periods.csv
[Return to top](#dataconversioncomponents)

+
+### Quantile
+***
+The quantile binary file contains a list of user-specified quantile floats. The data format is as follows. It is optionally used by the Quantile Event/Period Loss tables and must have the following location and filename;
+
+* input/quantile.bin
+
+#### File format
+The csv file should contain the following fields and include a header row.
+
+| Name     | Type  | Bytes | Description    | Example |
+|:---------|-------|-------|:---------------|--------:|
+| quantile | float | 4     | Quantile float | 0.1     |
+
+All fields must not have null values.
+
+##### quantiletobin
+```
+$ quantiletobin < quantile.csv > quantile.bin
+```
+
+##### quantiletocsv
+```
+$ quantiletocsv < quantile.bin > quantile.csv
+```
+
+[Return to top](#dataconversioncomponents)
+
[Go to 4.5 Stream conversion components section](StreamConversionComponents.md)

[Back to Contents](Contents.md)
diff --git a/docs/pdf/ktools.pdf b/docs/pdf/ktools.pdf
new file mode 100644
index 00000000..5593d010
Binary files /dev/null and b/docs/pdf/ktools.pdf differ