
Commit 2a2990b

Merge pull request #82 from HYPERNETS/metadata_db
Metadata db
2 parents 34963a2 + 1925052 commit 2a2990b

28 files changed (+617 / -238 lines)

docs/sphinx/content/users/user_processor.rst

Lines changed: 7 additions & 7 deletions
@@ -5,8 +5,8 @@
 
 .. _user_processor:
 
-Automated Processing User Guide
-===============================
+User Guide - Automated Processing
+=================================
 
 This section provides a user guide for running the `hypernets_processor` module as an automated processor of incoming field data. In this scenario, a set of field hypstar systems are regularly syncing raw data to a server. Running on this server, the `hypernets_processor` processes the data and adds it to an archive that can be accessed through a user portal.
 

@@ -74,22 +74,22 @@ As well as defining required job configuration information, the job configuratio
 For all jobs, it is important relevant metadata be added to the metadata database, so it can be added to the data products.
 
 .. _user_processor-scheduler:
-Run Scheduler
--------------
+Running Job Scheduler
+---------------------
 
 Once setup the automated processing scheduler can be started with::
 
-    $ hypernets_processor_scheduler
+    $ hypernets_scheduler
 
 To see options, try::
 
-    $ hypernets_processor_scheduler --help
+    $ hypernets_scheduler --help
 
 All jobs are run regularly, processing any new data synced to the server from the field since the last run. The run schedule is defined in the scheduler config, which may be edited as::
 
     $ vim <installation_directory>/hypernets_processor/etc/scheduler.config
 
-Processed products are added to the data archive and listed in the archive database. Any anomolies are add to the anomoly database. More detailed job related log information is added to the job log file. Summary log information for all jobs is added to the processor log file.
+Processed products are added to the data archive and listed in the archive database. Any anomalies are add to the anomaly database. More detailed job related log information is added to the job log file. Summary log information for all jobs is added to the processor log file.
 
 To amend the list of scheduled jobs, edit the list of job configuration files listed in the processor jobs file as::
 

docs/sphinx/content/users/users.rst

Lines changed: 0 additions & 3 deletions
@@ -8,9 +8,6 @@
 User Guide
 ==========
 
-Usage
------
-
 There are two main use cases for the hypernets_processor package. The primary function of the software is the automated preparation of data retrieved from network sites for distribution to users. Additionally, the software may also be used for ad-hoc processing of particular field acquisitions, for example for testing instrument operation in the field. For information on each these use cases click on one of the following links:
 

hypernets_processor/cli/hypernets_processor_cli.py

Lines changed: 0 additions & 60 deletions
This file was deleted.

hypernets_processor/cli/hypernets_scheduler_cli.py renamed to hypernets_processor/cli/scheduler_cli.py

Lines changed: 3 additions & 3 deletions
@@ -1,9 +1,9 @@
 """
-scheduler for hypernets_processor jobs cli
+cli for job scheduler
 """
 
 from hypernets_processor.version import __version__
-from hypernets_processor.cli.hypernets_scheduler_main import main
+from hypernets_processor.main.scheduler_main import main
 from hypernets_processor.utils.config import SCHEDULER_CONFIG_PATH
 import argparse
 

@@ -41,7 +41,7 @@ def configure_parser():
 
 def cli():
     """
-    Command line interface function for hypernets_scheduler_main
+    Command line interface function for scheduler_main
     """
 
     # run main
Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
"""
hypernets_processor cli
"""

from hypernets_processor.version import __version__
from hypernets_processor.utils.config import (
    PROCESSOR_CONFIG_PATH,
    JOB_CONFIG_TEMPLATE_PATH,
    PROCESSOR_WATER_DEFAULTS_CONFIG_PATH,
    PROCESSOR_LAND_DEFAULTS_CONFIG_PATH
)
from hypernets_processor.utils.cli import configure_std_parser
from hypernets_processor.utils.config import read_config_file
from hypernets_processor.main.sequence_processor_main import main
from datetime import datetime as dt
import sys
import os


'''___Authorship___'''
__author__ = "Sam Hunt"
__created__ = "26/3/2020"
__version__ = __version__
__maintainer__ = "Sam Hunt"
__email__ = "sam.hunt@npl.co.uk"
__status__ = "Development"


def configure_parser():
    """
    Configure parser

    :return: parser
    :rtype: argparse.ArgumentParser
    """

    description = "Tool for processing Hypernets Land and Water Network hyperspectral field data"

    # Create standard parser
    parser = configure_std_parser(description=description)

    # Add specific arguments
    parser.add_argument("-i", "--input-directory", action="store",
                        help="Directory of input data")
    parser.add_argument("-o", "--output-directory", action="store",
                        help="Directory to write output data to")
    parser.add_argument("-n", "--network", action="store", choices=["land", "water"],
                        help="Network to process file for")
    # parser.add_argument("--plot", action="store_true",
    #                     help="Generate plots of processed data")
    parser.add_argument("--write-all", action="store_true",
                        help="Write all products at intermediate data processing levels before final product")
    parser.add_argument("-j", "--job-config", action="store",
                        help="Use instead of above arguments to specify job with configuration file")
    return parser


parser = configure_parser()
parsed_args = parser.parse_args()


if parsed_args.job_config and (
        parsed_args.input_directory or parsed_args.output_directory
        or parsed_args.network or parsed_args.write_all
):
    print("-j is mutually exclusive with other input arguments")
    sys.exit(2)


def cli():
    """
    Command line interface to sequence_processor_main for ad-hoc job processing
    """

    # If job config specified use
    if parsed_args.job_config:
        tmp_job = False
        job_config_path = parsed_args.job_config

    # Else build and write temporary job config from command line arguments
    else:
        tmp_job = True

        processor_defaults = PROCESSOR_WATER_DEFAULTS_CONFIG_PATH
        if parsed_args.network == "land":
            processor_defaults = PROCESSOR_LAND_DEFAULTS_CONFIG_PATH

        job_config = read_config_file([JOB_CONFIG_TEMPLATE_PATH, processor_defaults])

        if parsed_args.input_directory is not None:
            job_config["Input"]["raw_data_directory"] = os.path.abspath(parsed_args.input_directory)
        else:
            print("-i required")
            sys.exit(2)

        if parsed_args.output_directory is not None:
            job_config["Output"]["archive_directory"] = os.path.abspath(parsed_args.output_directory)
        else:
            print("-o required")
            sys.exit(2)

        if parsed_args.write_all:
            for key in job_config["Output"].keys():
                if key[:5] == "write":
                    job_config["Output"][key] = "True"

        job_config["Log"]["log_path"] = os.path.abspath(parsed_args.log) if parsed_args.log is not None else ""
        job_config["Log"]["verbose"] = str(parsed_args.verbose) if parsed_args.verbose is not None else ""
        job_config["Log"]["quiet"] = str(parsed_args.quiet) if parsed_args.verbose is not None else ""

        job_config["Job"]["job_name"] = "run_" + dt.now().strftime("%Y%m%dT%H%M%S")
        home_directory = os.path.expanduser("~")
        job_config["Job"]["job_working_directory"] = os.path.join(home_directory, ".hypernets", "tmp")
        job_config_path = os.path.join(
            job_config["Job"]["job_working_directory"],
            job_config["Job"]["job_name"] + ".config"
        )

        os.makedirs(job_config["Job"]["job_working_directory"], exist_ok=True)
        with open(job_config_path, "w") as f:
            job_config.write(f)

    # run main
    main(processor_config_path=PROCESSOR_CONFIG_PATH, job_config_path=job_config_path, to_archive=False)

    if tmp_job:
        os.remove(job_config_path)

    return None


if __name__ == "__main__":
    pass
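
The else branch above assembles a temporary job configuration by reading the job template and the per-network defaults in sequence; the later section/key assignments and the job_config.write(f) call are consistent with a configparser-style object. Below is a minimal, self-contained sketch of that assumed merge behaviour, using in-memory configs rather than the real template files; the section and key names are illustrative only, not taken from this commit.

import configparser

# Stand-ins for the contents of JOB_CONFIG_TEMPLATE_PATH and a PROCESSOR_*_DEFAULTS_CONFIG_PATH
template = "[Input]\nraw_data_directory =\n\n[Output]\nwrite_l1a = False\n"
defaults = "[Output]\nwrite_l1a = True\n"

# Assumed behaviour of read_config_file([template, defaults]):
# read the configs in order, with later ones overriding overlapping keys
job_config = configparser.ConfigParser()
job_config.read_string(template)
job_config.read_string(defaults)

job_config["Input"]["raw_data_directory"] = "/data/raw/example"  # CLI arguments then override further
print(job_config["Output"]["write_l1a"])  # -> True (the defaults won over the template)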

hypernets_processor/cli/setup_processor_cli.py

Lines changed: 1 addition & 1 deletion
@@ -87,7 +87,7 @@ def cli():
         default="sqlite:///"+os.path.join(settings["working_directory"], db_fmt+".db"),
     )
 
-    settings["log_path"] = os.path.join(settings["working_directory"], "processor.log")
+    settings["log_path"] = os.path.join(settings["working_directory"], "scheduler.log")
 
     main(settings)

hypernets_processor/context.py

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@ def __init__(self, processor_config=None, job_config=None, logger=None):
         self.config_values = {}
         self.logger = logger
         self.metadata_db = None
-        self.anomoly_db = None
+        self.anomaly_db = None
         self.archive_db = None
 
         # Unpack processor_config to set relevant attributes

@@ -49,7 +49,7 @@ def __init__(self, processor_config=None, job_config=None, logger=None):
             job_config, protected_values=PROCESSOR_CONFIG_PROTECTED_VALUES
         )
 
-        # Connect to anomoly databases
+        # Connect to databases
         db_fmts = DB_DICT_DEFS.keys()
         for db_fmt in db_fmts:
             if db_fmt + "_db_url" in self.get_config_names():
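
The hunk stops where the connection loop begins. The sketch below shows the pattern it points at: pick up each "<format>_db_url" value from the configuration and attach the resulting connection to the context. It assumes SQLAlchemy-style connection URLs like the sqlite:/// value built in setup_processor_cli.py above; the Context class shown here and the use of create_engine are illustrative only, not the project's confirmed implementation.

from sqlalchemy import create_engine  # stand-in for whatever DB layer the project actually uses

DB_DICT_DEFS = {"metadata": {}, "anomaly": {}, "archive": {}}


class Context:
    """Sketch of the connection loop the hunk above leads into."""

    def __init__(self, config_values):
        self.config_values = config_values
        self.metadata_db = None
        self.anomaly_db = None
        self.archive_db = None
        for db_fmt in DB_DICT_DEFS.keys():
            url_name = db_fmt + "_db_url"
            if url_name in self.config_values:
                # e.g. the "anomaly_db_url" value becomes self.anomaly_db
                setattr(self, db_fmt + "_db", create_engine(self.config_values[url_name]))


ctx = Context({"anomaly_db_url": "sqlite:///anomaly.db"})
print(ctx.anomaly_db)  # Engine(sqlite:///anomaly.db)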

hypernets_processor/data_io/format/databases.py

Lines changed: 8 additions & 3 deletions
@@ -40,8 +40,13 @@
 # Metadata Database
 METADATA_DB = {}
 
-# Anolomy Database
-ANOMOLY_DB = {}
+# Anomaly Database
+ANOMALY_DB = {"anomalies": {"columns": {"anomaly": {"type": str},
+                                        "raw_product_name": {"type": str},
+                                        "site": {"type": str}
+                                        }
+                            }
+              }
 
 # Archive Database
 ARCHIVE_DB = {"products": {"columns": {"product_name": {"type": str},

@@ -56,5 +61,5 @@
 # --------------------
 
 DB_DICT_DEFS = {"metadata": METADATA_DB,
-                "anomoly": ANOMOLY_DB,
+                "anomaly": ANOMALY_DB,
                 "archive": ARCHIVE_DB}
