[Example] Add preformer for precipitation nowcasting by EricKing19 · Pull Request #976 · PaddlePaddle/PaddleScience

EricKing19 · 2024-08-19T01:59:49Z

PR types

Others

PR changes

Others

Describe

add Preformer model for precipitation nowcasting
add docs for Preformer
add examples for Preformer

paddle-bot · 2024-08-19T01:59:54Z

Thanks for your contribution!

CLAassistant · 2024-08-19T01:59:55Z

All committers have signed the CLA.

HydrogenSulfate

Thanks for submitting the PR. There are a few small issues; please take a look.

HydrogenSulfate · 2024-08-20T06:38:17Z

+
+    ``` sh
+    # Model training
+    python examples/preformer/train.py


Suggested change

-    python examples/preformer/train.py
+    python train.py

HydrogenSulfate · 2024-08-20T06:38:25Z

+
+    ``` sh
+    # Model evaluation
+    python examples/preformer/train.py mode=eval


Suggested change

-    python examples/preformer/train.py mode=eval
+    python train.py mode=eval

HydrogenSulfate · 2024-08-20T06:38:45Z

I suggest renaming the file to main.py.

HydrogenSulfate · 2024-08-20T06:39:19Z

+    # set random seed for reproducibility
+    ppsci.utils.misc.set_random_seed(cfg.seed)
+    # initialize logger
+    logger.init_logger("ppsci", osp.join(cfg.output_dir, "train.log"), "info")
+


HydrogenSulfate · 2024-08-20T06:42:37Z

+                "num_replicas": NUM_GPUS_PER_NODE,
+                "rank": dist.get_rank() % NUM_GPUS_PER_NODE,


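These two parameters should not be needed, and PaddleScience has no corresponding handling logic for them. By default they are set automatically from the number of devices configured in the environment.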

HydrogenSulfate · 2024-08-20T06:54:11Z

+            mon = str("0") + mon
+        day = str(self.time_table[idxs].timetuple().tm_mday)
+        if len(day) == 1:
+            day = str("0") + day


Can str("0") be written simply as "0"? The same applies below.
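
The manual padding in the snippet above can also be avoided altogether. A minimal sketch, assuming the entries of `time_table` are `datetime` objects (as the `timetuple()` call suggests) and that all four fields are meant to be zero-padded; the helper name `date_parts` is hypothetical and not part of the PR:

```python
from datetime import datetime


def date_parts(ts: datetime) -> tuple[str, str, str, str]:
    """Return (year, month, day, hour) as zero-padded strings."""
    # The `02d` format spec pads single-digit values with a leading zero,
    # which replaces the `if len(mon) == 1: mon = "0" + mon` pattern.
    return (
        f"{ts.year:04d}",
        f"{ts.month:02d}",
        f"{ts.day:02d}",
        f"{ts.hour:02d}",
    )


year, mon, day, hour = date_parts(datetime(2020, 7, 4, 6))
print(year, mon, day, hour)  # 2020 07 04 06
```

`ts.strftime("%Y%m%d%H")` would give the concatenated stamp in one call if the separate parts are not needed.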

HydrogenSulfate · 2024-08-20T06:54:48Z

+        r_data = np.load(
+            os.path.join(self.file_path, year, "r_" + year + mon + day + hour + ".npy")
+        )
+        t_data = np.load(
+            os.path.join(self.file_path, year, "t_" + year + mon + day + hour + ".npy")
+        )
+        u_data = np.load(
+            os.path.join(self.file_path, year, "u_" + year + mon + day + hour + ".npy")
+        )
+        v_data = np.load(
+            os.path.join(self.file_path, year, "v_" + year + mon + day + hour + ".npy")
+        )


You can use f-strings to simplify the string concatenation.
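
One way the four near-identical calls could be collapsed, sketched with f-strings. The helper name `variable_paths` is hypothetical; the `r`, `t`, `u`, `v` prefixes and the `<file_path>/<year>/<var>_<stamp>.npy` layout are taken from the snippet above:

```python
import os


def variable_paths(file_path: str, year: str, mon: str, day: str, hour: str) -> dict[str, str]:
    """Build the .npy path for each of the r, t, u and v variables."""
    stamp = f"{year}{mon}{day}{hour}"
    return {
        var: os.path.join(file_path, year, f"{var}_{stamp}.npy")
        for var in ("r", "t", "u", "v")
    }


paths = variable_paths("data", "2020", "07", "04", "06")
print(paths["r"])
```

Each path can then be passed to `np.load` exactly as in the original code, for example inside a loop or a dict comprehension over `paths`.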

HydrogenSulfate · 2024-08-20T06:55:45Z

+hydra:
+  run:
+    # dynamic output directory according to running time and override name
+    dir: outputs_preformer
+  job:
+    name: ${mode} # name of logfile
+    chdir: false # keep current working directory unchanged
+    config:
+      override_dirname:
+        exclude_keys:
+          - TRAIN.checkpoint_path
+          - TRAIN.trained_model_path
+          - EVAL.trained_model_path
+          - mode
+          - output_dir
+          - log_freq
+  sweep:
+    # output directory for multirun
+    dir: ${hydra.run.dir}
+    subdir: ./
+


Suggested change

-hydra:
-  run:
-    # dynamic output directory according to running time and override name
-    dir: outputs_preformer
-  job:
-    name: ${mode} # name of logfile
-    chdir: false # keep current working directory unchanged
-    config:
-      override_dirname:
-        exclude_keys:
-          - TRAIN.checkpoint_path
-          - TRAIN.trained_model_path
-          - EVAL.trained_model_path
-          - mode
-          - output_dir
-          - log_freq
-  sweep:
-    # output directory for multirun
-    dir: ${hydra.run.dir}
-    subdir: ./
+defaults:
+  - ppsci_default
+  - TRAIN: train_default
+  - TRAIN/ema: ema_default
+  - TRAIN/swa: swa_default
+  - EVAL: eval_default
+  - INFER: infer_default
+  - hydra/job/config/override_dirname/exclude_keys: exclude_keys_default
+  - _self_
+hydra:
+  run:
+    # dynamic output directory according to running time and override name
+    dir: outputs_preformer
+  job:
+    name: ${mode} # name of logfile
+    chdir: false # keep current working directory unchanged
+  sweep:
+    # output directory for multirun
+    dir: ${hydra.run.dir}
+    subdir: ./

HydrogenSulfate · 2024-08-20T06:56:07Z

+
+# model settings
+MODEL:
+  afno:


Since there is only one model, the afno level can be removed.

HydrogenSulfate · 2024-08-20T07:04:21Z

+  afno:
+    input_keys: ["input"]
+    output_keys: ["output"]
+    shape_in: [6, 12, IMG_H, IMG_W]


Suggested change

-    shape_in: [6, 12, IMG_H, IMG_W]
+    shape_in:
+      - 6
+      - 12
+      - ${IMG_H}
+      - ${IMG_W}

HydrogenSulfate · 2024-08-20T11:54:07Z

@EricKing19 I have changed the title; the original "merge code of upstream" was not very suitable.

liaoxin2 · 2024-08-27T02:55:54Z

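+The example uses the preprocessed PEMSD4 and PEMSD8 datasets. PEMSD4 is traffic data from the San Francisco Bay Area, recorded by 307 sensors on 29 roads from January to February 2018. PEMSD8 is traffic data collected by 170 detectors on 8 roads in San Bernardino from July to August 2016.
+
+Both datasets are stored as N x T x 1 matrices that record the flow data for each traffic node over time, where N is the number of traffic nodes and T is the length of the time series. Each dataset is split 7:2:1 into training, validation, and test sets. The example precomputes the mean and standard deviation of the flow data for the subsequent normalization step.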


This example is about precipitation, but this dataset appears to be traffic data. The dataset does not match the code.

HydrogenSulfate · 2024-10-15T16:58:32Z

+Before training or evaluation, please download the dataset files
+
+Before evaluation, please download a pretrained model or train one
+


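Could you briefly describe how to prepare the dataset? For example, how to download it, and how the files are organized after extraction?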

HydrogenSulfate · 2024-10-15T16:58:56Z

+=== "Model training command"
+
+    ``` sh
+    # Model training


Please delete this comment; the tab label above already states that this is the model training command.

HydrogenSulfate · 2024-10-15T16:59:00Z

+=== "Model evaluation command"
+
+    ``` sh
+    # Model evaluation


Same as above; please delete this comment line.

HydrogenSulfate · 2024-10-15T17:00:47Z

+
+    ``` sh
+    # Model evaluation
+    python train.py mode=eval


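Please provide your trained model file here (the .pdparams file is enough). We will upload it to bce so that the pretrained model URL can be specified directly in the command. The weights will then be downloaded and loaded automatically before evaluation, with no extra manual download needed.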

HydrogenSulfate · 2024-10-15T17:05:21Z

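+#### 3.2.6 Model export
+
+By setting the `eval_during_train` and `eval_freq` parameters of `ppsci.solver.Solver`, the model parameters that perform best on the validation set can be saved automatically.
+
+``` py linenums="100" title="examples/preformer/train.py"
+--8<--
+examples/preformer/train.py:158:158
+--8<--
+```
+


The model export section does not need to appear in the document; please delete it.

Please add the model export and inference functions, def export and def inference, to examples\preformer\main.py, following this reference: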

PaddleScience/examples/allen_cahn/allen_cahn_piratenet.py

Lines 235 to 269 in 83f6739

def export(cfg: DictConfig):
    # set model
    model = ppsci.arch.PirateNet(**cfg.MODEL)

    # initialize solver
    solver = ppsci.solver.Solver(model, cfg=cfg)
    # export model
    from paddle.static import InputSpec

    input_spec = [
        {key: InputSpec([None, 1], "float32", name=key) for key in model.input_keys},
    ]
    solver.export(input_spec, cfg.INFER.export_path, with_onnx=False)


def inference(cfg: DictConfig):
    from deploy.python_infer import pinn_predictor

    predictor = pinn_predictor.PINNPredictor(cfg)
    data = sio.loadmat(cfg.DATA_PATH)
    u_ref = data["usol"].astype(dtype)  # (nt, nx)
    t_star = data["t"].flatten().astype(dtype)  # [nt, ]
    x_star = data["x"].flatten().astype(dtype)  # [nx, ]
    tx_star = misc.cartesian_product(t_star, x_star).astype(dtype)

    input_dict = {"t": tx_star[:, 0:1], "x": tx_star[:, 1:2]}
    output_dict = predictor.predict(input_dict, cfg.INFER.batch_size)
    # mapping data to cfg.INFER.output_keys
    output_dict = {
        store_key: output_dict[infer_key]
        for store_key, infer_key in zip(cfg.MODEL.output_keys, output_dict.keys())
    }
    u_pred = output_dict["u"].reshape([len(t_star), len(x_star)])

    plot(t_star, x_star, u_ref, u_pred, cfg.output_dir)

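Please add the model export and model inference commands after the `=== "模型评估命令"` (model evaluation command) tab at the beginning of the document.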
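
Once `export` and `inference` exist, the entry point needs to route a `mode` override to them. A minimal sketch of such a dispatch, under stated assumptions: the mode names `export` and `infer`, the `HANDLERS` table, and the example `export_path` value are illustrative and not confirmed by this PR; the four functions are placeholders that only return a message; and the Hydra decorator that would normally supply `cfg` is omitted so the sketch runs on its own.

```python
from types import SimpleNamespace


# Placeholder entry points; in the real example each would build the model
# and solver from cfg instead of returning a string.
def train(cfg):
    return f"training, output_dir={cfg.output_dir}"


def evaluate(cfg):
    return f"evaluating, output_dir={cfg.output_dir}"


def export(cfg):
    return f"exporting to {cfg.INFER.export_path}"


def inference(cfg):
    return f"running inference with batch_size={cfg.INFER.batch_size}"


# Hypothetical mapping from the `mode` override to an entry point.
HANDLERS = {"train": train, "eval": evaluate, "export": export, "infer": inference}


def main(cfg):
    """Run the entry point selected by cfg.mode, rejecting unknown modes."""
    if cfg.mode not in HANDLERS:
        raise ValueError(f"cfg.mode should be one of {sorted(HANDLERS)}, got {cfg.mode!r}")
    return HANDLERS[cfg.mode](cfg)


cfg = SimpleNamespace(
    mode="export",
    output_dir="outputs_preformer",
    INFER=SimpleNamespace(export_path="./inference/preformer", batch_size=1),
)
print(main(cfg))  # exporting to ./inference/preformer
```

With a dispatch like this, the documented commands would take the form `python main.py mode=export` and `python main.py mode=infer`, assuming those mode names are the ones adopted.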

HydrogenSulfate · 2024-10-15T17:19:31Z

+        return latent
+
+
+class Mid_Xnet(nn.Layer):


I suggest renaming Mid_Xnet to MidXNet, which follows the naming convention more closely.

HydrogenSulfate · 2024-10-15T17:19:46Z

+    def forward(self, hid, enc1=None):
+        for i in range(0, len(self.dec)):
+            hid = self.dec[i](hid)
+        # Y = self.dec[-1](torch.cat([hid, enc1], dim=1))


Can this commented-out line be deleted?

HydrogenSulfate · 2024-10-15T17:20:07Z

+        for m in range(self.sq_length):
+            x.append(self.load_data(global_idx + m))
+        for n in range(self.sq_length):
+            # y.append(self.load_data(global_idx+n))


Can this commented-out line be deleted?

HydrogenSulfate · 2024-10-15T17:20:56Z

+            # y.append(self.load_data(global_idx+n))
+            y.append(self.precipitation["tp"][global_idx + self.sq_length + n])
+        # x = self.Normalize(x)
+        x, y = self.RandomCrop(x, y)


Should self.RandomCrop be self._random_crop?

HydrogenSulfate · 2024-10-15T17:21:37Z

+    def _random_crop(self, x, y):
+        if isinstance(self.size, numbers.Number):
+            self.size = (int(self.size), int(self.size))
+        th, tw = self.size
+        h, w = y[0].shape[-2], y[0].shape[-1]
+        x1 = random.randint(0, w - tw)
+        y1 = random.randint(0, h - th)
+
+        for i in range(len(x)):
+            x[i] = self.crop(x[i], y1, x1, y1 + th, x1 + tw)
+        for i in range(len(y)):
+            y[i] = self.crop(y[i], y1, x1, y1 + th, x1 + tw)
+
+        return x, y
+
+    def crop(self, im, x_start, y_start, x_end, y_end):
+        if len(im.shape) == 3:
+            return im[:, x_start:x_end, y_start:y_end]
+        else:
+            return im[x_start:x_end, y_start:y_end]


Non-public methods should be prefixed with an underscore:

Suggested change

-    def _random_crop(self, x, y):
-        if isinstance(self.size, numbers.Number):
-            self.size = (int(self.size), int(self.size))
-        th, tw = self.size
-        h, w = y[0].shape[-2], y[0].shape[-1]
-        x1 = random.randint(0, w - tw)
-        y1 = random.randint(0, h - th)
-
-        for i in range(len(x)):
-            x[i] = self.crop(x[i], y1, x1, y1 + th, x1 + tw)
-        for i in range(len(y)):
-            y[i] = self.crop(y[i], y1, x1, y1 + th, x1 + tw)
-
-        return x, y
-
-    def crop(self, im, x_start, y_start, x_end, y_end):
-        if len(im.shape) == 3:
-            return im[:, x_start:x_end, y_start:y_end]
-        else:
-            return im[x_start:x_end, y_start:y_end]
+    def _random_crop(self, x, y):
+        if isinstance(self.size, numbers.Number):
+            self.size = (int(self.size), int(self.size))
+        th, tw = self.size
+        h, w = y[0].shape[-2], y[0].shape[-1]
+        x1 = random.randint(0, w - tw)
+        y1 = random.randint(0, h - th)
+
+        for i in range(len(x)):
+            x[i] = self._crop(x[i], y1, x1, y1 + th, x1 + tw)
+        for i in range(len(y)):
+            y[i] = self._crop(y[i], y1, x1, y1 + th, x1 + tw)
+
+        return x, y
+
+    def _crop(self, im, x_start, y_start, x_end, y_end):
+        if len(im.shape) == 3:
+            return im[:, x_start:x_end, y_start:y_end]
+        else:
+            return im[x_start:x_end, y_start:y_end]

HydrogenSulfate · 2024-10-16T04:15:32Z

@EricKing19 Please also resolve the merge conflicts while you are at it.

merge code of upstream

b7e0216

paddle-bot Bot added the contributor label Aug 19, 2024

HydrogenSulfate requested changes Aug 20, 2024

View reviewed changes

HydrogenSulfate changed the title ~~merge code of upstream~~ [Example] Add preformer for precipitation nowcasting Aug 20, 2024

luotao1 self-assigned this Aug 21, 2024

liaoxin2 reviewed Aug 27, 2024

View reviewed changes

EricKing19 added 6 commits October 15, 2024 10:39

Update preformer.md

9092f86

Update preformer.md

038b3fc

Update train.yaml

52adb6a

Update and rename train.py to main.py

c110ae4

Update era5sq_dataset.py

88eda6a

Update era5sq_dataset.py

b99da76

HydrogenSulfate requested changes Oct 16, 2024

View reviewed changes

EricKing19 and others added 2 commits December 30, 2024 18:01

Update preformer.md

4511659

Merge remote-tracking branch 'upstream/develop' into pr/976

945ba45

HydrogenSulfate closed this Jan 5, 2026

5 participants
