@@ -9,14 +9,15 @@ The APIs in the paddle.distributed directory support large-scale distributed training in the PaddlePaddle framework
 - :ref:`Environment configuration and launch management <02>`
 - :ref:`Data loading <03>`
 - :ref:`Collective communication algorithm APIs <04>`
-- :ref:`RPC APIs <05>`
+- :ref:`Advanced Stream collective communication APIs <05>`
+- :ref:`RPC APIs <06>`

 .. _01:

 Fleet distributed high-level APIs
 :::::::::::::::::::::::::::::::::

-paddle.distributed.fleet is the unified entry point for distributed training APIs and is used to configure distributed training.
+``paddle.distributed.fleet`` is the unified entry point for distributed training APIs and is used to configure distributed training.

 .. csv-table::
    :header: "API name", "Functionality"
@@ -53,6 +54,8 @@ ``paddle.distributed.fleet`` is the unified entry point for distributed training
    ":ref:`spawn <cn_api_distributed_spawn>`", "Launch distributed training processes; only the collective communication architecture is supported"
    ":ref:`get_rank <cn_api_distributed_get_rank>`", "Get the rank of the current process"
    ":ref:`get_world_size <cn_api_distributed_get_world_size>`", "Get the number of processes"
+   ":ref:`new_group <cn_api_distributed_new_group>`", "Create a distributed communication group"
+   ":ref:`destroy_process_group <cn_api_distributed_destroy_process_group>`", "Destroy a distributed communication group"

 .. _03:

@@ -69,28 +72,56 @@ ``paddle.distributed.fleet`` is the unified entry point for distributed training

 .. _04:

-Collective communication algorithm APIs
+Collective communication APIs
 :::::::::::::::::::::::::::::::::::::::

-Perform computation on tensors or objects held by a process group spanning multiple devices in a cluster.
+Perform computation on tensors or objects held by a process group spanning multiple devices in a cluster, including reduce, gather, broadcast, scatter, and more.

 .. csv-table::
    :header: "API name", "Functionality"
    :widths: 20, 50

-
-   ":ref:`reduce <cn_api_distributed_reduce>`", "Reduce tensors across the process group and return the result to a specified process"
-   ":ref:`ReduceOP <cn_api_distributed_ReduceOp>`", "Specify the element-wise reduction operation"
-   ":ref:`all_reduce <cn_api_distributed_all_reduce>`", "Reduce tensors across the process group and broadcast the result to every process"
-   ":ref:`all_gather <cn_api_distributed_all_gather>`", "Gather tensors across the process group and broadcast the result to every process"
-   ":ref:`all_gather_object <cn_api_distributed_all_gather_object>`", "Gather objects across the process group and broadcast the result to every process"
-   ":ref:`broadcast <cn_api_distributed_broadcast>`", "Broadcast one tensor to every process"
-   ":ref:`scatter <cn_api_distributed_scatter>`", "Scatter tensors to every process"
-   ":ref:`split <cn_api_distributed_split>`", "Split parameters across multiple devices"
-   ":ref:`barrier <cn_api_distributed_barrier>`", "Synchronization barrier: block until all processes in the group are synchronized"
+   ":ref:`ReduceOp <cn_api_distributed_ReduceOp>`", "Type of the reduction operation"
+   ":ref:`reduce <cn_api_distributed_reduce>`", "Reduce tensors across the process group, then send the result to a specified process"
+   ":ref:`all_reduce <cn_api_distributed_all_reduce>`", "Reduce tensors across the process group, then send the result to every process"
+   ":ref:`all_gather <cn_api_distributed_all_gather>`", "Gather tensors across the process group, then send the result to every process"
+   ":ref:`all_gather_object <cn_api_distributed_all_gather_object>`", "Gather objects across the process group, then send the result to every process"
+   ":ref:`alltoall <cn_api_distributed_alltoall>`", "Scatter a list of tensors to every process and gather the results"
+   ":ref:`alltoall_single <cn_api_distributed_alltoall_single>`", "Scatter a single tensor to every process and gather the results"
+   ":ref:`broadcast <cn_api_distributed_broadcast>`", "Send one tensor to every process"
+   ":ref:`scatter <cn_api_distributed_scatter>`", "Scatter a list of tensors across the processes"
+   ":ref:`reduce_scatter <cn_api_distributed_reduce_scatter>`", "Reduce a list of tensors, then scatter the reduced results across the processes"
+   ":ref:`isend <cn_api_distributed_isend>`", "Asynchronously send a tensor to a specified process"
+   ":ref:`irecv <cn_api_distributed_irecv>`", "Asynchronously receive a tensor from a specified process"
+   ":ref:`send <cn_api_distributed_send>`", "Send a tensor to a specified process"
+   ":ref:`recv <cn_api_distributed_recv>`", "Receive a tensor from a specified process"
+   ":ref:`barrier <cn_api_distributed_barrier>`", "Synchronization barrier: block until all processes in the group are synchronized"

 .. _05:

+Advanced Stream collective communication APIs
+:::::::::::::::::::::::::::::::::::::::::::::
+
+``paddle.distributed.stream`` builds on the collective communication APIs, offering more unified semantics and finer-grained control over computation streams, which can help improve performance in certain scenarios.
+
+.. csv-table::
+   :header: "API name", "Functionality"
+   :widths: 25, 50
+
+
+   ":ref:`stream.reduce <cn_api_distributed_stream_reduce>`", "Reduce tensors across the process group, then send the result to a specified process"
+   ":ref:`stream.all_reduce <cn_api_distributed_stream_all_reduce>`", "Reduce tensors across the process group, then send the result to every process"
+   ":ref:`stream.all_gather <cn_api_distributed_stream_all_gather>`", "Gather tensors across the process group, then send the result to every process"
+   ":ref:`stream.alltoall <cn_api_distributed_stream_alltoall>`", "Scatter a list of tensors to every process and gather the results"
+   ":ref:`stream.alltoall_single <cn_api_distributed_stream_alltoall_single>`", "Scatter a single tensor to every process and gather the results"
+   ":ref:`stream.broadcast <cn_api_distributed_stream_broadcast>`", "Send one tensor to every process"
+   ":ref:`stream.scatter <cn_api_distributed_stream_scatter>`", "Scatter a tensor across the processes"
+   ":ref:`stream.reduce_scatter <cn_api_distributed_stream_reduce_scatter>`", "Reduce a list of tensors, then scatter the reduced results across the processes"
+   ":ref:`stream.send <cn_api_distributed_stream_send>`", "Send a tensor to a specified process"
+   ":ref:`stream.recv <cn_api_distributed_stream_recv>`", "Receive a tensor from a specified process"
+
+.. _06:
+
 RPC APIs
 ::::::::::::::::::::::::::

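For readers new to collectives, the row descriptions above can be made concrete with a toy, single-process Python model of the semantics. This sketch deliberately does not use Paddle: each list element stands in for the data held by one process (one rank), and the helper names are hypothetical illustrations, not the real `paddle.distributed` signatures.

```python
# Toy, single-process model of collective-communication semantics.
# Each element of the input list represents the data held by one rank;
# the returned list gives each rank's result after the collective.

def all_reduce(ranks):
    """Sum every rank's value; every rank receives the total."""
    total = sum(ranks)
    return [total] * len(ranks)

def all_gather(ranks):
    """Every rank receives the full list of all ranks' values."""
    return [list(ranks) for _ in ranks]

def reduce_scatter(ranks):
    """Each rank holds one value per peer; rank i receives the sum of
    every rank's i-th value (reduce, then scatter the results)."""
    world = len(ranks)
    return [sum(ranks[r][i] for r in range(world)) for i in range(world)]

def alltoall(ranks):
    """Each rank holds one value per peer; rank i receives every rank's
    i-th value (a distributed transpose)."""
    world = len(ranks)
    return [[ranks[r][i] for r in range(world)] for i in range(world)]

if __name__ == "__main__":
    print(all_reduce([1, 2, 3]))     # [6, 6, 6]
    print(all_gather([1, 2, 3]))     # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
    data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(reduce_scatter(data))      # [12, 15, 18]
    print(alltoall(data))            # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

The real APIs operate on tensors exchanged between actual processes (launched, for example, via `paddle.distributed.launch`), but the input/output relationships mirror this model.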