Commit 57e014c

v-kkhuang and casionone authored
[feat][CGS][hive] add security control to prevent location clause usage in hive tasks (#968)
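* #AI commit# Development phase: prohibit use of the LOCATION feature in Hive tasks
* #AI commit# Development phase: prohibit use of the LOCATION feature in Hive tasks; commit the test-report files
* #AI commit# Development phase:
* Optimize the regex for the Hive LOCATION ban
* #AI commit# Development phase: optimize the entrance configuration file
* #AI commit# Development phase: remove printing of code

---------

Co-authored-by: Casion <casionone@gmail.com>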
1 parent 6648906 commit 57e014c

19 files changed

Lines changed: 5777 additions & 2 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ nohup.out
 tools
 
 nul
+/docs/project-knowledge/sessions/
 
 #claude
 .claude

docs/dev-1.19.0-yarn-tag-update/design/hive_location_control_设计.md

Lines changed: 934 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
+# language: en
+Feature: Hive table LOCATION path control
+
+  As a data platform administrator
+  I want to prevent users from specifying the LOCATION parameter in CREATE TABLE statements
+  So that users cannot create tables at arbitrary LOCATION paths, which protects data security
+
+  Background:
+    Given the Entrance service is running
+    And location control is enabled
+
+  # ===== P0: block CREATE TABLE with LOCATION =====
+
+  Scenario: CREATE TABLE without LOCATION (succeeds)
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table (
+        id INT,
+        name STRING
+      )
+      """
+    Then the table is created successfully
+    And no interception log is recorded
+
+  Scenario: CREATE TABLE with LOCATION (blocked)
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table (
+        id INT,
+        name STRING
+      )
+      LOCATION '/user/hive/warehouse/test_table'
+      """
+    Then table creation fails
+    And the error message contains: "Location parameter is not allowed in CREATE TABLE statement"
+    And the audit log records: "sql_type=CREATE_TABLE, location=/user/hive/warehouse/test_table, is_blocked=true"
+
+  # ===== P0: feature switch =====
+
+  Scenario: CREATE TABLE with LOCATION is allowed when location control is disabled
+    Given location control is disabled
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table (
+        id INT,
+        name STRING
+      )
+      LOCATION '/any/path/test_table'
+      """
+    Then the table is created successfully
+    And no location interception is performed
+
+  # ===== P1: CTAS statements =====
+
+  Scenario: CTAS without LOCATION (succeeds)
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table AS
+      SELECT * FROM source_table
+      """
+    Then the table is created successfully
+    And no interception log is recorded
+
+  Scenario: CTAS with LOCATION (blocked)
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table
+      LOCATION '/user/hive/warehouse/test_table'
+      AS
+      SELECT * FROM source_table
+      """
+    Then table creation fails
+    And the error message contains: "Location parameter is not allowed in CREATE TABLE statement"
+    And the audit log records: "sql_type=CTAS, location=/user/hive/warehouse/test_table, is_blocked=true"
+
+  # ===== Out of scope: ALTER TABLE =====
+
+  Scenario: ALTER TABLE SET LOCATION (not blocked)
+    When the user executes the SQL:
+      """
+      ALTER TABLE test_table SET LOCATION '/user/hive/warehouse/new_table'
+      """
+    Then the operation is not blocked
+    And the execution result is determined by the Hive engine
+
+  # ===== Edge cases =====
+
+  Scenario: CREATE TEMPORARY TABLE with LOCATION (blocked)
+    When the user executes the SQL:
+      """
+      CREATE TEMPORARY TABLE temp_table (
+        id INT
+      )
+      LOCATION '/tmp/hive/temp_table'
+      """
+    Then table creation fails
+    And the error message contains: "Location parameter is not allowed in CREATE TABLE statement"
+
+  Scenario: CREATE EXTERNAL TABLE with LOCATION (blocked)
+    When the user executes the SQL:
+      """
+      CREATE EXTERNAL TABLE external_table (
+        id INT,
+        name STRING
+      )
+      LOCATION '/user/hive/warehouse/external_table'
+      """
+    Then table creation fails
+    And the error message contains: "Location parameter is not allowed in CREATE TABLE statement"
+
+  Scenario: Multi-line SQL with LOCATION (blocked)
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table
+      (
+        id INT COMMENT 'ID',
+        name STRING COMMENT 'Name'
+      )
+      COMMENT 'Test table'
+      LOCATION '/user/hive/warehouse/test_table'
+      """
+    Then table creation fails
+    And the error message contains: "Location parameter is not allowed in CREATE TABLE statement"
+
+  # ===== Performance scenarios =====
+
+  Scenario: Many concurrent CREATE TABLE operations (without LOCATION)
+    When 100 users concurrently execute:
+      """
+      CREATE TABLE test_table (id INT)
+      """
+    Then all operations succeed
+    And the performance impact is < 3%
+
+  Scenario: Many concurrent CREATE TABLE operations (with LOCATION)
+    When 100 users concurrently execute:
+      """
+      CREATE TABLE test_table (id INT) LOCATION '/any/path'
+      """
+    Then all operations are blocked
+    And the performance impact is < 3%
+
+  # ===== Error-handling scenarios =====
+
+  Scenario: SQL syntax error
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table (
+        id INT
+      ) LOCATIO '/invalid/path'
+      """
+    Then SQL parsing fails
+    And a syntax error message is returned
+
+  Scenario: Empty SQL statement
+    When the user executes an empty SQL statement
+    Then no location check is performed
+    And an error indicating that the SQL is empty is returned
+
+  # ===== Audit log completeness =====
+
+  Scenario: Every blocked operation has an audit log entry
+    Given the user performs the following operations:
+      | SQL type     | Location path               |
+      | CREATE_TABLE | /user/hive/warehouse/table1 |
+      | CREATE_TABLE | /invalid/path               |
+      | CTAS         | /user/data/table2           |
+    When the audit log is inspected
+    Then every blocked operation has a log entry
+    And each log entry contains: timestamp, user, sql_type, location_path, is_blocked, reason
+
+  # ===== Error message clarity =====
+
+  Scenario: The error message includes the original SQL
+    When the user executes the SQL:
+      """
+      CREATE TABLE test_table (id INT) LOCATION '/user/critical/data'
+      """
+    Then table creation fails
+    And the error message contains: "Please remove the LOCATION clause and retry"
+    And the error message contains a fragment of the original SQL
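
The scenarios in this feature file describe a check whose behavior can be approximated in a few lines. The following is a minimal Python sketch, not the code from this commit: the commit message mentions a regex-based check, but the actual patterns are not shown here, so `_CREATE_TABLE`, `_LOCATION`, `_CTAS`, and `check_location` are hypothetical names and patterns chosen only to illustrate the expected outcomes. The audit line format and the error message are copied from the scenarios.

```python
import re
from typing import Optional

# Hypothetical patterns (assumptions, not taken from the commit).
# Matches the start of CREATE [TEMPORARY] [EXTERNAL] TABLE.
_CREATE_TABLE = re.compile(
    r"^\s*CREATE\s+(?:TEMPORARY\s+)?(?:EXTERNAL\s+)?TABLE\b", re.IGNORECASE
)
# Matches LOCATION followed by a single- or double-quoted path.
_LOCATION = re.compile(
    r"\bLOCATION\s+(['\"])(?P<path>.*?)\1", re.IGNORECASE | re.DOTALL
)
# Used only to label the statement as CTAS in the audit line.
_CTAS = re.compile(r"\bAS\s+SELECT\b", re.IGNORECASE)

# Error text quoted from the feature file.
ERROR_MESSAGE = "Location parameter is not allowed in CREATE TABLE statement"


def check_location(sql: str, enabled: bool = True) -> Optional[str]:
    """Return an audit line if the statement would be blocked, else None."""
    if not enabled or not sql or not sql.strip():
        return None  # switch is off, or there is nothing to check
    if not _CREATE_TABLE.search(sql):
        return None  # e.g. ALTER TABLE ... SET LOCATION is out of scope
    match = _LOCATION.search(sql)
    if match is None:
        return None  # CREATE TABLE without a LOCATION clause is allowed
    sql_type = "CTAS" if _CTAS.search(sql) else "CREATE_TABLE"
    return f"sql_type={sql_type}, location={match.group('path')}, is_blocked=true"


if __name__ == "__main__":
    audit = check_location("CREATE TABLE test_table (id INT) LOCATION '/any/path'")
    if audit is not None:
        print(ERROR_MESSAGE)
        print(audit)
```

A regex over raw SQL text is only an approximation: a string literal or comment that happens to contain `LOCATION '...'` would also match, so a production check would likely need to strip comments and literals first or rely on the Hive parser.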
