# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: 'CROP: Compact Reshaped Observation Processing'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Philipp
    family-names: Altmann
    email: philipp.altmann@ifi.lmu.de
    affiliation: LMU Munich
    orcid: 'https://orcid.org/0000-0003-1134-176X'
identifiers:
  - type: doi
    value: 10.24963/ijcai.2023/380
repository-code: 'https://github.com/philippaltmann/CROP'
abstract: >-
  The safe application of reinforcement learning (RL)
  requires generalization from limited training data to
  unseen scenarios. Yet, fulfilling tasks under changing
  circumstances is a key challenge in RL. Current
  state-of-the-art approaches for generalization apply data
  augmentation techniques to increase the diversity of
  training data. Even though this prevents overfitting to
  the training environment(s), it hinders policy
  optimization. Crafting a suitable observation, only
  containing crucial information, has been shown to be a
  challenging task itself. To improve data efficiency and
  generalization capabilities, we propose Compact Reshaped
  Observation Processing (CROP) to reduce the state
  information used for policy optimization. By providing
  only relevant information, overfitting to a specific
  training layout is precluded and generalization to unseen
  environments is improved. We formulate three CROPs that
  can be applied to fully observable observation- and
  action-spaces and provide a methodical foundation. We
  empirically show the improvements of CROP in a
  distributionally shifted safety gridworld. We furthermore
  provide benchmark comparisons to full observability and
  data augmentation in two different-sized procedurally
  generated mazes.
keywords:
  - 'reinforcement learning'
  - 'robustness'
license: MIT
preferred-citation:
  type: conference-paper
  authors:
    - given-names: Philipp
      family-names: Altmann
      email: philipp.altmann@ifi.lmu.de
      affiliation: LMU Munich
      orcid: 'https://orcid.org/0000-0003-1134-176X'
    - given-names: Fabian
      family-names: Ritz
      affiliation: LMU Munich
    - given-names: Leonard
      family-names: Feuchtinger
      affiliation: LMU Munich
    - given-names: Jonas
      family-names: Nüßlein
      affiliation: LMU Munich
    - given-names: Claudia
      family-names: Linnhoff-Popien
      affiliation: LMU Munich
    - given-names: Thomy
      family-names: Phan
      affiliation: LMU Munich
  title: "CROP: Towards Distributional-Shift Robust Reinforcement Learning Using Compact Reshaped Observation Processing"
  year: 2023
  month: 8
  collection-title: "Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, {IJCAI-23}"
  start: 3414  # First page number
  end: 3422  # Last page number
  doi: "10.24963/ijcai.2023/380"
  url: 'https://doi.org/10.24963/ijcai.2023/380'
  note: 'Main Track'