Problem
The Java Streams skill should treat collect(Collectors.toCollection(...)) as a collector choice that needs justification. It is not always wrong, but it is more specific than ordinary collection, so the agent should verify that the concrete implementation matters and either simplify the collector or explain the reason in code.
This is eval-worthy because the code can be functionally correct while still being unclear: a reader cannot tell whether HashSet, ArrayList, or LinkedHashSet was chosen for mutability, ordering, membership checks, or by habit.
Code before the prompt was executed
The code had several toCollection(...) uses without comments. Some were justified but unexplained:
List<Card> normalized = boardCards.stream()
.map(card -> normalize(card, context, config))
.flatMap(Optional::stream)
.filter(card -> isTerminal(card, config))
.collect(Collectors.toCollection(ArrayList::new));
Set<String> archivedListIds = context.lists().values().stream()
.filter(BoardList::closed)
.map(BoardList::id)
.collect(Collectors.toCollection(HashSet::new));
Another site required encounter-order-preserving de-duplication, but that reason was not visible:
SequencedSet<Path> unconnectedWorkflowPaths = separateUnconnectedWorkflows
? reportWorkflowPaths(selectedManifest, selection.workflow(), paths.configDir(), true).stream()
.filter(workflow -> selectedWorkflowPaths.stream()
.noneMatch(connected -> PathsEqual.samePath(connected, workflow)))
.collect(Collectors.toCollection(LinkedHashSet::new))
: new LinkedHashSet<>();
The stream-formatting regression test also used toCollection(...) in sample text even though the sample only needed any collector-style terminal operation:
String source = "var labels = card.labels().stream().map(StateNames::normalize)"
+ ".collect(Collectors.toCollection(HashSet::new));";
Prompt that caused the implementation
The user asked:
every time you have collect(Collectors.toCollection anywhere, please also think about why it's necessary and add a comment explaining why it was done that way. for example to me it's not obvious why we need collect(Collectors.toCollection(LinkedHashSet::new)) vs just a regular set (i assume to keep the order? i would make it more obvious by adding a comment anyway) also make this durable
Prompt-produced code before maintainer correction
There was no new weak implementation in response to this prompt; the issue was that existing code and one earlier formatter-test sample used toCollection(...) without making the collector choice self-explanatory.
A first attempt simplified one collector to Collectors.toSet(), but that violated the repository's CollectorMutability rule because JDK Collectors.toSet() does not make mutability explicit:
Set<String> archivedListIds = context.lists().values().stream()
.filter(BoardList::closed)
.map(BoardList::id)
.collect(Collectors.toSet());
Why the prompt-produced code is bad
The original toCollection(...) uses were not necessarily incorrect. The problem is that they required readers to infer intent:
ArrayList::new was needed because the list is appended to later.
HashSet::new was needed at some sites because the code later mutates the set or performs membership checks.
LinkedHashSet::new was needed because diagnostics de-duplicate while preserving encounter order.
- Other
HashSet::new uses were not actually necessary and could be simplified.
The intermediate Collectors.toSet() simplification was also not suitable in a codebase that enforces explicit collector mutability.
Maintainer-preferred code
Keep toCollection(...) only where the concrete collection type matters, and add a short rationale comment nearby:
List<Card> normalized = boardCards.stream()
.map(card -> normalize(card, context, config))
.flatMap(Optional::stream)
.filter(card -> isTerminal(card, config))
// Keep this mutable: archived terminal cards are appended below before de-duplication.
.collect(Collectors.toCollection(ArrayList::new));
Set<Integer> reserved = manifest.boards().stream()
.filter(board -> !board.boardId().equals(ignoredBoard.boardId()))
.map(ConnectedBoard::serverPort)
// Use a mutable HashSet so file-discovered reservations can be merged below and
// port scanning can use constant-time membership checks.
.collect(Collectors.toCollection(HashSet::new));
SequencedSet<Path> unconnectedWorkflowPaths = separateUnconnectedWorkflows
? reportWorkflowPaths(selectedManifest, selection.workflow(), paths.configDir(), true).stream()
.filter(workflow -> selectedWorkflowPaths.stream()
.noneMatch(connected -> PathsEqual.samePath(connected, workflow)))
// Diagnostics should de-duplicate paths while preserving encounter order.
.collect(Collectors.toCollection(LinkedHashSet::new))
: new LinkedHashSet<>();
When the concrete implementation does not matter, use an explicit simpler collector or terminal operation instead. In this repository, use Guava immutable collectors rather than ambiguous JDK Collectors.toSet():
Set<String> archivedListIds = context.lists().values().stream()
.filter(BoardList::closed)
.map(BoardList::id)
.collect(toImmutableSet());
For a formatter-test sample that only needs a collector-style terminal operation, avoid unnecessary toCollection(...) entirely:
String source = "var labels = card.labels().stream().map(StateNames::normalize).collect(Collectors.toSet());";
Why the replacement is better
The replacement distinguishes semantic collector requirements from habit:
- Remaining
toCollection(...) sites document mutability, membership, or order requirements.
- Unnecessary concrete collection choices are removed.
- The code respects Error Prone/Picnic
CollectorMutability by avoiding ambiguous JDK Collectors.toSet() in compiled code.
- Formatter tests no longer normalize unnecessary product-code collector habits.
Desired eval behavior
- Reward noticing
Collectors.toCollection(...) and asking whether the concrete collection implementation is needed.
- Reward simplifying unnecessary
toCollection(...) uses.
- Reward preserving
toCollection(...) with a concise rationale comment when mutability, encounter order, sorted order, or membership behavior matters.
- Reward respecting repository rules that prefer explicit mutability, such as Guava immutable collectors over JDK
Collectors.toSet().
- Reward triggering the Streams skill even when the user asks for comments and durability rather than a direct stream refactor.
Anti-patterns the eval should reject
- Leaving
toCollection(...) without explaining why that concrete type matters.
- Replacing every
toCollection(...) mechanically with toSet() or toList() without checking mutability or order requirements.
- Adding vague comments such as
// collect to set that restate the code instead of explaining the contract.
- Keeping
LinkedHashSet::new without saying that encounter order is required.
- Introducing
Collectors.toSet() in a repository that enforces explicit collector mutability.
Suggested eval name
collectors-to-collection-rationale
Problem
The Java Streams skill should treat
collect(Collectors.toCollection(...))as a collector choice that needs justification. It is not always wrong, but it is more specific than ordinary collection, so the agent should verify that the concrete implementation matters and either simplify the collector or explain the reason in code.This is eval-worthy because the code can be functionally correct while still being unclear: a reader cannot tell whether
HashSet,ArrayList, orLinkedHashSetwas chosen for mutability, ordering, membership checks, or by habit.Code before the prompt was executed
The code had several
toCollection(...)uses without comments. Some were justified but unexplained:Another site required encounter-order-preserving de-duplication, but that reason was not visible:
The stream-formatting regression test also used
toCollection(...)in sample text even though the sample only needed any collector-style terminal operation:Prompt that caused the implementation
The user asked:
Prompt-produced code before maintainer correction
There was no new weak implementation in response to this prompt; the issue was that existing code and one earlier formatter-test sample used
toCollection(...)without making the collector choice self-explanatory.A first attempt simplified one collector to
Collectors.toSet(), but that violated the repository'sCollectorMutabilityrule because JDKCollectors.toSet()does not make mutability explicit:Why the prompt-produced code is bad
The original
toCollection(...)uses were not necessarily incorrect. The problem is that they required readers to infer intent:ArrayList::newwas needed because the list is appended to later.HashSet::newwas needed at some sites because the code later mutates the set or performs membership checks.LinkedHashSet::newwas needed because diagnostics de-duplicate while preserving encounter order.HashSet::newuses were not actually necessary and could be simplified.The intermediate
Collectors.toSet()simplification was also not suitable in a codebase that enforces explicit collector mutability.Maintainer-preferred code
Keep
toCollection(...)only where the concrete collection type matters, and add a short rationale comment nearby:When the concrete implementation does not matter, use an explicit simpler collector or terminal operation instead. In this repository, use Guava immutable collectors rather than ambiguous JDK
Collectors.toSet():For a formatter-test sample that only needs a collector-style terminal operation, avoid unnecessary
toCollection(...)entirely:Why the replacement is better
The replacement distinguishes semantic collector requirements from habit:
toCollection(...)sites document mutability, membership, or order requirements.CollectorMutabilityby avoiding ambiguous JDKCollectors.toSet()in compiled code.Desired eval behavior
Collectors.toCollection(...)and asking whether the concrete collection implementation is needed.toCollection(...)uses.toCollection(...)with a concise rationale comment when mutability, encounter order, sorted order, or membership behavior matters.Collectors.toSet().Anti-patterns the eval should reject
toCollection(...)without explaining why that concrete type matters.toCollection(...)mechanically withtoSet()ortoList()without checking mutability or order requirements.// collect to setthat restate the code instead of explaining the contract.LinkedHashSet::newwithout saying that encounter order is required.Collectors.toSet()in a repository that enforces explicit collector mutability.Suggested eval name
collectors-to-collection-rationale