18 changes: 18 additions & 0 deletions .cursor/rules/building.mdc
@@ -0,0 +1,18 @@
---
description: Maven build instructions for the NiFi codebase
alwaysApply: true
---

# Building

NiFi is a complex Maven codebase. Never build code (testing or otherwise) using `javac`.
Always use `mvn` instead, or preferably the `./mvnw` wrapper script.

Additionally, building a Maven module using the also-make flag (`-am`) is often very
expensive and slow. Instead, only build the specific module you are modifying. Assume that
the user has already built the entire codebase and that only the specific module you are
modifying needs to be built again. If this fails, you can prompt the user to build the entire
codebase, but only after you have attempted to build the relevant modules yourself first.
It is important not to run `mvn clean` at the root level or at the `nifi-assembly` level without
the user's express permission, as this may delete a running instance of NiFi, causing permanent
loss of flows and configuration.
74 changes: 74 additions & 0 deletions .cursor/rules/code-style.mdc
@@ -0,0 +1,74 @@
---
description: Java code style conventions for the NiFi codebase
globs: "**/*.java"
alwaysApply: false
---

# Code Style

NiFi adheres to a few code styles that are not necessarily common. Please ensure that you
observe these code styles.

1. Any variable that can be marked `final` must be marked `final`. This includes
declarations of Exceptions, method arguments, local variables, member variables, etc.
2. Short-hand is highly discouraged in names of variables, classes, methods, etc., as well
as in documentation. There are exceptions: in the framework, you may see references to
`procNode` for `ProcessorNode` or other such short-hand that is very difficult to confuse with
other terms, and it is used only when clearly defined, such as `final ProcessorNode procNode = ...`.
Even so, we would not abbreviate `ControllerService` as `cs` because `cs` is too vague
and easily misunderstood. Instead, a name such as `serviceNode` might be used.
3. Private / helper methods should not be placed before the first public/protected method
that calls them.
4. Unless the method is to be heavily reused, avoid creating trivial 1-2 line methods and
instead just place the code inline.
5. Code is allowed to be up to 200 characters wide. Avoid breaking lines into many short lines.
6. Avoid creating private methods that are called only once unless they are at least 10
lines long or are complex.
7. It is never acceptable to use star imports. Import each individual class that is to be used.
8. Never use underscores in class names, variables, or filenames.
9. Never use `System.out.println`; use SLF4J Loggers instead.
10. Avoid excessive whitespace in method invocations. For example, instead of writing:

```java
myObject.doSomething(
    arg1,
    arg2,
    arg3,
    arg4,
    arg5
);
```

Write this instead:

```java
myObject.doSomething(arg1, arg2, arg3, arg4, arg5);
```

It is okay to use many newlines in a builder pattern, such as:
```java
final MyObject myObject = MyObject.builder()
    .arg1(arg1)
    .arg2(arg2)
    .arg3(arg3)
    .build();
```

It is also acceptable when chaining methods in a functional style such as:
```java
final List<String> result = myList.stream()
    .filter(s -> s.startsWith("A"))
    .map(String::toUpperCase)
    .toList();
```

11. When possible, prefer importing a class rather than using its fully qualified class name
inline in the code.
12. Avoid statically importing methods, except for frequently used methods from testing
frameworks, such as those in the `Assertions` and `Mockito` classes.
13. Avoid trailing whitespace at the end of lines, especially in blank lines.
14. The `var` keyword is never allowed in the codebase. Always explicitly declare the type of variables.
15. Prefer procedural code over functional code. For example, prefer using a for loop instead of a stream
when the logic is not simple and straightforward. The stream API is powerful but can be difficult to
read when overused or used in complex scenarios. Functional style is best used when the logic is simple
and chains together no more than 3-4 operations.
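
Several of the rules above (1, 2, 14, and 15) can be seen together in a short sketch. The class and method names here are hypothetical and are not part of the NiFi codebase; the input format (`name:state`) is likewise an assumption made only for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only to illustrate the style rules; it is not part of NiFi.
final class ServiceNameFilter {

    // Rule 1: parameters and local variables are final.
    // Rule 14: types are declared explicitly instead of using var.
    // Rule 15: a plain for loop is used because the logic has several branches.
    static List<String> findEnabledServiceNames(final List<String> serviceDescriptions) {
        final List<String> enabledServiceNames = new ArrayList<>();
        for (final String serviceDescription : serviceDescriptions) {
            final String[] nameAndState = serviceDescription.split(":", 2);
            if (nameAndState.length != 2) {
                // Skip entries that do not follow the assumed name:state format.
                continue;
            }

            final String serviceName = nameAndState[0].trim();
            final String serviceState = nameAndState[1].trim();
            if (!serviceName.isEmpty() && "ENABLED".equalsIgnoreCase(serviceState)) {
                enabledServiceNames.add(serviceName);
            }
        }

        return enabledServiceNames;
    }
}
```

The same logic could be written as a stream, but the malformed-entry check and the two trimmed values would make the chain harder to read than the loop.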
26 changes: 26 additions & 0 deletions .cursor/rules/ending-conditions.mdc
@@ -0,0 +1,26 @@
---
description: Task completion checklist that must be verified before considering any task done
alwaysApply: true
---

# Ending Conditions

When you have completed a task, ensure that you have verified the following:

1. All code compiles and builds successfully using `mvn`.
2. All relevant unit tests pass successfully using `mvn`.
3. All code adheres to the Code Style rules.
4. Checkstyle and PMD pass successfully using
`mvn checkstyle:check pmd:check -T 1C` from the appropriate directory.
5. Unit tests have been added to verify the functionality of any sufficiently complex method.
6. A system test or an integration test has been added if the change significantly alters
the framework and the interaction between a significant number of classes.
7. You have performed a full review of the code to ensure that there are no logical errors
and that the code is not duplicative or difficult to understand. If you find any code that
is in need of refactoring due to clarity or duplication, you should report this to the user
and offer to make those changes as well.

Do not consider the task complete until all of the above conditions have been met. When you
do consider the task complete, provide a summary of what you changed and which tests were
added or modified and what the behavior is that they verify. Additionally, provide any feedback
about your work that may need further review or that is not entirely complete.
80 changes: 80 additions & 0 deletions .cursor/rules/extension-development.mdc
@@ -0,0 +1,80 @@
---
description: Development patterns for NiFi extensions (Processors, Controller Services, Connectors). Covers Property Descriptors, Relationships, and common patterns.
alwaysApply: false
---

# Extension Development

This rule applies when developing NiFi extensions: Processors, Controller Services, and
Connectors.

## Property Descriptors

Property Descriptors are defined as `static final` fields on the component class using
`PropertyDescriptor.Builder`.

- **Naming:** Use clear, descriptive names. The `displayName` field should never be used. Make the
name itself clear and concise. Use Title Case for property names.
- **Required vs. optional:** Mark properties as `.required(true)` when the component cannot
function without them. Prefer sensible defaults via `.defaultValue(...)` when possible.
When a default value is provided, the property will always have a value. The `required` flag in this
case is more of a documentation aid to indicate the importance of the property.
- **Validators:** Always attach an appropriate `Validator` (e.g., `StandardValidators.NON_EMPTY_VALIDATOR`,
`StandardValidators.POSITIVE_INTEGER_VALIDATOR`). The Validator can be left off only when Allowable Values
are provided. In this case, do not include a Validator because it is redundant and confusing.
- **Expression Language:** If a property should support Expression Language, add
`.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)` or the
appropriate scope. Always document when Expression Language is supported in the property
description. Some developers tend to go overboard here and feel like Expression Language should be supported
everywhere, but this is a mistake! The default assumption should be that Expression Language is not supported
unless the value is expected to be different for every FlowFile that is processed.
- **Dependencies:** Use `.dependsOn(...)` to conditionally show properties based on the
values of other properties. This keeps the configuration UI clean and avoids exposing
irrelevant properties. If there is a dependency, it is important to understand that `.required(true)` means that
this property is required IF AND ONLY IF the dependency condition is met.

## Processor Lifecycle Annotations

- Use `@OnScheduled` for setup that should happen once before the processor starts
running (e.g., creating clients, compiling patterns).
- Use `@OnStopped` for cleanup (e.g., closing clients, releasing resources).
- `@OnUnscheduled` is rarely used but can be used to interrupt long-running processes when the Processor is stopped.
Generally, though, it is preferable to write the Processor in such a way that long-running processes check `isScheduled()`
and stop gracefully if the return value is `false`.
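
The graceful-stop pattern described in the last bullet can be sketched without the NiFi API. In this minimal illustration, the `BatchWorker` class and the `scheduledCheck` supplier are hypothetical stand-ins: the supplier plays the role of the Processor's `isScheduled()` method, and the loop body stands in for whatever long-running work the Processor performs.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a long-running task inside a Processor; it does not use the NiFi API.
final class BatchWorker {

    // The scheduledCheck supplier plays the role of the Processor's isScheduled() method.
    static int processWhileScheduled(final List<String> lines, final BooleanSupplier scheduledCheck) {
        int processedCount = 0;
        for (final String line : lines) {
            // Check before each unit of work so that a stop request takes effect promptly.
            if (!scheduledCheck.getAsBoolean()) {
                break;
            }

            // Real work on the line would happen here.
            processedCount++;
        }

        return processedCount;
    }
}
```

Checking the flag once per unit of work keeps each iteration intact while still letting the component stop without an explicit interrupt.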

## Processors

- The `onTrigger` method should be focused on processing FlowFiles. Keep setup and teardown
logic in lifecycle methods when possible.
- Prefer `session.read()` and `session.write()` with callbacks over directly working with
streams to ensure proper resource management.
- Prefer `session.commitAsync()` over `session.commit()`. The `commit` method was the original implementation,
but it has now been deprecated in favor of `commitAsync`. The `commitAsync` call provides a clearer, cleaner
interface for handling post-commit actions including success and failure callbacks. In addition, the async
method allows Processors to be used much more efficiently in a Stateless NiFi flow.

### Relationships
- **Declaration:** Relationships are defined as `static final` fields using `new Relationship.Builder()`.
Relationship names should generally be lowercase.
- **Success and Failure:** Most processors define at least a `success` and `failure`
relationship. Use `REL_SUCCESS` and `REL_FAILURE` as constant names.
- **Original relationship:** Processors that enrich or fork FlowFiles often include an
`original` relationship for the unmodified input FlowFile.


## Controller Services

Controller Services are objects that can be shared across multiple components. This is typically done for
clients that connect to external systems in order to avoid creating many connections, or in order to share
configuration across multiple components without the user having to duplicate configuration. Controller Services
can also be helpful for abstracting away some piece of functionality into a separate extension point so that the
implementation can be swapped out by the user. For example, Record Readers and Writers are implemented as Controller
Services so that the user can simply choose which format they want to read and write in a flexible and reusable way.

That said, Controller Services can be more onerous to configure and maintain for the user, so they should
be used sparingly and only when there is a clear benefit to doing so.

## General Patterns

- Use `ComponentLog` (obtained via `getLogger()`) for all logging, not SLF4J directly.
This ensures log messages are associated with the component instance and that they generate Bulletins.
80 changes: 80 additions & 0 deletions .cursor/rules/extension-testing.mdc
@@ -0,0 +1,80 @@
---
description: Testing guidance for NiFi extensions (Processors, Controller Services, Connectors). Covers nifi-mock and TestRunner usage.
alwaysApply: false
---

# Extension Testing

This rule applies when writing tests for NiFi extensions: Processors, Controller Services, and Connectors.

## Unit Tests

Unit tests should be used to test individual classes and methods in isolation. This often
will result in mocking dependency classes. However, if there already exists a Mock
implementation of an interface or dependency class, it is preferred to use the existing
Mock implementation. Similarly, for simple classes, it is preferable to make use of the
real implementation of a class rather than creating a Mock implementation. We are infinitely
more interested in having tests that are fast, reliable, correct, and easy to maintain than
we are in having tests that adhere to strict and arbitrary definitions of what constitutes
a "unit test."

## Use nifi-mock

Tests for extensions should always make use of the `nifi-mock` mocking framework. This is
done through the `TestRunner` interface and its standard implementation, obtained via
`TestRunners.newTestRunner(processor)`.

The `TestRunner` provides methods for:
- Setting property values (`setProperty`)
- Enqueueing FlowFiles (`enqueue`)
- Running the processor (`run`)
- Asserting transfer to relationships (`assertTransferCount`, `assertAllFlowFilesTransferred`)
- Validating processor configuration (`assertValid`, `assertNotValid`)
- Asserting content and attributes of FlowFiles (`assertContentEquals`, `assertAttributeEquals`, etc.)

## No System Tests for Extensions

System tests are not expected for extensions. Extensions are tested at the unit level using
`nifi-mock`. The `nifi-mock` framework provides sufficient isolation and simulation of the
NiFi runtime environment.

## What to Test

- **Property validation:** If the extension has a custom Validator, it
- **customValidate:** If the extension overrides the `customValidate` method, test that it correctly
validates the configuration and produces appropriate validation results.
- **Relationship routing:** Verify that FlowFiles are routed to the correct relationship
based on input and configuration.
- **Content transformation:** For processors that modify FlowFile content, verify that
output content matches expectations.
- **Attribute handling:** Verify that expected attributes are set on output FlowFiles.
- **Error handling:** Verify that error conditions (bad input, misconfiguration, simulated
failures) are handled correctly, typically by routing to a failure relationship.

## What NOT to Test

- **NiFi framework behavior:** Do not attempt to test the behavior of the NiFi framework itself.
For example, do not test that `session.commitAsync()` actually commits a transaction. Instead,
focus on testing that your extension behaves correctly when `commitAsync` is called, and trust
that the NiFi framework will handle the commit correctly.
- **Validator behavior:** If a custom validator is used by an extension, that custom validator should
be tested separately as a unit test for the validator itself. However, if the extension point provides
a `customValidate` method, that should absolutely be tested as part of the extension's unit tests.
- **The PropertyDescriptors that are returned:** Do not test that the `getSupportedPropertyDescriptors`
method returns the expected PropertyDescriptors. This is an anti-pattern because it does not properly
test that the extension abides by the contract of the API. For example, if a new PropertyDescriptor is
added whose default is to behave the same way as the old behavior, the test should absolutely pass.
However, if the test is written to expect a specific set of PropertyDescriptors, then the test will fail,
leading to confusion and unnecessary maintenance.

## Controller Service Testing

When a processor depends on a Controller Service, use `TestRunner.addControllerService`
and `TestRunner.enableControllerService` to wire up either a real or mock implementation
of the service for testing.

## TestContainers

For Processors that interact with external systems, it can be helpful to use TestContainers to spin up
a temporary instance of the external system for testing. This allows for more realistic integration tests
without requiring the user to have the external system installed and running on their machine.
46 changes: 46 additions & 0 deletions .cursor/rules/framework-testing.mdc
@@ -0,0 +1,46 @@
---
description: Testing guidance for NiFi framework code (not extensions). Covers when to use unit, integration, and system tests for framework classes.
alwaysApply: false
---

# Framework Testing

This rule applies when working on NiFi framework code (not Processors, Controller
Services, or Connectors).

## Unit Tests

Unit tests should be used to test individual classes and methods in isolation. This often
will result in mocking dependency classes. However, if there already exists a Mock
implementation of an interface or dependency class, it is preferred to use the existing
Mock implementation. Similarly, for simple classes, it is preferable to make use of the
real implementation of a class rather than creating a Mock implementation. We are infinitely
more interested in having tests that are fast, reliable, correct, and easy to maintain than
we are in having tests that adhere to strict and arbitrary definitions of what constitutes
a "unit test."

## Integration Tests

When working in the framework, unit tests are still important, but integration tests and
system tests are often more important. Integration tests are still allowed to use mocks but
typically we prefer to use real implementations of classes in order to ensure a more
realistic and holistic test.

## System Tests

System tests live in the `nifi-system-tests` module and should be used for any change
that significantly alters the framework and the interaction between a significant
number of classes. They should also be used for any changes that may be fairly isolated but
which are in a critical path of the framework, especially those that affect how data is
persisted, processed, or accessed; or those that affect how components are created,
configured, scheduled, or executed.

Good candidates for system tests include changes to `ProcessScheduler`, `ProcessorNode`,
`ControllerServiceNode`, `FlowController`, `FlowManager`, how Parameters are handled, flow
synchronization, the repositories, etc.

## Escalation

Any unit test that ends up requiring a large number of mocks is a good candidate for an
integration test, and any integration test that ends up requiring a large number of mocks
is a good candidate for a system test.
16 changes: 16 additions & 0 deletions .cursor/rules/persona.mdc
@@ -0,0 +1,16 @@
---
description: AI persona and general approach for working on the Apache NiFi codebase
alwaysApply: true
---

# AI Persona

Act as an experienced Java software engineer. When considering how to implement a task,
first consider the big picture of what is being asked. Then determine which classes will
need to be updated.

Quite often, a single request will require manipulating many different classes. Generally
speaking, it is best to avoid changing established interfaces, especially those in nifi-api.
It is acceptable when necessary, but any change in nifi-api needs to be backward compatible.
For example, you might introduce a new method with a default implementation, or add a new method
and deprecate an old one without removing it.
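
A minimal sketch of such a backward-compatible change follows. It combines both options named above: the new method has a default implementation, and the old method is deprecated but kept. The `FlowDescriber` interface and `LegacyFlowDescriber` class are hypothetical and are not part of nifi-api.

```java
// Hypothetical interface used only for illustration; it is not part of nifi-api.
interface FlowDescriber {

    // Original method: deprecated but kept, so existing callers continue to compile and run.
    @Deprecated
    String describe(final String flowName);

    // New method with a default implementation, so existing implementations need no changes.
    default String describe(final String flowName, final String flowVersion) {
        return describe(flowName) + " (version " + flowVersion + ")";
    }
}

// An implementation written before the new method existed; it still compiles unchanged.
final class LegacyFlowDescriber implements FlowDescriber {

    @Override
    public String describe(final String flowName) {
        return "Flow " + flowName;
    }
}
```

Because the default method delegates to the original one, an implementation that predates the change gains the new behavior without being modified.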