Commit e1f7c56
pipeline: outputs: es: support of Upstream
Signed-off-by: Marat Abrarov <abrarov@gmail.com>
1 parent 8794570 commit e1f7c56

3 files changed

Lines changed: 149 additions & 48 deletions

File tree

administration/configuring-fluent-bit/classic-mode/upstream-servers.md

Lines changed: 1 addition & 0 deletions
@@ -5,6 +5,7 @@ Fluent Bit [output plugins](../../../pipeline/outputs.md) aim to connect to exte
An `Upstream` defines a set of nodes that will be targeted by an output plugin. By the nature of the implementation, an output plugin must support the `Upstream` feature. The following plugins have `Upstream` support:

- [Forward](../../../pipeline/outputs/forward.md)
- [Elasticsearch](../../../pipeline/outputs/elasticsearch.md)

The current balancing mode implemented is `round-robin`.
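
As a minimal illustration of round-robin balancing (the node names here are hypothetical; Fluent Bit's real scheduler is internal to the engine), each flush is dispatched to the next node in turn, wrapping around the list:

```python
from itertools import cycle

# Hypothetical Upstream nodes; real entries come from the [NODE] sections
# of an Upstream configuration file.
nodes = ["node-1", "node-2", "node-3"]
picker = cycle(nodes)

# Five consecutive flushes: the target wraps around after the last node.
targets = [next(picker) for _ in range(5)]
print(targets)  # ['node-1', 'node-2', 'node-3', 'node-1', 'node-2']
```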

pipeline/outputs/elasticsearch.md

Lines changed: 147 additions & 48 deletions
@@ -8,54 +8,63 @@ The _Elasticsearch_ (`es`) output plugin lets you ingest your records into an [E

## Configuration parameters

The **Allows overrides** column indicates whether a key can be overridden in the `NODE` section of an [Upstream](../../administration/configuring-fluent-bit/classic-mode/upstream-servers.md) configuration.

| Key | Description | Default | Allows overrides |
| :--- | :--- | :--- | :--- |
| `aws_auth` | Enable AWS SigV4 authentication for Amazon OpenSearch Service. | `Off` | Yes |
| `aws_external_id` | External ID for the AWS IAM Role specified with `aws_role_arn`. | _none_ | Yes |
| `aws_profile` | AWS profile name. | _none_ | Yes |
| `aws_region` | Specify the AWS region for Amazon OpenSearch Service. | _none_ | Yes |
| `aws_role_arn` | AWS IAM Role to assume to put records to your Amazon cluster. | _none_ | Yes |
| `aws_service_name` | Service name to use in the AWS SigV4 signature. For integration with Amazon OpenSearch Serverless, set to `aoss`. See [Amazon OpenSearch Serverless](opensearch.md) for more information. | `es` | Yes |
| `aws_sts_endpoint` | Specify the custom STS endpoint to be used with the STS API for Amazon OpenSearch Service. | _none_ | Yes |
| `buffer_size` | Specify the buffer size used to read the response from the Elasticsearch HTTP service. Use for debugging purposes where reading full responses is required. Response size grows depending on the number of records inserted. To use an unlimited amount of memory, set this value to `False`. Otherwise set the value according to the [Unit Size](../../administration/configuring-fluent-bit.md#unit-sizes). | `512k` | Yes |
| `cloud_auth` | Specify the credentials to use to connect to Elastic's Elasticsearch Service running on Elastic Cloud. | _none_ | Yes |
| `cloud_id` | If using Elastic's Elasticsearch Service, you can specify the `cloud_id` of the running cluster. The string has the format `<deployment_name>:<base64_info>`. After decoding, the `base64_info` string has the format `<deployment_region>$<elasticsearch_hostname>$<kibana_hostname>`. | _none_ | No |
| `compress` | Set the payload compression mechanism. The only available option is `gzip`. | _none_ | Yes |
| `current_time_index` | Use current time for index generation instead of the message record. | `Off` | Yes |
| `generate_id` | When enabled, generate `_id` for outgoing records. This prevents duplicate records when retrying ES. | `Off` | Yes |
| `host` | IP address or hostname of the target Elasticsearch instance. | `127.0.0.1` | Yes. The default value doesn't apply to the `NODE` section of an Upstream configuration, which requires `host` to be specified. |
| `http_api_key` | API key for authenticating with Elasticsearch. Must be `base64` encoded. If `http_user` or `cloud_auth` are defined, this parameter is ignored. | _none_ | Yes |
| `http_passwd` | Password for the user defined in `http_user`. | _none_ | Yes |
| `http_user` | Optional username credential for Elastic X-Pack access. | _none_ | Yes |
| `id_key` | If set, `_id` is the value of the key from the incoming record, and the `generate_id` option is ignored. | _none_ | Yes |
| `include_tag_key` | When enabled, appends the Tag name to the record. | `Off` | Yes |
| `index` | Index name. | `fluent-bit` | Yes |
| `logstash_dateformat` | Time format based on [strftime](https://man7.org/linux/man-pages/man3/strftime.3.html) to generate the second part of the index name. | `%Y.%m.%d` | Yes |
| `logstash_format` | Enable Logstash format compatibility. This option takes a Boolean value: `True/False`, `On/Off`. | `Off` | Yes |
| `logstash_prefix` | When `logstash_format` is enabled, the index name is composed using a prefix and the date. For example, if `logstash_prefix` is equal to `mydata`, your index becomes `mydata-YYYY.MM.DD`. The last string appended belongs to the date when the data is being generated. | `logstash` | Yes |
| `logstash_prefix_key` | When included, the value of the key in the record is evaluated as a key reference and overrides `logstash_prefix` for index generation. If the key/value isn't found in the record, the `logstash_prefix` option acts as a fallback. The parameter is expected to be a [record accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md). | _none_ | Yes |
| `logstash_prefix_separator` | Set a separator between `logstash_prefix` and the date. | `-` | Yes |
| `path` | Elasticsearch accepts new data on HTTP query path `/_bulk`. You can also serve Elasticsearch behind a reverse proxy on a sub-path. Define the path by adding a path prefix in the indexing HTTP POST URI. | _none_ | Yes |
| `pipeline` | Define which pipeline the database should use. For performance reasons, it's strongly suggested to do parsing and filtering on the Fluent Bit side and avoid pipelines. | _none_ | Yes |
| `port` | TCP port of the target Elasticsearch instance. | `9200` | Yes. The default value doesn't apply to the `NODE` section of an Upstream configuration, which requires `port` to be specified. |
| `replace_dots` | When enabled, replace field name dots with underscores. Required by Elasticsearch 2.0-2.3. | `Off` | Yes |
| `suppress_type_name` | When enabled, mapping types are removed and the `type` option is ignored. Elasticsearch 8.0.0 or later [no longer supports mapping types](https://www.elastic.co/docs/manage-data/data-store/mapping/removal-of-mapping-types), which requires this value to be `On`. | `Off` | Yes |
| `tag_key` | When `include_tag_key` is enabled, this property defines the key name for the tag. | `flb-key` | Yes |
| `time_key` | When `logstash_format` is enabled, each record gets a new timestamp field. The `time_key` property defines the name of that field. | `@timestamp` | Yes |
| `time_key_format` | When `logstash_format` is enabled, this property defines the format of the timestamp. | `%Y-%m-%dT%H:%M:%S` | Yes |
| `time_key_nanos` | When `logstash_format` is enabled, enabling this property sends nanosecond precision timestamps. | `Off` | Yes |
| `trace_error` | If Elasticsearch returns an error, print the Elasticsearch API request and response for diagnostics. | `Off` | Yes |
| `trace_output` | Print all Elasticsearch API request payloads to `stdout` for diagnostics. | `Off` | Yes |
| `type` | Type name. | `_doc` | Yes |
| `upstream` | If the plugin connects to an Upstream instead of a single host, this property defines the path to the Upstream configuration file. For details, see [Upstream Servers](../../administration/configuring-fluent-bit/classic-mode/upstream-servers.md). | _none_ | No |
| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` | No |
| `write_operation` | Operation type for records. Can be any of: `create`, `index`, `update`, `upsert`. | `create` | Yes |
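
As an illustration of the `cloud_id` format, the following sketch builds and decodes a hypothetical value (the deployment name and hostnames are invented; real values come from the Elastic Cloud console):

```python
import base64

# Construct a hypothetical cloud_id: <deployment_name>:<base64_info>, where
# the decoded base64_info is <region>$<elasticsearch_hostname>$<kibana_hostname>.
base64_info = base64.b64encode(
    b"us-east-1$es.example.com$kibana.example.com"
).decode()
cloud_id = f"my-deployment:{base64_info}"

# Decode it back into its components.
deployment_name, _, encoded = cloud_id.partition(":")
region, es_host, kibana_host = base64.b64decode(encoded).decode().split("$")
print(deployment_name, region, es_host)  # my-deployment us-east-1 es.example.com
```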

If you have used a common relational database, the parameters `index` and `type` can be compared to the `database` and `table` concepts.
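
When `logstash_format` is enabled, the index name is instead composed from `logstash_prefix`, `logstash_prefix_separator`, and `logstash_dateformat`. A small sketch using the default values (the timestamp is fixed here purely for illustration):

```python
from datetime import datetime, timezone

# Defaults for logstash_prefix, logstash_prefix_separator, logstash_dateformat.
prefix, separator, dateformat = "logstash", "-", "%Y.%m.%d"

now = datetime(2024, 1, 31, tzinfo=timezone.utc)  # fixed time for illustration
index = f"{prefix}{separator}{now.strftime(dateformat)}"
print(index)  # logstash-2024.01.31
```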

### TLS / SSL

The Elasticsearch output plugin supports TLS/SSL. For more details about the properties available and general configuration, see [TLS/SSL](../../administration/transport-security.md).

### AWS SigV4 authentication and upstream servers

The `http_proxy`, `no_proxy`, and `TLS` parameters used for AWS SigV4 authentication (the connection the plugin makes to AWS to generate the authentication signature) are never taken from the `NODE` section of the [Upstream](../../administration/configuring-fluent-bit/classic-mode/upstream-servers.md) configuration. However, the `TLS` parameters for the connection between the plugin and Elasticsearch can be overridden in the `NODE` section of an Upstream configuration, even when AWS authentication is used.

### `write_operation`
The `write_operation` can be any of:
@@ -147,6 +156,96 @@ pipeline:
{% endtab %}
{% endtabs %}

### Configuration file with upstream

#### Classic mode configuration file with upstream

In your main classic mode configuration file, append the following `[INPUT]` and `[OUTPUT]` sections:

```text
[INPUT]
    Name  dummy
    Dummy { "message" : "this is dummy data" }

[OUTPUT]
    Name     es
    Match    *
    Upstream ./upstream.conf
    Index    my_index
    Type     my_type
```
Your [Upstream Servers](../../administration/configuring-fluent-bit/classic-mode/upstream-servers.md) configuration file can be similar to the following:

```text
[UPSTREAM]
    name es-balancing

[NODE]
    name node-1
    host localhost
    port 9201

[NODE]
    name node-2
    host localhost
    port 9202

[NODE]
    name node-3
    host localhost
    port 9203
```
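
For illustration only, here is a minimal sketch of how the `[NODE]` entries in a file like the one above could be read (Fluent Bit's actual configuration parser is internal and written in C; this is not its implementation):

```python
def parse_upstream(text: str):
    """Collect key/value pairs from each [NODE] section of a classic-mode file."""
    nodes, current = [], None
    for line in text.splitlines():
        line = line.strip()
        if line == "[NODE]":
            current = {}          # start a new node entry
            nodes.append(current)
        elif current is not None and line and not line.startswith("["):
            key, _, value = line.partition(" ")
            current[key] = value.strip()
    return nodes

conf = """[UPSTREAM]
    name es-balancing

[NODE]
    name node-1
    host localhost
    port 9201
"""
print(parse_upstream(conf))  # [{'name': 'node-1', 'host': 'localhost', 'port': '9201'}]
```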

#### YAML configuration file with upstream

In your main YAML configuration file (`fluent-bit.yaml`), add the following `inputs` and `outputs` sections:

```yaml
pipeline:
  inputs:
    - name: dummy
      dummy: "{ \"message\" : \"this is dummy data\" }"
  outputs:
    - name: es
      match: "*"
      index: fluent-bit
      type: my_type
      upstream: ./upstream.yaml
```

Your Upstream Servers configuration file can use [classic mode](../../administration/configuring-fluent-bit/classic-mode/upstream-servers.md) (see the previous section) or [YAML format](../../administration/configuring-fluent-bit/yaml/upstream-servers-section.md). If the Upstream Servers configuration uses YAML format, it can be placed in the same file as the main configuration (for example, in `fluent-bit.yaml`):

```yaml
pipeline:
  inputs:
    - name: dummy
      dummy: "{ \"message\" : \"this is dummy data\" }"
  outputs:
    - name: es
      match: "*"
      index: fluent-bit
      type: my_type
      upstream: ./fluent-bit.yaml

upstream_servers:
  - name: es-balancing
    nodes:
      - name: node-1
        host: localhost
        port: 9201
      - name: node-2
        host: localhost
        port: 9202
      - name: node-3
        host: localhost
        port: 9203
```
## Elasticsearch field names

Some input plugins can generate messages where the field names contain dots (`.`). For Elasticsearch 2.0, this isn't allowed. The current `es` plugin replaces a dot with an underscore (`_`):
@@ -161,13 +260,13 @@ becomes
```text
{"cpu0_p_cpu"=>17.000000}
```
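
A sketch of that substitution for top-level keys (the actual plugin is implemented in C; this example doesn't cover nested maps):

```python
def replace_dots(record: dict) -> dict:
    """Replace dots in top-level field names with underscores."""
    return {key.replace(".", "_"): value for key, value in record.items()}

print(replace_dots({"cpu0.p_cpu": 17.0}))  # {'cpu0_p_cpu': 17.0}
```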

## Use Fluent Bit Elasticsearch plugin with other services

Connect to Amazon OpenSearch or Elastic Cloud with the Elasticsearch plugin.

### Amazon OpenSearch Service

The Amazon OpenSearch Service adds an extra security layer where HTTP requests must be signed with AWS SigV4. Fluent Bit v1.5 introduced full support for Amazon OpenSearch Service with IAM Authentication.

See [details](../../administration/aws-credentials.md) on how AWS credentials are fetched.

@@ -210,7 +309,7 @@ pipeline:
{% endtab %}
{% endtabs %}

Be aware that the `port` is set to `443`, `tls` is enabled, and `AWS_Region` is set.

### Use Fluent Bit with Elastic Cloud

@@ -263,7 +362,7 @@ Without this you will see errors like:

## Troubleshooting

Use the following information to help resolve errors using the Elasticsearch plugin.

### Using multiple types in a single index

@@ -368,7 +467,7 @@ In Fluent Bit v1.8.2 and greater, Fluent Bit started using `create` method (inst
```text
Validation Failed: 1: an id must be provided if version type or value are set
```

If you see `action_request_validation_exception` errors on your pipeline with Fluent Bit versions greater than v1.8.2, correct them by turning on `generate_id` as follows:

{% tabs %}
{% tab title="fluent-bit.yaml" %}
@@ -397,7 +496,7 @@ pipeline:
{% endtab %}
{% endtabs %}

### `logstash_prefix_key`

The following snippet demonstrates using the namespace name, as extracted by the `kubernetes` filter, as the Logstash prefix:

vale-styles/FluentBit/Headings.yml

Lines changed: 1 addition & 0 deletions
@@ -25,6 +25,7 @@ exceptions:
- AWS
- AWS MSK IAM
- AWS IAM
- AWS SigV4
- Azure
- Azure Blob
- Azure Data Explorer
