Skip to content

fix: config environment override improvements#27312

Open
gwossum wants to merge 62 commits intomaster-1.xfrom
gw/improved_env_overrides
Open

fix: config environment override improvements#27312
gwossum wants to merge 62 commits intomaster-1.xfrom
gw/improved_env_overrides

Conversation

@gwossum
Copy link
Copy Markdown
Member

@gwossum gwossum commented Mar 27, 2026

  • Environment overrides with group names and group IDs now work correctly.
  • Slices can be extended using environment variables. Previously they could only overwrite existing elements.
  • Environment variables properly override non-string slices in all cases.
  • Prevent possible nil dereference.
  • Better errors for size parsing errors.
  • Improved error messages when a TextUnmarshaler fails.
  • Trim whitespace from environment variable values.
  • Greatly improved test coverage.

Closes: #22974

This comment was marked as outdated.

Comment thread toml/toml.go
Comment on lines +301 to +306
// Apply the unindexed environment variable as a default value, if available.
if len(value) > 0 {
if err := applyEnvOverrides(getenv, prefix, f, structKey); err != nil {
return err
}
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior of using the unindexed environment value as a default for existing slice values is very strange, but it is what OSS and plutonium have both been doing. It's not intuitive and confusing. Claude doesn't like it either.

Is this intentional or a used feature? Or could this be improved?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a test that explicitly tests for this, so it does seem intentional. Whether it is useful or problematic is another issue.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other very odd thing this does: Let's say I have a have:

type Config struct {
    values []string `toml:"values"`
}

If I start out with:

Config{
    values: []string{"one", "two", "three"},
}

and I have an environment variable INFLUXDB_VALUES="alice,bob,carol", then you end up with:

Config{
    values: []string{"alice,bob,carol", "alice,bob,carol", "alice,bob,carol"
}

If, however, you start out with:

Config{
    values: []string{},
}

and an environment variable INFLUXDB_VALUES="alice,bob,carol", then you end up with:

Config{
    values: []string{"alice", "bob", "carol"}
}

This is highly inconsistent behavior and very non-intuitive!

I think the reason the default is supported is for configuring common options across a slice of configurations, for instance UDP buffer sizes across all UDP inputs. It makes sense when you have a slice of structs, but behaves weirdly when you have a slice of simple types like strings. Maybe we handle the two cases differently?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work the same in Plutonium with a separate implementation, or does Plutonium rely on this implementation, or does it behave differently?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... Plutonium has the same behavior WRT to using the unindexed version of a variable to override a value for all elements in slice. For instance, INFLUXDB_UDP_DATABASE is a default for overriding each udp input, while INFLUXDB_UDP_1_DATABASE is a specific override for the second UDP input.

However, plutonium does not support a CSV list to set an array at all, so the behavior in https://github.com/influxdata/influxdb/pull/27312/changes#r3018901388 does not apply to plutonium.

Since the CSV setting would only work with slices of primitive types (e.g. []string, []int), I thought maybe there was no way in the actual config to use it. I did find configurations in both OSS and plutonium where it could be used, but of course only OSS currently supports using CSV to set them. Previously, using a CSV list was probably the only only way to override them, because growing a slice with indexed environment variables was not supported.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The slice handling is greatly improved in the latest commit. Unindexed env vars with CSV data will now unconditionally overwrite slices of leaf types (not struct types). It is also now an error to use indexed and unindexed env vars with slices of leaf types, since that leads to surprising and unintuitive behavior.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread toml/toml_test.go Outdated
Comment thread toml/toml_test.go Outdated
Comment thread toml/toml_test.go Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread toml/toml.go
Comment thread toml/toml_test.go Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread toml/toml.go Outdated
Comment thread toml/toml.go Outdated
Comment thread toml/toml.go Outdated
Comment thread toml/toml_test.go Outdated
Copy link
Copy Markdown
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would wrap all errors returned as strings, but I'm open to dissent if I'm missing something.

I need to review the tests more.

Comment thread toml/toml.go Outdated
Comment thread toml/toml.go Outdated
Comment thread toml/toml.go Outdated
intValue, err := strconv.ParseInt(value, 0, element.Type().Bits())
if err != nil {
return fmt.Errorf("failed to apply %v to %v using type %v and value '%v': %s", prefix, structKey, element.Type().String(), value, err)
return false, fmt.Errorf("failed to apply %v to %v using type %v and value '%v': %s", prefix, structKey, element.Type().String(), value, err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap error here too

Comment thread toml/toml.go Outdated
@gwossum

This comment was marked as outdated.

Copy link
Copy Markdown
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some []byte handling suggestionss

Comment thread toml/toml.go Outdated
Comment thread toml/toml.go Outdated
Comment thread toml/toml.go Outdated
@gwossum gwossum marked this pull request as ready for review April 7, 2026 19:40
@davidby-influx davidby-influx requested a review from Copilot April 14, 2026 22:17
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/udp/service.go
Comment on lines 107 to 113
if s.config.ReadBuffer != 0 {
err = s.conn.SetReadBuffer(s.config.ReadBuffer)
rbs := int(s.config.ReadBuffer)
err = s.conn.SetReadBuffer(rbs)
if err != nil {
s.Logger.Info("Failed to set UDP read buffer",
zap.Int("buffer_size", s.config.ReadBuffer), zap.Error(err))
zap.Int("buffer_size", rbs), zap.Error(err))
return err
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadBuffer is now a signed 64-bit size (toml.SSize) but is cast directly to int before calling SetReadBuffer. On 32-bit platforms or for large values this can overflow/truncate, and negative values (e.g. "-1") will attempt to set a negative buffer size and likely fail service startup. Consider validating ReadBuffer > 0 and ReadBuffer <= math.MaxInt before casting, and skipping/returning a clear error otherwise.

Copilot uses AI. Check for mistakes.
Comment on lines 176 to 181
if s.Config.ReadBuffer != 0 {
err = conn.SetReadBuffer(s.Config.ReadBuffer)
rbs := int(s.Config.ReadBuffer)
err = conn.SetReadBuffer(rbs)
if err != nil {
return fmt.Errorf("unable to set UDP read buffer to %d: %s",
s.Config.ReadBuffer, err)
return fmt.Errorf("unable to set UDP read buffer to %d: %s", rbs, err)
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadBuffer is now toml.SSize but is cast directly to int before calling SetReadBuffer. Large values can overflow/truncate on 32-bit platforms, and negative values will try to set a negative buffer size and likely fail startup. Consider requiring ReadBuffer > 0 and ReadBuffer <= math.MaxInt (or returning a clear error) before casting.

Copilot uses AI. Check for mistakes.
Comment on lines 371 to 376
memoryLimit := int64(0)
if s.conf.TotalBufferBytes != 0 {
memoryLimit = int64(s.conf.TotalBufferBytes / len(s.subs))
memoryLimit = int64(s.conf.TotalBufferBytes) / int64(len(s.subs))
if memoryLimit == 0 {
memoryLimit = 1
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TotalBufferBytes is now toml.SSize, but the logic treats any non-zero (including negative) value as enabling the per-subscription memory limit. A negative TotalBufferBytes will produce a negative memoryLimit and get passed to router.Update. Consider changing the check to > 0 (and/or validating TotalBufferBytes is non-negative) before dividing.

Copilot uses AI. Check for mistakes.
Comment thread toml/toml_test.go Outdated
Comment on lines +1616 to +1618
// Verify correct when result when there is a false slice grow
// from a default enviroment variable, but no realy one from an
// indexed environemtn variable.
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling/grammar in the test comment: "enviroment", "realy", and "environemtn" are typos and the sentences are hard to parse. Please correct the comment text so future readers can understand the intent quickly.

Suggested change
// Verify correct when result when there is a false slice grow
// from a default enviroment variable, but no realy one from an
// indexed environemtn variable.
// Verify the result is correct when a default environment variable
// suggests slice growth, but there is no indexed environment variable
// to create an actual slice element.

Copilot uses AI. Check for mistakes.
Comment thread toml/toml.go Outdated
Comment on lines +418 to +419
"%s: slice element type %s must implement toml.Defaulter (add ApplyDefaults() method on *%s)",
path, elem, elem))
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VerifyDefaulters’ violation message always suggests adding ApplyDefaults() on *<elem>, but for pointer-element slices (e.g. []*T) that renders as **T and is incorrect/misleading. Consider formatting the suggestion based on the underlying struct type (e.g. suggest *T whether the slice is []T or []*T).

Suggested change
"%s: slice element type %s must implement toml.Defaulter (add ApplyDefaults() method on *%s)",
path, elem, elem))
"%s: slice element type %s must implement toml.Defaulter (add ApplyDefaults() method on %s)",
path, elem, ptrType))

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@davidby-influx davidby-influx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AI suggestions on casting checks, and a few nits and questions from me.

Continued nice work!

Comment thread cmd/influxd/run/command.go Outdated
log.Info("applied env var overrides",
zap.String("prefix", EnvVarPrefix),
zap.Strings("vars", appliedEnvVars))
log.Info("unmatched env vars in namespace; may be misspelled, unsupported by env var overrides, or removed in this version",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Info or Warn or doesn't matter? Very minor, not a blocker.

Comment thread cmd/influxd/run/command.go Outdated
//
// Both lists are emitted unconditionally (even when empty) so operators can
// confirm the diagnostic ran. The unmatched list is computed against
// os.Environ() so it reflects the real process environment regardless of
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that what we want? Why not respect the local GetEnv if there is one? It would seem to make testing easier to do so.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't use cmd.Getenv because we need all environment variables, but there could be symmetrical cmd.Environ. I'll make the change.

Comment thread services/collectd/service.go Outdated

if s.Config.ReadBuffer != 0 {
err = conn.SetReadBuffer(s.Config.ReadBuffer)
rbs := int(s.Config.ReadBuffer)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting impedance

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to make the conversion because SetReadBuffer takes an int, while the configuration is a toml.SSize (uint64). I've added range checking in an upcoming commit. The alternative is to either create a toml.ISize type, or not allow size suffixes for the configuration.

Comment thread toml/toml.go
return nil, 0, fmt.Errorf("%w: %c (expected k, m, or g)", ErrSizeBadSuffix, suffix)
}
numText = text[:len(text)-1]
numText = text[:len(text)-suffixSize]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment thread toml/toml.go Outdated
seen := make(map[string]struct{})
var unmatched []string
for _, entry := range environ {
eq := strings.IndexByte(entry, '=')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like all the string handling should be runes, but I can't see how to do it efficiently with the runes package. So I guess this is just a lament, not a requested change.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For low-ASCII characters, strings.IndexByte is safe. However, I've changed it to strings.Cut in an upcoming push.

Comment thread toml/toml.go Outdated
var violations []string
seen := make(map[reflect.Type]bool)
var walk func(t reflect.Type, path string)
walk = func(t reflect.Type, path string) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be cleaner as a separate function, not a lambda? Not a blocker.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taking in violations and returning it....

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted and simplified. Also leveraged the cycle breaking code to be recursive type checking code, since recursive types (e.g. linked lists) can not be expressed in toml and therefore are not valid for config data types. Will be in next push.

Comment thread toml/toml_test.go
// rather than the last byte, so the error message shows the actual
// character instead of a UTF-8 continuation byte.
{"bad_suffix_multibyte", "1√", itoml.ErrSizeBadSuffix,
"unknown size suffix: √ (expected k, m, or g)"},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

gwossum added 5 commits April 15, 2026 17:20
- Environment overrides with group names and group IDs now work
  correctly.
- Slices can be extended using environment variables. Previously they
  could only overwrite existing elements.
- Environment variables properly override non-string slices in all
  cases.
- Prevent possible nil dereference.
- Better errors for size parsing errors.
- Improved error messages when a TextUnmarshaler fails.
- Trim whitespace from environment variable values.
- Greatly improved test coverage.
Prevent unbounded slice growth via environment variables.
Also add additional test cases.
- Apply default environment overrides using the unindexed environment
variable when growing a slice.
- Refactor to remove code duplication and remove hasEnvForType.
Remove unnecessary hasEnvValue check before applying environment
overrides to struct fields.
gwossum added 28 commits April 15, 2026 17:21
- Add diagnostic log messages when configuration is loaded at startup or
  reloaded on SIGHUP:
  - List of environment variables used for overrides
  - List of environment variables which have the appropriate prefix but
    were not used for some reason
Make the toml and environment variable parsing explicitly rune-safe.
This fixes an issue where the error message when a size suffix is not
low ASCII has the incorrect character in it. Other places are made
explicitly rune-safe for maintainability, but do not fix any issue.
Add comments and test cases for parsing of hex, octal, and binary
environment variable parsing.
Use %q in error messages and logs instead of %v and %s for TOML and
environment variable parsing.
Trim whitespace from group names and numeric IDs when parsing TOML or
environment variables. Add related tests.
Improve error message for negative toml.Size. Adjust tests as necessary.
Add test specifically for `[data] max-index-log-file-size` for #22974.
Modernize TestConfig_Parse_EnvOverride.
Update Config structs to make use of toml.SSize where appropriate.
Updated fields:
- [collectd] read-buffer
- [data] tar-stream-buffer-size
- [flux-controller] query-initial-memory-bytes
- [flux-controller] query-max-memory-bytes
- [flux-controller] total-max-memory-bytes
- [http] max-body-size
- [subscriber] total-buffer-bytes
- [udp] read-buffer
- Prevent unsafe type conversions of toml.Size and toml.SSize by
  introducing methods that check ranges when doing type conversions.
- Make use of these methods either at time of use or configuration
  validation.
- Change "%s" to "%w" in some `fmt.Errorf` calls
Add `Command.Environ` for symmetry with `Command.Getenv`.
Simplify finding environment variable names from `os.Environ` style list
with `strings.Cut`.
- Change toml.VerifyDefaulters to toml.VerifyConfigType
- Add checks to toml.VerifyConfigType to detect recursive data types,
  which can not be expressed with toml.
- Break walk lambda out of VerifyConfigType into a top-level function,
  walkForConfigType.
- Make use of package level reflect values
- Fix typo in error message
- Update tests for updated error message
Add comment explaining why slice growth through environment
variables is bounded.
ApplyEnvOverrides will now return an error if a slice element
type that needs ApplyDefaults() does not implement it.

This error should not occur in production due to unit tests
verifying that config types implement ApplyDefaults() as needed,
in addition to compile-time type constraint checks.

Update tests as needed.
Improved consistency of unindexed environment variable override
handling. Previously, unindexed environment variables could set slices
using comma-delimited value lists, but only if the underlying slice was
empty. This was unintuitive and potentially impossible for the user to
know if the value was already set or not. Now the unindxed environment
variables overwrite the existing slice with the comma-delimited values.

Additionally, now only slices of leaf types (not structs) support
setting via an unindexed comma-delimited value list. This did not make
sense and could not set structs, so the change should not be visible to
users.

Finally, it is now an error to use indexed and unindexed environment
variables for slices of leaf types. The previous behavior was surprising
and unintuitive. It is highly unlikely users would using this as a
feature, but it should be documented for users that do encounter it.
@gwossum gwossum force-pushed the gw/improved_env_overrides branch from 9e72fd5 to 19d259e Compare April 16, 2026 21:06
Fix issue with growing a slice of pointers to a scalar using unindexed
CSV environment variables.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants