Hello, I opened a discussion some time ago about my Vector config (see #24635). This week we finally went to production and found some problems that we had not seen in our test environment.
Namely, it seems we had some issues with the exec source, which is a bash script that just dumps metadata from the K8s API server and feeds our enrichment_table. We ran the script manually inside the Vector pods and it ran fast with no errors (exit code 0), so the script itself probably wasn't the root cause.
After more troubleshooting we found that we had a lot of stuck jq processes. Since jq is used inside the Lua script, we tried removing the entire lua transform from the transforms section of the Vector config. It turned out that after that everything worked fine and the timeout errors disappeared.
So my reasoning is that our current Lua script is not optimized enough for our current load. You can see it in the section below. Vector's Lua does not support external HTTP libraries, right? So the only way I have to fetch the application name for records that are not present in the enrichment_table is to spawn a curl process.
Do you think I have other alternatives if I want to remediate a missing app name when the record is not present in the enrichment table? In general this seems to happen to less than 1% of our logs, so it's relatively rare, but in absolute numbers it happens thousands of times.
Thank you in advance.
Note the vector config below only includes what I think is in scope for this question.
Vector Config
enrichment_tables:
  k8s_metadata:
    type: "memory"
    ttl_seconds: 1000
    # The table reads events directly from this transform
    inputs: ["app_enricher"]
sources:
  vector_metrics:
    type: internal_metrics
  kube:
    type: "exec"
    command:
      - "/etc/vector/scripts/get_k8s_label.sh"
    mode: "scheduled"
    scheduled:
      exec_interval_secs: 30
    decoding:
      codec: "json"
  logs:
    type: "file"
    include: ["/var/log/containers/*.log"]
    read_from: end
    ignore_older_secs: 20
    offset_key: offset
transforms:
  # Prepare data for the enrichment table
  app_enricher:
    type: "remap"
    inputs: ["kube"]
    source: |
      # Reshape the event so the KEY is the container_id and the VALUE is the app_name.
      # The memory table interprets { "key": "value" } as: lookup("key") -> "value"
      # We want to store: container_id -> { app_name: "..." }
      # So we create a dynamic key using the container_id
      id = .container_id
      app = .app_name
      # Clear the event and set the new structure
      . = {}
      # Set the dynamic key.
      #set!(., [id], app)
      . = set!(., [id], { "app_name": app })
  router:
    type: route
    inputs:
      - set_app
    reroute_unmatched: false
    route:
      needs_enrichment:
        source: "!exists(.app_name)"
        type: vrl
      ok:
        source: "exists(.app_name)"
        type: vrl
  lua_enricher:
    type: lua
    inputs:
      - router.needs_enrichment
    version: "2"
    hooks:
      process: |-
        function (event, emit)
          -- Hardcoded target (as in your curl example)
          local namespace = event.log.namespace
          local pod = event.log.pod_name
          local cmd = table.concat({
            "curl -s",
            "https://endpoint.example.com",
            "| jq -c '.metadata.labels.\"app.kubernetes.io/name\"'",
            "2>&1",
          }, " ")
          local handle = io.popen(cmd)
          if not handle then
            event.log = event.log or {}
            event.log.k8s_label_error = "io.popen failed"
            emit(event)
            return
          end
          local out = handle:read("*a") or ""
          local ok, _, exit_code = handle:close()
          -- Trim output
          out = out:gsub("^%s*(.-)%s*$", "%1")
          event.log = event.log or {}
          if ok and out ~= "" and out ~= "null" then
            -- out is a JSON string like: "shared-shake"
            -- store it as-is or strip quotes:
            local unquoted = out:gsub('^"(.*)"$', "%1")
            event.log.app_name = unquoted
            event.log.k8s_label_source = "curl_apiserver"
            event.log.enrichment_source = "table"
          else
            event.log.k8s_label_error = out
            event.log.k8s_label_exit_code = exit_code
          end
          emit(event)
        end
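One mitigation that might be worth trying is sketched below. It is a rough, untested variant of the lua_enricher above, not a confirmed fix. The idea is to spawn curl at most once per pod per cache window instead of once per event, and to bound each call with curl timeouts so a slow API server cannot leave curl and jq hanging. The names `cache` and `CACHE_TTL_SECS`, the 300-second window, and the timeout values are all assumptions for illustration. The endpoint is the same placeholder used above.

```yaml
transforms:
  lua_enricher:
    type: lua
    inputs:
      - router.needs_enrichment
    version: "2"
    # Evaluated once when the transform starts; globals persist across events.
    source: |-
      cache = {}            -- hypothetical: "namespace/pod" -> { app = "...", at = <os.time()> }
      CACHE_TTL_SECS = 300  -- assumed value, tune for your pod churn
    hooks:
      process: |-
        function (event, emit)
          event.log = event.log or {}
          local key = (event.log.namespace or "") .. "/" .. (event.log.pod_name or "")
          local hit = cache[key]
          if not hit or os.time() - hit.at > CACHE_TTL_SECS then
            -- --connect-timeout and --max-time bound how long curl may block.
            -- jq -r prints the raw string; "// empty" prints nothing for a null label.
            local cmd = table.concat({
              "curl -s --connect-timeout 1 --max-time 2",
              "https://endpoint.example.com",
              "| jq -r '.metadata.labels.\"app.kubernetes.io/name\" // empty'",
              "2>/dev/null",
            }, " ")
            local handle = io.popen(cmd)
            local out = handle and handle:read("*a") or ""
            if handle then handle:close() end
            -- Parentheses keep only the first return value of gsub (the trimmed string).
            local app = (out:gsub("^%s*(.-)%s*$", "%1"))
            -- Failures are cached too (as ""), so a broken lookup is not retried per event.
            hit = { app = app, at = os.time() }
            cache[key] = hit
          end
          if hit.app ~= "" then
            event.log.app_name = hit.app
            event.log.k8s_label_source = "curl_apiserver"
          else
            event.log.k8s_label_error = "lookup failed or label missing"
          end
          emit(event)
        end
```

The lookup is still synchronous, so a cache miss blocks the transform for up to the curl timeout. The cache is also never pruned in this sketch, so on clusters with heavy pod churn it would need an eviction step.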