Replies: 1 comment
I think this seems reasonable to me, and we have deployed similar approaches in the past, including the S3 output plus another output for the exact needs you describe. That also gives you the option to replay from S3 if the less-available stack falls over.
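For reference, the dual fan-out the comment describes might look roughly like this in Fluent Bit's classic config format. This is a sketch only; the bucket name, OpenSearch host, and paths are placeholders, not values from the thread:

```conf
# Sketch: one Fluent Bit log router writing every record to both S3
# (immutable archive) and OpenSearch (day-to-day search).
[SERVICE]
    storage.path    /var/lib/fluent-bit/buffer   # filesystem buffering for retries

[OUTPUT]
    Name            s3
    Match           *
    bucket          example-log-archive          # placeholder bucket
    region          eu-west-1
    total_file_size 50M
    upload_timeout  10m
    store_dir       /var/lib/fluent-bit/s3

[OUTPUT]
    Name            opensearch
    Match           *
    Host            opensearch.example.internal  # placeholder host
    Port            9200
    Index           logs
    tls             On
```

With filesystem storage enabled, either output can fall behind or go down temporarily without the other losing data, which is what makes the S3 replay option possible.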
Hello!
I am currently designing a logging pipeline for our organization. We need to ingest a variety of log inputs, including syslog, and we need to make the pipeline as robust as possible against log loss caused by outages within the pipeline itself.
The general architecture I'm considering looks something like this:
```mermaid
graph TD
  subgraph server [Application server]
    server_logfiles[(Log files)] -->|tail| server_fb
    eventlog[(Event log)] -->|winlog| server_fb
    server_fb[fluentbit]
  end
  subgraph syslog_cluster["Syslog Receiver Cluster"]
    syslog-collector-vip((Syslog VIP))
    syslog_active[Active Syslog Receiver]
    syslog_standby[Standby Syslog Receiver]
  end
  syslog-collector-vip -->|syslog| syslog_active -->|forward| logrouter
  syslog-collector-vip -.->|syslog| syslog_standby -->|forward| logrouter
  syslog-sender[Network device or appliance]
  server_fb -->|forward| logrouter
  syslog-sender -->|syslog| syslog-collector-vip
  logrouter[Log router / concentrator]
  logrouter -->|S3| s3[(S3 Bucket with Object Lock)]
  logrouter -->|OpenSearch| opensearch[(OpenSearch cluster)]
```

(Incidentally, the reason for the dual write to S3 and OpenSearch is to fulfill two requirements that are not easily combined into one system: storing the logs immutably for forensic reasons, and being able to query them efficiently for day-to-day operations. But that's not what's important right now.)
The idea is that, for systems where this is possible, we should run fluentbit directly on those systems in order to collect log data in a distributed fashion. But this won't be possible for every system out there - some systems will only be able to send syslog.
There are many advantages to running Fluent Bit directly on each managed host, the main one being that it allows for distributed buffering and backpressure in case the central log router / concentrator experiences a spike or is down for a few minutes for maintenance. That's not a big deal: the log files will still be there when the log router is back up and running again.
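The on-host buffering described above can be sketched with Fluent Bit's filesystem storage, which spools chunks to disk and retries until the router is reachable again. Paths and the upstream address below are placeholders:

```conf
# Sketch: on-host Fluent Bit that survives a log-router outage by
# buffering chunks on disk instead of only in memory.
[SERVICE]
    storage.path              /var/lib/fluent-bit/buffer
    storage.backlog.mem_limit 50M

[INPUT]
    Name          tail
    Path          /var/log/app/*.log           # placeholder path
    storage.type  filesystem                   # spool to disk under storage.path

[OUTPUT]
    Name          forward
    Match         *
    Host          logrouter.example.internal   # placeholder host
    Port          24224
    Retry_Limit   False                        # retry indefinitely, don't drop
```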
But that's simply not the case with syslog. Syslog is akin to your network devices screaming into the void, throwing a UDP packet in the general direction of a syslog server and hoping a process is there to receive it. No matter what you do, some event loss will happen. I understand that, I just want to minimize how much loss there is. That's why the architecture diagram above calls for syslog receivers specifically to be run on two dedicated servers in an active/passive HA cluster, sharing a virtual IP using VRRP (probably using something like keepalived).
The path most travelled seems to be to set up something like rsyslog or syslog-ng and then have Fluent Bit ingest its logs, but that is an extra hop that adds complexity and another potential point of loss, and I'd rather have a simpler approach if one is workable. Fluent Bit can receive syslog directly, so why not just use that?
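For what it's worth, the direct-ingest variant would be a config along these lines, using Fluent Bit's built-in syslog input. This is a sketch under assumptions: port 5140 is chosen because binding 514 needs elevated privileges, and the forward target is a placeholder:

```conf
# Sketch: Fluent Bit receiving syslog directly, with no rsyslog/syslog-ng
# in between, and disk buffering toward the central log router.
[SERVICE]
    storage.path  /var/lib/fluent-bit/buffer

[INPUT]
    Name          syslog
    Mode          udp                          # tcp and unix sockets also supported
    Listen        0.0.0.0
    Port          5140
    Parser        syslog-rfc3164
    storage.type  filesystem

[OUTPUT]
    Name          forward
    Match         *
    Host          logrouter.example.internal   # placeholder host
    Port          24224
```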
And that's the question: has anyone actually deployed Fluent Bit this way, with a sustainable approach to monitoring its health and its capacity to receive and process syslog messages, in a form that keepalived can poll for its health checks?
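One way to wire this up, assuming Fluent Bit's built-in monitoring HTTP server is enabled (`HTTP_Server On` and `Health_Check On` in the `[SERVICE]` section, which exposes `GET /api/v1/health` on port 2020 by default), is a small probe script that keepalived invokes via `vrrp_script`. The script name and URL here are illustrative, not from the thread:

```python
# Hypothetical health probe for a Fluent Bit syslog receiver. keepalived
# would run this via a vrrp_script block and fail the VIP over to the
# standby node when the probe starts returning a non-zero exit code.
import urllib.error
import urllib.request


def fluentbit_healthy(url="http://127.0.0.1:2020/api/v1/health", timeout=2.0):
    """Return True if Fluent Bit's health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error status, etc.
        return False
```

A wrapper that calls `fluentbit_healthy()` and exits 0 on success / 1 on failure is all keepalived needs, since `vrrp_script` treats exit code 0 as healthy. Note this only proves the HTTP server is alive; if you want to catch a wedged input or a full buffer, you could extend the probe to inspect `/api/v1/metrics` for stalled record counts as well.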
And is this even a reasonable approach? Or would I be better off trying to poll logs from my devices with a perl script and a cron job?