Merged
1 change: 0 additions & 1 deletion .github/CODE_OF_CONDUCT.md
Original file line number Diff line number Diff line change
@@ -89,4 +89,3 @@ This Code of Conduct is adapted from the Contributor Covenant, version 3.0, perm
Contributor Covenant is stewarded by the Organization for Ethical Source and licensed under CC BY-SA 4.0. To view a copy of this license, visit [https://creativecommons.org/licenses/by-sa/4.0/](https://creativecommons.org/licenses/by-sa/4.0/)

For answers to common questions about Contributor Covenant, see the FAQ at [https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq). Translations are provided at [https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations). Additional enforcement and community guideline resources can be found at [https://www.contributor-covenant.org/resources](https://www.contributor-covenant.org/resources). The enforcement ladder was inspired by the work of [Mozilla’s code of conduct team](https://github.com/mozilla/inclusion).

2 changes: 1 addition & 1 deletion .github/workflows/build_publish_docker.yml
@@ -72,4 +72,4 @@ jobs:
labels: ${{ steps.meta.outputs.labels }}
file: ./docker/dockerfiles/Dockerfile.${{ matrix.container }}
cache-from: type=gha
cache-to: type=gha,mode=max
cache-to: type=gha,mode=max
2 changes: 1 addition & 1 deletion .gitignore
@@ -335,4 +335,4 @@ docs/api/
!vcpkg.json
!**/vcpkg.json
!compile_commands.json
vcpkg_installed
vcpkg_installed
2 changes: 1 addition & 1 deletion .readthedocs.yml
@@ -20,4 +20,4 @@ python:
- requirements: requirements/requirements.logserver.txt

sphinx:
configuration: docs/conf.py
configuration: docs/conf.py
2 changes: 1 addition & 1 deletion README.md
@@ -109,7 +109,7 @@ values in detail.

### Testing Your Own Data

If you want to ingest data to the pipeline, you can do so via the zeek container. Either select the interface in the `config.yaml` zeek should be listening on and set `static_analysis: false` or provide PCAPs to Zeek by adding them in the `data/test_pcaps` directory, which is mounted per default for Zeek to ingest static data.
If you want to ingest data into the pipeline, you can do so via the Zeek container. Either select the interface Zeek should listen on in `config.yaml` and set `static_analysis: false`, or provide PCAPs to Zeek by placing them in the `data/test_pcaps` directory, which is mounted by default so Zeek can ingest static data.
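For instance, a live-capture setup might look like the following `config.yaml` fragment. The key names shown here are illustrative assumptions based on the description above, not the project's actual schema; check the shipped `config.yaml` for the real keys:

```yaml
zeek:
  # Interface to capture live traffic from (assumed key name)
  interface: eth0
  # Disable static PCAP analysis when listening on a live interface
  static_analysis: false
```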

### Monitoring
To monitor the system and observe its real-time behavior, multiple Grafana dashboards have been set up.
2 changes: 1 addition & 1 deletion assets/heidgaf_architecture.svg
2 changes: 1 addition & 1 deletion config.yaml
@@ -19,7 +19,7 @@ pipeline:
log_storage:
logserver:
input_file: "/opt/file.txt"




2 changes: 1 addition & 1 deletion docs/pipeline.rst
@@ -840,4 +840,4 @@ Domainator Detector
The :class:`DomainatorDetector` consumes anomalous batches of requests.
It identifies potential data exfiltration and command & control on the subdomain level by analyzing characteristics of the subdomains.
Messages are grouped by domain into fixed-size windows to allow for sequential anomaly detection. The detector leverages machine learning based on statistical and linguistic features extracted from the domain name,
including label lengths, character frequencies, entropy measures, and counts of different character types across domain name levels.
including label lengths, character frequencies, entropy measures, and counts of different character types across domain name levels.
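The statistical features described above can be sketched roughly as follows. This is an illustrative sketch only — the function name and the exact feature set are assumptions, not the detector's actual implementation:

```python
import math
from collections import Counter


def label_features(domain: str) -> dict:
    """Compute simple statistical features per domain label (illustrative)."""
    labels = domain.strip(".").split(".")
    features = {}
    # Label lengths across domain name levels
    features["num_labels"] = len(labels)
    features["max_label_len"] = max(len(label) for label in labels)
    # Treat the leftmost label as the subdomain when one exists
    subdomain = labels[0] if len(labels) > 2 else ""
    # Counts of different character types on the subdomain level
    features["digit_count"] = sum(c.isdigit() for c in subdomain)
    features["alpha_count"] = sum(c.isalpha() for c in subdomain)
    # Shannon entropy of the subdomain's character distribution
    counts = Counter(subdomain)
    total = len(subdomain)
    if total:
        features["entropy"] = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
    else:
        features["entropy"] = 0.0
    return features
```

A high-entropy, digit-heavy subdomain such as `a8f3k2` scores very differently from a dictionary word, which is what lets a model separate encoded exfiltration payloads from benign labels.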
2 changes: 1 addition & 1 deletion requirements/requirements.detector.txt
@@ -7,4 +7,4 @@ confluent-kafka~=2.4.0
marshmallow_dataclass~=8.7.1
clickhouse_connect~=0.8.3
pylcs
Levenshtein
Levenshtein
2 changes: 1 addition & 1 deletion requirements/requirements.train.txt
@@ -13,4 +13,4 @@ seaborn
lightgbm
imblearn
pylcs
Levenshtein
Levenshtein
46 changes: 27 additions & 19 deletions src/alerter/alerter.py
@@ -31,6 +31,7 @@ class AlerterAbstractBase(ABC):
"""
Abstract base class for all alerter implementations.
"""

@abstractmethod
def __init__(self, alerter_config, consume_topic) -> None:
pass
@@ -51,6 +52,7 @@ class AlerterBase(AlerterAbstractBase):
executing custom processing via plugins, and performing base actions
like logging to a file or forwarding to an external Kafka topic.
"""

def __init__(self, alerter_config, consume_topic) -> None:
self.name = alerter_config.get("name", "generic")
self.consume_topic = consume_topic
@@ -59,25 +61,28 @@ def __init__(self, alerter_config, consume_topic) -> None:
self.key = None

self.kafka_consume_handler = ExactlyOnceKafkaConsumeHandler(self.consume_topic)

# Base actions config
self.log_to_file = ALERTING_CONFIG.get("log_to_file", False)
self.log_file_path = ALERTING_CONFIG.get("log_file_path", "/opt/logs/alerts.txt")
self.log_file_path = ALERTING_CONFIG.get(
"log_file_path", "/opt/logs/alerts.txt"
)
self.log_to_kafka = ALERTING_CONFIG.get("log_to_kafka", False)
self.external_kafka_topic = ALERTING_CONFIG.get("external_kafka_topic", "external_alerts_topic")
self.external_kafka_topic = ALERTING_CONFIG.get(
"external_kafka_topic", "external_alerts_topic"
)

if self.log_to_file:
ensure_directory(self.log_file_path)

if self.log_to_kafka:
self._setup_kafka_output_topics()


def _setup_kafka_output_topics(self):
"""
Ensure that the external Kafka topic exists.
Since no internal consumer subscribes to this topic, auto-creation
Ensure that the external Kafka topic exists.

Since no internal consumer subscribes to this topic, auto-creation
via consumer polling won't happen. We use AdminClient to ensure
the topic exists before producing to it.
"""
@@ -92,10 +97,12 @@ def _setup_kafka_output_topics(self):
try:
admin_client.create_topics([NewTopic(self.external_kafka_topic, 1, 1)])
except Exception as e:
logger.warning(f"Could not auto-create topic {self.external_kafka_topic}: {e}")

logger.warning(
f"Could not auto-create topic {self.external_kafka_topic}: {e}"
)

self.kafka_produce_handler = ExactlyOnceKafkaProduceHandler()

def get_and_fill_data(self) -> None:
if self.alert_data:
logger.warning(
@@ -121,7 +128,7 @@ def _log_to_file_action(self):
"""
if not self.log_to_file:
return

logger.info(f"{self.name}: Logging alert to file {self.log_file_path}")
try:
with open(self.log_file_path, "a+") as f:
@@ -138,7 +145,9 @@ def _log_to_kafka_action(self):
if not self.log_to_kafka:
return

logger.info(f"{self.name}: Forwarding alert to topic {self.external_kafka_topic}")
logger.info(
f"{self.name}: Forwarding alert to topic {self.external_kafka_topic}"
)
try:
self.kafka_produce_handler.produce(
topic=self.external_kafka_topic,
@@ -151,7 +160,7 @@ def _log_to_kafka_action(self):

def bootstrap_alerter_instance(self):
"""
Main loop for the alerter instance.
Main loop for the alerter instance.
Consumes alerts, processes them, and executes base actions.
"""
logger.info(f"Starting {self.name} Alerter")
@@ -185,18 +194,17 @@ async def start(self):
await loop.run_in_executor(None, self.bootstrap_alerter_instance)



async def main():
tasks = []

# Setup Generic Alerter Task
generic_topic = f"{CONSUME_TOPIC_PREFIX}-generic"
logger.info("Initializing Generic Alerter")
logger.info("Initializing Generic Alerter")
class_name = "GenericAlerter"
mod_name = f"{PLUGIN_PATH}.generic_alerter"
module = importlib.import_module(mod_name)
AlerterClass = getattr(module, class_name)

generic_alerter = AlerterClass(
alerter_config={"name": "generic"}, consume_topic=generic_topic
)
@@ -211,12 +219,12 @@ async def main():
mod_name = f"{PLUGIN_PATH}.{alerter_config['alerter_module_name']}"
module = importlib.import_module(mod_name)
AlerterClass = getattr(module, class_name)

alerter_instance = AlerterClass(
alerter_config=alerter_config, consume_topic=consume_topic
)
tasks.append(asyncio.create_task(alerter_instance.start()))

await asyncio.gather(*tasks)


8 changes: 4 additions & 4 deletions src/alerter/plugins/generic_alerter.py
@@ -11,15 +11,15 @@

class GenericAlerter(AlerterBase):
"""
Specific implementation for an Alerter that processes alerts
from a generic topic.
Specific implementation for an Alerter that processes alerts
from a generic topic.
It performs no additional processing or transformation by itself,
instead relying solely on the base actions (logging to file/Kafka).
"""

def process_alert(self):
"""
Generic implementation: no special processing needed.
"""
pass

2 changes: 1 addition & 1 deletion src/base/utils.py
@@ -233,4 +233,4 @@ def generate_collisions_resistant_uuid():
def ensure_directory(file_path):
directory = os.path.dirname(file_path)
if directory:
os.makedirs(directory, exist_ok=True)
os.makedirs(directory, exist_ok=True)
19 changes: 13 additions & 6 deletions src/detector/detector.py
@@ -317,7 +317,9 @@ def detect(self) -> None:
y_pred = self.predict(message)
logger.info(f"Prediction: {y_pred}")
# TODO: DO NOT USE if TRUE for prod!!!
if True: # np.argmax(y_pred, axis=1) == 1 and y_pred[0][1] > self.threshold:
if (
True
): # np.argmax(y_pred, axis=1) == 1 and y_pred[0][1] > self.threshold:
logger.info("Append malicious request to warning.")
warning = {
"request": message,
@@ -361,11 +363,11 @@ def send_warning(self) -> None:
"src_ip": self.key,
"alert_timestamp": datetime.datetime.now().isoformat(),
"suspicious_batch_id": str(self.suspicious_batch_id),
"detector_name": self.name
"detector_name": self.name,
}

logger.info(f"Producing alert to Kafka: {alert}")

for topic in self.produce_topics:
self.kafka_produce_handler.produce(
topic=topic,
@@ -526,13 +528,16 @@ async def main():  # pragma: no cover
"""
# ensure all detectors configure what to do
# instead of doing ensure alert directly we now use alerter topics

tasks = []
for detector_config in DETECTORS:
consume_topic = f"{CONSUME_TOPIC_PREFIX}-{detector_config['name']}"
produce_topics_str = detector_config.get("produce_topics", "")
if produce_topics_str:
produce_topics = [f"{PRODUCE_TOPIC_PREFIX}-{t.strip()}" for t in produce_topics_str.split(",")]
produce_topics = [
f"{PRODUCE_TOPIC_PREFIX}-{t.strip()}"
for t in produce_topics_str.split(",")
]
else:
produce_topics = [f"{PRODUCE_TOPIC_PREFIX}-generic"]

@@ -541,7 +546,9 @@ async def main():  # pragma: no cover
module = importlib.import_module(module_name)
DetectorClass = getattr(module, class_name)
detector = DetectorClass(
detector_config=detector_config, consume_topic=consume_topic, produce_topics=produce_topics
detector_config=detector_config,
consume_topic=consume_topic,
produce_topics=produce_topics,
)
tasks.append(asyncio.create_task(detector.start()))
await asyncio.gather(*tasks)