Erlang SNMP Profiler

I observed some strange behavior with Erlang's SNMP support: response packets would arrive at a server's network interface promptly, but there was a long delay between the packet's arrival and the time the Erlang function returned the data. This small program collects profiling data about SNMP performance. Run it from various locations and point it at various targets to compare performance. We hope this will help narrow down the performance problem.

Dev mode

This is meant to run in an Erlang Docker container for easy portability. Start one from this project dir with:

```
docker run --rm -it -v $(pwd):/app -w /app -p 21312:21312 \
    erlang:18.3.4.11 bash
```

Three shell scripts give you basic functionality:

compile.sh compiles the project
run.sh is the main runner for both dev and real usage; compile first
clean.sh cleans compiled artifacts; you could also just run git clean -fdx

Distribution

The project is built and archived to a tar.gz file that's available on the GitHub Releases page. In its current state, you need both docker and docker-compose to run it. Then:

1. Download and extract the archive.
2. Change to the extracted directory, like cd snmp-profiler-X.X.X.
3. Run docker-compose up.
4. Open a new terminal, and change to the extracted directory again.
5. Run docker ps, and find the name or id of the running erlang container.
6. Run docker exec -it -w /app $CONTAINER_ID bash.
7. Run the project with ./run.sh [OPTIONS].
8. View the output in Grafana, which is running on port 3000. For security, it's bound only to the local interface, so you'll need to make some kind of proxy/tunnel. I recommend sshuttle, such as
```
sshuttle -r $ADDRESS $ADDRESS
```
where $ADDRESS is the server running the tool. Then browse to http://$ADDRESS:3000. Otherwise, you can use plain ssh to establish a tunnel, like
```
ssh -L 1234:localhost:3000 $ADDRESS
```
and then browse to http://localhost:1234.

Known issues/TODO

Build in docker-compose so the user doesn't have to go get it themselves.
Make running the project simpler overall. There should only be three steps: download, extract, and execute.
Figure out what other metrics make sense to track. Should metric names include the switch name? Like instead of snmp_profiler.sync_get_next, it could be snmp_profiler.ord1.a3-1-1.sync_get_next.
Verify that all emitted metrics are making it to Graphite/Grafana. I've seen some discrepancies that make me think they're not. Try checking a count of emitted metrics against some of the counts in Grafana.

The current approach of mounting directories into containers can cause issues on hardened servers:

docker-compose seems to use /tmp for some intermediate script(s), and if /tmp is mounted with noexec, it'll fail with an error like this:
```
./docker-compose: error while loading shared libraries: libz.so.1: failed to map segment from shared object: Operation not permitted
```
To solve this, you can create a "temporary temporary" directory in the project directory, like mkdir tmp and then export TMPDIR=tmp to tell docker-compose to use that instead.
The Grafana mounted files at grafana/mounts have to be readable by the grafana user in the Grafana container. If the file permissions are too restrictive when you extract the archive, Grafana will fail to start with an error like
```
CRIT[10-08|14:50:43] failed to parse "/etc/grafana/grafana.ini": open /etc/grafana/grafana.ini: permission denied
```
To solve this, you can run chmod a+rX grafana.
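
The two workarounds above can be collected into one small shell sketch. The function name prepare_hardened_host is made up for illustration and is not part of the project. It also uses an absolute path for TMPDIR and a recursive chmod; both are assumptions on my part, not something the project prescribes.

```sh
# Sketch only: prepare_hardened_host is a hypothetical helper, not a project script.
# Usage: prepare_hardened_host /path/to/extracted/snmp-profiler-X.X.X
prepare_hardened_host() {
    project_dir=$1

    # Give docker-compose a scratch directory that is not on a noexec /tmp.
    # An absolute path is used here (an assumption) so it stays valid after cd.
    mkdir -p "$project_dir/tmp" || return 1
    TMPDIR="$project_dir/tmp"
    export TMPDIR

    # Make the Grafana mounts readable by the grafana user in the container.
    # -R is an assumption: it extends the fix to the files below grafana/.
    chmod -R a+rX "$project_dir/grafana" || return 1
}
```

The function has to be defined in (or sourced into) the same shell that later runs docker-compose up, otherwise the exported TMPDIR is lost. From the extracted directory, it would be called as prepare_hardened_host "$(pwd)".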

Design

The goal is to test Erlang snmpm:sync_get_next() calls. In troubleshooting, I've observed that this function sends a request packet and, according to a tcpdump of the interface, the response packet is received very quickly; however, there's a long delay before the Erlang function returns the data. This project focuses on observing the behavior of that function.

It allows you to test the performance of a single device or a group of them defined by an input file. For each test run, it produces a report with useful data for comparing against other runs. It will be run by non-developers, so its interface needs to be relatively clean and easy to use.

The run.sh script does initial input arg parsing and passes all args through to the Erlang program. A dedicated config module further interprets, validates, and stores all the input arguments so they're sanitized and available to the rest of the application. For sanity, the arg names are the same at the shell script level as inside the Erlang program, except that in some places hyphens have to become underscores because of language constraints.
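
To illustrate the naming rule, here is a minimal shell sketch of the hyphen-to-underscore mapping. In the project this conversion belongs to the Erlang side, where an unquoted atom cannot contain a hyphen; it is not a shell function. The name to_config_key and the --poll-interval flag are hypothetical and used only for this example.

```sh
# Sketch only: to_config_key is a hypothetical helper, not part of run.sh.
# Maps a command-line flag name to the identifier form used inside the program:
# leading dashes are dropped and the remaining hyphens become underscores.
to_config_key() {
    printf '%s\n' "$1" | sed -e 's/^--*//' -e 's/-/_/g'
}

# --poll-interval is a made-up flag name for illustration.
to_config_key --poll-interval   # prints: poll_interval
```

Keeping the mapping this mechanical means a user who knows the flag name can always predict the internal name, and the reverse.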

A primitive logging mechanism lets users choose between three levels of verbosity via command line flags. All logs are written to stdout.

The program should abort with a good error message and non-zero exit status for any recognizable error. There's a die() function for this purpose. Be sure to tag internal failures as such. An example of an internal failure would be asking the config module for a config item that doesn't exist.
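
The abort pattern can be sketched in shell as follows. The project's actual die() is part of the Erlang code and its signature isn't shown in this README, so the argument form, the message format, the use of stderr, and the config_get lookup with its log_level item are all assumptions made for illustration.

```sh
# Sketch only: the real die() is part of the Erlang program; this shell
# version and its arguments are assumptions for illustration.
# $1 is the failure kind ("user" or "internal"), the rest is the message.
die() {
    kind=$1
    shift
    printf 'ERROR (%s): %s\n' "$kind" "$*" >&2
    exit 1
}

# Hypothetical config lookup: asking for an item that was never defined is
# a programming mistake, so it is tagged as an internal failure.
config_get() {
    case $1 in
        log_level) printf '%s\n' "normal" ;;
        *) die internal "unknown config item: $1" ;;
    esac
}
```

Calling config_get no_such_item directly prints the tagged error and exits with status 1. Inside a command substitution, only the subshell exits, so the caller still has to check the status.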
