NSM's forwarder-vpp fails to start in a k8s cluster running on an ARM machine (Apple M1).
I tried starting it both with and without Rosetta; the error is the same in both cases.
Here is the log from the forwarder:
Jul 15 15:11:30.600 [INFO] [cmd:/bin/forwarder] Setting env variable DLV_LISTEN_FORWARDER to a valid dlv '--listen' value will cause the dlv debugger to execute this binary and listen as directed.
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] there are 9 phases which will be executed followed by a success message:
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] the phases include:
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 1: get config from environment
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 2: run vpp and get a connection to it
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 3: get SR-IOV config from file
Jul 15 15:11:30.602 [INFO] [cmd:/bin/forwarder] 4: init pools
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 5: start device plugin server
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 6: retrieve spiffe svid
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 7: create xconnect network service endpoint
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 8: create grpc server and register xconnect
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] 9: register xconnectns with the registry
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] a final success message with start time duration
Jul 15 15:11:30.603 [INFO] [cmd:/bin/forwarder] executing phase 1: get config from environment (time since start: 805.75µs)
This application is configured via the environment. The following environment
variables can be used:
KEY                        TYPE                                         DEFAULT                                              REQUIRED  DESCRIPTION
NSM_NAME                   String                                       forwarder                                                      Name of Endpoint
NSM_LABELS                 Comma-separated list of String:String pairs  p2p:true                                                       Labels related to this forwarder-vpp instance
NSM_NSNAME                 String                                       forwarder                                                      Name of Network Service to Register with Registry
NSM_CONNECT_TO             URL                                          unix:///connect.to.socket                                      url to connect to
NSM_LISTEN_ON              URL                                          unix:///listen.on.socket                                       url to listen on
NSM_MAX_TOKEN_LIFETIME     Duration                                     10m                                                            maximum lifetime of tokens
NSM_LOG_LEVEL              String                                       INFO                                                           Log level
NSM_DIAL_TIMEOUT           Duration                                     100ms                                                          Timeout for the dial the next endpoint
NSM_OPENTELEMETRYENDPOINT  String                                       otel-collector.observability.svc.cluster.local:4317            OpenTelemetry Collector Endpoint
NSM_TUNNEL_IP              String                                                                                                      IP to use for tunnels
NSM_VXLAN_PORT             Unsigned Integer                             0                                                              VXLAN port to use
NSM_VPP_API_SOCKET         String                                       /var/run/vpp/external/vpp-api.sock                             filename of socket to connect to existing VPP instance. If empty a VPP instance is run in forwarder
NSM_VPP_INIT               Func                                         NONE                                                           type of VPP initialization. Must be NONE or AF_PACKET
NSM_RESOURCE_POLL_TIMEOUT  Duration                                     30s                                                            device plugin polling timeout
NSM_DEVICE_PLUGIN_PATH     String                                       /var/lib/kubelet/device-plugins/                               path to the device plugin directory
NSM_POD_RESOURCES_PATH     String                                       /var/lib/kubelet/pod-resources/                                path to the pod resources directory
NSM_DEVICE_SELECTOR_FILE   String                                                                                                      config file for device name to label matching
NSM_SRIOV_CONFIG_FILE      String                                                                                                      PCI resources config path
NSM_PCI_DEVICES_PATH       String                                       /sys/bus/pci/devices                                           path to the PCI devices directory
NSM_PCI_DRIVERS_PATH       String                                       /sys/bus/pci/drivers                                           path to the PCI drivers directory
NSM_CGROUP_PATH            String                                       /host/sys/fs/cgroup/devices                                    path to the host cgroup directory
NSM_VFIO_PATH              String                                       /host/dev/vfio                                                 path to the host VFIO directory
Jul 15 15:11:30.621 [INFO] [cmd:/bin/forwarder] Config: &config.Config{Name:"forwarder-vpp-9mgxg", Labels:map[string]string{"p2p":"true"}, NSName:"forwarder", ConnectTo:url.URL{Scheme:"unix", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/var/lib/networkservicemesh/nsm.io.sock", RawPath:"", ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}, ListenOn:url.URL{Scheme:"unix", Opaque:"", User:(*url.Userinfo)(nil), Host:"", Path:"/listen.on.sock", RawPath:"", ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}, MaxTokenLifetime:600000000000, LogLevel:"TRACE", DialTimeout:100000000, OpenTelemetryEndpoint:"otel-collector.observability.svc.cluster.local:4317", TunnelIP:net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xac, 0x13, 0x0, 0x3}, VxlanPort:0x0, VppAPISocket:"/var/run/vpp/external/vpp-api.sock", VppInit:vppinit.Func{f:(func(context.Context, api.Connection, net.IP) (net.IP, error))(0xc875e0)}, ResourcePollTimeout:30000000000, DevicePluginPath:"/var/lib/kubelet/device-plugins/", PodResourcesPath:"/var/lib/kubelet/pod-resources/", DeviceSelectorFile:"", SRIOVConfigFile:"", PCIDevicesPath:"/sys/bus/pci/devices", PCIDriversPath:"/sys/bus/pci/drivers", CgroupPath:"/host/sys/fs/cgroup/devices", VFIOPath:"/host/dev/vfio"}
Jul 15 15:11:30.622 [INFO] [cmd:/bin/forwarder] [duration:18.244709ms] completed phase 1: get config from environment
Jul 15 15:11:30.622 [INFO] [cmd:/bin/forwarder] executing phase 2: run vpp and get a connection to it (time since start: 19.806667ms)
Jul 15 15:11:30.623 [INFO] Configuration file: "/etc/vpp/helper/vpp.conf" not found, using defaults
Jul 15 15:11:30.633 [INFO] [cmd:/bin/forwarder] local vpp is being used
Jul 15 15:11:30.634 [INFO] [cmd:/bin/forwarder] [duration:11.589958ms] completed phase 2: run vpp and get a connection to it
Jul 15 15:11:30.634 [WARN] [cmd:/bin/forwarder] skipping phases 3-5: no PCI resources config
Jul 15 15:11:30.634 [WARN] [cmd:/bin/forwarder] SR-IOV is not enabled
Jul 15 15:11:30.634 [INFO] [cmd:/bin/forwarder] executing phase 6: retrieving svid, check spire agent logs if this is the last line you see (time since start: 32.209ms)
Jul 15 15:11:30.898 [INFO] SVID: "spiffe://example.org/ns/nsm-system/pod/forwarder-vpp-9mgxg"
Jul 15 15:11:30.906 [INFO] [cmd:/bin/forwarder] [duration:271.382458ms] completed phase 6: retrieving svid
Jul 15 15:11:30.906 [INFO] [cmd:/bin/forwarder] executing phase 7: create xconnect network service endpoint (time since start: 304.018959ms)
Jul 15 15:11:30.624 [INFO] [cmd:vpp] vpp[3500]: clib_sysfs_prealloc_hugepages:262: pre-allocating 64 additional 2048K hugepages on numa node 0
Jul 15 15:11:30.624 [INFO] [cmd:vpp] vpp[3500]: buffer: numa[0] falling back to non-hugepage backed buffer pool (vlib_physmem_shared_map_create: pmalloc_map_pages: failed to mmap 64 pages at 0x1000000000 fd 5 numa 0 flags 0x11: Invalid argument)
Jul 15 15:11:33.219 [DEBU] /var/run/vpp/api.sock was created after 2.587428293s
Jul 15 15:11:33.330 [DEBU] successfully connected to /var/run/vpp/api.sock after 110.855416ms and 1 attempts
panic: error: VPPApiError: System call error #1 (-11)
goroutine 1 [running]:
github.com/networkservicemesh/cmd-forwarder-vpp/internal/vppinit.Must(...)
/build/internal/vppinit/vppinit.go:68
main.main()
/build/main.go:239 +0x2f45
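For anyone reading the trace: the failure surfaces as a Go panic instead of a normal log line because vppinit.Must is on the stack. Here is a minimal Go sketch of that pattern as I understand it. The names vppAPIError, must and catch are hypothetical stand-ins written for this illustration, not the real forwarder or govpp code, and only the one error code from the log (-11) is modelled.

```go
package main

import "fmt"

// vppAPIError is a hypothetical stand-in for the error type seen in the
// panic message; it is NOT the real govpp type.
type vppAPIError int32

// Error formats the code the same way the panic message in the log does.
// Only the single code that appears in the log is given a name here.
func (e vppAPIError) Error() string {
	name := "unknown"
	if e == -11 {
		name = "System call error #1"
	}
	return fmt.Sprintf("VPPApiError: %s (%d)", name, int32(e))
}

// must panics on any non-nil error (assumed behaviour of a "Must" helper),
// so the process dies with a stack trace instead of logging and continuing.
func must(err error) {
	if err != nil {
		panic(fmt.Sprintf("error: %+v", err))
	}
}

// catch runs f and returns the panic message, or "" if f did not panic.
func catch(f func()) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	f()
	return ""
}

func main() {
	// Reproduces the shape of the message from the log.
	fmt.Println(catch(func() { must(vppAPIError(-11)) }))
}
```

So the trace only says that some VPP API call made during VPP initialization returned -11; it does not say which call it was.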
The same log is attached as a file:
forwarder-vpp-9mgxg.log