Skip to content

Generated Fast Marshal Panics with Buffer Overflow in EncodeNested for proto3 Fields with Implicit Presence #183

@francoposa

Description

@francoposa

Title

Generated Fast Marshal Panics with Buffer Overflow in EncodeNested for proto3 Fields with Implicit Presence

Version

v0.35.0

Description

EncodeNested comes into play when the messages are nested more than one level deep.

A proto3 message with implicit presence will correctly report its Size as 0 when its members are equal to the default values
However, EncodeNested will still write the tag and size into the buffer, taking up two bytes for each message that should have implicit presence and therefore not be written into the buffer at all.
Those bytes were not accounted for in the initial buffer size allocation by Marshal before it calls into MarshalTo, so if there are any other fields to be serialized in the message, it will panic.

Note that this does not end up being true when dealing with implicit/explicit presence: https://protobuf.dev/reference/go/size/

Checking if proto.Size returns 0 is an easy way to recognize empty messages:

if proto.Size(m) == 0 {
    // No fields set (or, in proto3, all fields matching the default);
    // skip processing this message, or return an error, or similar.
}

Go's reflect-based marshaling itself has extensive code and full internal presence package to track field presence and resize the marshaling buffer if needed.

Repro steps

Full working code @ https://github.com/francoposa/proto-demo/tree/francoposa/prometheus-rw2-csproto-bug-repro

Simplified Prometheus Remote Write V2 Proto

syntax = "proto3";
package io.prometheus.write.v2;

option go_package = "github.com/prometheus/prometheus/write/v2;writev2";

// Request represents a request to write the given timeseries to a remote destination.
message Request {
  reserved 1 to 3;
  repeated string symbols = 4;
  repeated TimeSeries timeseries = 5;
}

// TimeSeries represents a single series.
message TimeSeries {
  repeated uint32 labels_refs = 1 [packed=false];

  repeated Sample samples = 2;
  reserved 3;
  reserved 4;

  Metadata metadata = 5;

  int64 created_timestamp = 6;
}

// Sample represents series sample.
message Sample {
  double value = 1;
  int64 timestamp = 2;
}

// Metadata represents the metadata associated with the given series' samples.
message Metadata {
  enum MetricType {
    METRIC_TYPE_UNSPECIFIED    = 0;
  }
  MetricType type = 1;
  uint32 help_ref = 3;
  uint32 unit_ref = 4;
}

Test to Repro Against the Generated fastmarshal Code

package writev2

import (
	"testing"

	"github.com/CrowdStrike/csproto"
	"github.com/stretchr/testify/assert"
)

func TestMarshalZeroValues(t *testing.T) {
	for _, tt := range []struct {
		name   string
		msg    csproto.Marshaler
		panics bool
	}{
		{
			name: "nonempty-zero-value-single-field-member",
			// does not panic because total size for entire message reports as zero;
			// there is no buffer and no serialization is performed
			msg: &TimeSeries{
				Samples: []*Sample{
					{
						Value:     0,
						Timestamp: 0,
					},
				},
			},
			panics: false,
		},
		{
			name: "nonempty-zero-value-single-nested-member-with-member-before",
			// panics because total size for each Sample message is report as zero
			// so bytes were not accounted for in the initial buffer size calculation,
			// but EncodeNested will still encode the tag and length for each Sample message;
			// buffer overflows when the Sample tag is written to it
			msg: &TimeSeries{
				LabelsRefs: []uint32{0, 1},
				Samples: []*Sample{
					{
						Value:     0,
						Timestamp: 0,
					},
				},
			},
			panics: true,
		},
		{
			name: "nonempty-zero-value-single-nested-member-with-member-after",
			// panics because total size for each Sample message is report as zero
			// so bytes were not accounted for in the initial buffer size calculation,
			// but EncodeNested will still encode the tag and length for each Sample message;
			// buffer overflows when the next (Metadata) field is written to it
			msg: &TimeSeries{
				Samples: []*Sample{
					{
						Value:     0,
						Timestamp: 0,
					},
				},
				Metadata: &Metadata{
					Type: Metadata_METRIC_TYPE_UNSPECIFIED,
				},
			},
			panics: true,
		},
	} {
		t.Run(tt.name, func(t *testing.T) {
			// csproto marshal
			if tt.panics {
				assert.Panics(t, func() {
					_, _ = tt.msg.Marshal()
				})
			} else {
				_, err := tt.msg.Marshal()
				assert.NoError(t, err)
			}
		})
	}
}

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingtriageIssues that need to be reviewed by maintainers

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions