Skip to content

Regression causing deadlock between File.Close and File.WriteTo #603

@ncw

Description

@ncw

I noticed when I upgraded the dependencies for rclone that the sftp server tests have started deadlocking.

I used git bisect to narrow this down to this commit d1903fb

This made an enormous backtrace which I removed all traces without pkg/sftp in and deduplicated the traces which left us with three.

I think backtrace 1 is irrelevant. What appears to have happened is that a goroutine started (backtrace 3) which is holding the File lock

The code from rclone which causes this is this (backtrace 3)

func (f *Fs) newObjectReader(sftpFile *sftp.File) *objectReader {
	pipeReader, pipeWriter := io.Pipe()
	file := &objectReader{
		f:          f,
		sftpFile:   sftpFile,
		pipeReader: pipeReader,
		done:       make(chan struct{}),
	}
	// Show connection in use
	f.addSession()

	go func() {
		// Use sftpFile.WriteTo to pump data so that it gets a
		// chance to build the window up.
		_, err := sftpFile.WriteTo(pipeWriter)
		// Close the pipeWriter so the pipeReader fails with
		// the same error or EOF if err == nil
		_ = pipeWriter.CloseWithError(err)
		// signal that we've finished
		close(file.done)
	}()

	return file
}

And here (backtrace 2)

// Close a reader of a remote sftp file
func (file *objectReader) Close() (err error) {
	// Close the sftpFile - this will likely cause the WriteTo to error
	err = file.sftpFile.Close()
	// Close the pipeReader so writes to the pipeWriter fail
	_ = file.pipeReader.Close()
	// Wait for the background process to finish
	<-file.done
	// Show connection no longer in use
	file.f.removeSession()
	return err
}

So the problem is that rclone expects the line err = file.sftpFile.Close() to close the sftp File and return an error, while now what happens is that because of the locking added in d1903fb the Close waits indefinitely for the File.WriteTo() to finish which it won't since rclone has stopped putting data into it.

So I think this is a regression - this has worked perfectly for many years!

I haven't made a small reproduction - I probably can now I understand what is going on - but maybe you've got an immediate idea on how to solve this problem?

Backtrace 1

  • goroutine 56 gp=0xc0004048c0 m=nil [sync.Cond.Wait]:
  • runtime.gopark(0xc0000a5ce0?, 0x1?, 0x20?, 0x24?, 0x23?)
    • go1.23/src/runtime/proc.go:424 +0xce fp=0xc000352430 sp=0xc000352410 pc=0x473b6e
  • runtime.goparkunlock(...)
    • go1.23/src/runtime/proc.go:430
  • sync.runtime_notifyListWait(0xc00011e0d0, 0x32c)
    • go1.23/src/runtime/sema.go:587 +0x159 fp=0xc000352480 sp=0xc000352430 pc=0x4753b9
  • sync.(*Cond).Wait(0x0?)
    • go1.23/src/sync/cond.go:71 +0x85 fp=0xc0003524c0 sp=0xc000352480 pc=0x484dc5
  • golang.org/x/crypto/ssh.(*buffer).Read(0xc00007c060, {0xc0001167f8, 0x4, 0x4})
    • /home/ncw/go/pkg/mod/golang.org/x/crypto@v0.29.0/ssh/buffer.go:94 +0x1ed fp=0xc000352548 sp=0xc0003524c0 pc=0x66eb6d
  • golang.org/x/crypto/ssh.(*channel).ReadExtended(0xc0003e40c0, {0xc0001167f8?, 0x1c0004048c0?, 0xc0003525d8?}, 0x40811e?)
    • /home/ncw/go/pkg/mod/golang.org/x/crypto@v0.29.0/ssh/channel.go:363 +0x93 fp=0xc0003525a8 sp=0xc000352548 pc=0x672553
  • golang.org/x/crypto/ssh.(*channel).Read(0x408245?, {0xc0001167f8?, 0x471f94?, 0xc000400e08?})
    • /home/ncw/go/pkg/mod/golang.org/x/crypto@v0.29.0/ssh/channel.go:540 +0x25 fp=0xc0003525e0 sp=0xc0003525a8 pc=0x673265
  • github.com/pkg/sftp.(*conn).Read(0x0?, {0xc0001167f8?, 0xc000352648?, 0x475e49?})
    • <autogenerated>:1 +0x29 fp=0xc000352610 sp=0xc0003525e0 pc=0x6b0029
  • io.ReadAtLeast({0xb6e6c0, 0xc0000ea240}, {0xc0001167f8, 0x4, 0x4}, 0x4)
    • go1.23/src/io/io.go:335 +0x90 fp=0xc000352658 sp=0xc000352610 pc=0x4a93f0
  • io.ReadFull(...)
    • go1.23/src/io/io.go:354
  • github.com/pkg/sftp.recvPacket({0xb6e6c0, 0xc0000ea240}, 0x0, 0x2?)
  • github.com/pkg/sftp.(*conn).recvPacket(...)
  • github.com/pkg/sftp.(*clientConn).recv(0xc0000ea240)
  • github.com/pkg/sftp.NewClientPipe.func1()
  • runtime.goexit({})
    • go1.23/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003527e8 sp=0xc0003527e0 pc=0x47bf21
  • created by github.com/pkg/sftp.NewClientPipe in goroutine 46

Backtrace 2

  • goroutine 1762 gp=0xc0003a96c0 m=nil [sync.Mutex.Lock]:
  • runtime.gopark(0x0?, 0xc000034700?, 0xc0?, 0x20?, 0xc00001fbb8?)
    • go1.23/src/runtime/proc.go:424 +0xce fp=0xc00001fad8 sp=0xc00001fab8 pc=0x473b6e
  • runtime.goparkunlock(...)
    • go1.23/src/runtime/proc.go:430
  • runtime.semacquire1(0xc00038e06c, 0x0, 0x3, 0x1, 0x15)
    • go1.23/src/runtime/sema.go:178 +0x225 fp=0xc00001fb40 sp=0xc00001fad8 pc=0x450725
  • sync.runtime_SemacquireMutex(0xc00001fbb0?, 0xe5?, 0xb?)
    • go1.23/src/runtime/sema.go:95 +0x25 fp=0xc00001fb78 sp=0xc00001fb40 pc=0x475105
  • sync.(*Mutex).lockSlow(0xc00038e068)
    • go1.23/src/sync/mutex.go:173 +0x15d fp=0xc00001fbc8 sp=0xc00001fb78 pc=0x4864fd
  • sync.(*Mutex).Lock(...)
    • go1.23/src/sync/mutex.go:92
  • sync.(*RWMutex).Lock(0x2?)
    • go1.23/src/sync/rwmutex.go:148 +0x31 fp=0xc00001fbf8 sp=0xc00001fbc8 pc=0x4877d1
  • github.com/pkg/sftp.(*File).Close(0xc00038e050)
  • github.com/rclone/rclone/backend/sftp.(*objectReader).Close(0xc0002ac680)
    • github.com/rclone/rclone/backend/sftp/sftp.go:2105 +0x1f fp=0xc00001fc90 sp=0xc00001fc58 pc=0x8cacbf
  • github.com/rclone/rclone/lib/readers.(*LimitedReadCloser).Close(0xc000338480)
    • github.com/rclone/rclone/lib/readers/limited.go:17 +0x24 fp=0xc00001fce0 sp=0xc00001fc90 pc=0x7c8d44
  • github.com/rclone/rclone/fstest/fstests.ReadObject({0xb73360, 0xf3e900}, 0xc000312ea0, {0xb77540, 0xc000486e80}, 0xffffffffffffffff, {0xc00061e350, 0x1, 0x1})
    • github.com/rclone/rclone/fstest/fstests/fstests.go:293 +0x2f5 fp=0xc00001fdb0 sp=0xc00001fce0 pc=0x925895
  • github.com/rclone/rclone/fstest/fstests.Run.func13.14.36(0xc000312ea0)
    • github.com/rclone/rclone/fstest/fstests/fstests.go:1802 +0x310 fp=0xc00001ff70 sp=0xc00001fdb0 pc=0x933db0
  • testing.tRunner(0xc000312ea0, 0xc000486e00)
    • go1.23/src/testing/testing.go:1690 +0xf4 fp=0xc00001ffc0 sp=0xc00001ff70 pc=0x526174
  • testing.(*T).Run.gowrap1()
    • go1.23/src/testing/testing.go:1743 +0x25 fp=0xc00001ffe0 sp=0xc00001ffc0 pc=0x527165
  • runtime.goexit({})
    • go1.23/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00001ffe8 sp=0xc00001ffe0 pc=0x47bf21
  • created by testing.(*T).Run in goroutine 1096
    • go1.23/src/testing/testing.go:1743 +0x390

Backtrace 3

  • goroutine 1676 gp=0xc000160e00 m=nil [select]:
  • runtime.gopark(0xc000416cf0?, 0x2?, 0x98?, 0x6b?, 0xc000416cb4?)
    • go1.23/src/runtime/proc.go:424 +0xce fp=0xc000416b50 sp=0xc000416b30 pc=0x473b6e
  • runtime.selectgo(0xc000416cf0, 0xc000416cb0, 0xc0001ef470?, 0x1, 0x40f905?, 0x1)
    • go1.23/src/runtime/select.go:335 +0x7a5 fp=0xc000416c78 sp=0xc000416b50 pc=0x44f685
  • io.(*pipe).write(0xc000112f00, {0xc0003f0000, 0x5f, 0x8000})
    • go1.23/src/io/pipe.go:86 +0x1e5 fp=0xc000416d50 sp=0xc000416c78 pc=0x4ab185
  • io.(*PipeWriter).Write(0xc00038e050?, {0xc0003f0000?, 0xc0003f0000?, 0x8000?})
    • go1.23/src/io/pipe.go:161 +0x1a fp=0xc000416d80 sp=0xc000416d50 pc=0x4ab59a
  • github.com/pkg/sftp.(*File).writeToSequential(0xc00038e050, {0xb6e140, 0xc000112f00})
  • github.com/pkg/sftp.(*File).WriteTo(0xc00038e050, {0xb6e140, 0xc000112f00})
  • github.com/rclone/rclone/backend/sftp.(*Fs).newObjectReader.func1()
    • github.com/rclone/rclone/backend/sftp/sftp.go:2085 +0x30 fp=0xc000416fe0 sp=0xc000416fa8 pc=0x8cac10
  • runtime.goexit({})
    • go1.23/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000416fe8 sp=0xc000416fe0 pc=0x47bf21
  • created by github.com/rclone/rclone/backend/sftp.(*Fs).newObjectReader in goroutine 1762
    • github.com/rclone/rclone/backend/sftp/sftp.go:2082 +0x199

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions