Peeking under the hood (JNNC Technologies)

If you're not familiar with the architecture of the Go compiler, I suggest checking out my earlier article on Go compiler internals - part 1, part 2.

The concrete syntax tree built by the parser contains the following node for the go statement calling the closure:

0: *syntax.CallStmt {
.  Tok: go
.  Call: *syntax.CallExpr {
.  .  Fun: *syntax.FuncLit {
.  .  .  Type: *syntax.FuncType {
.  .  .  .  ParamList: nil
.  .  .  .  ResultList: nil
.  .  .  }
.  .  .  Body: *syntax.BlockStmt {
.  .  .  .  List: []syntax.Stmt (1 entries) {
.  .  .  .  .  0: *syntax.ExprStmt {
.  .  .  .  .  .  X: *syntax.CallExpr {
.  .  .  .  .  .  .  Fun: foobyval @ go-closure.go:15:4
.  .  .  .  .  .  .  ArgList: []syntax.Expr (1 entries) {
.  .  .  .  .  .  .  .  0: i @ go-closure.go:15:13
.  .  .  .  .  .  .  }
.  .  .  .  .  .  .  HasDots: false
.  .  .  .  .  .  }
.  .  .  .  .  }
.  .  .  .  }
.  .  .  .  Rbrace: syntax.Pos {}
.  .  .  }
.  .  }
.  .  ArgList: nil
.  .  HasDots: false
.  }
}

The function called is represented by a FuncLit node - a function literal. When this tree is converted into the AST, outlining the function literal into a standalone function is one of the outcomes. This is done in the noder.funcLit method that lives in gc/closure.go.

The type checker completes the transformation and we get this AST node for the outlined function representing the closure:

main.func1:
.   DCLFUNC l(14) tc(1) FUNC-func()
.   DCLFUNC-body
.   .   CALLFUNC l(15) tc(1)
.   .   .   NAME-main.foobyval a(true) l(8) x(0) class(PFUNC) tc(1) used FUNC-func(int)
.   .   CALLFUNC-list
.   .   .   NAME-main.i l(15) x(0) class(PAUTOHEAP) tc(1) used int

Note that the value passed into foobyval is NAME-main.i, directly referencing the variable from the function enclosing the closure.

At this point comes the stage of the compiler most relevant to our exploration - capturevars. Its goal is to decide how to capture closed variables (i.e. free variables used in closures). Here's a comment for the relevant function in the compiler, which also describes the heuristic:

// capturevars is called in a separate phase after all typechecking is done.
// It decides whether each variable captured by a closure should be captured
// by value or by reference.
// We use value capturing for values <= 128 bytes that are never reassigned
// after capturing (effectively constant).
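Reduced to its essentials, the rule in that comment could be sketched as follows. This is only an illustration of the heuristic as the comment states it, not the compiler's actual code: the capturedVar type, its fields, and the captureByValue function are hypothetical names I made up for this sketch.

```go
package main

import "fmt"

// capturedVar is a hypothetical stand-in for what the compiler knows about
// a variable captured by a closure. The real compiler uses its own AST nodes.
type capturedVar struct {
	name       string
	size       int64 // size of the variable's type, in bytes
	reassigned bool  // true if the variable is assigned to after being captured
}

// maxByValueSize mirrors the 128-byte limit quoted in the comment above.
const maxByValueSize = 128

// captureByValue reports whether the quoted heuristic would pick value
// capturing: the variable must be small and never reassigned after capture.
func captureByValue(v capturedVar) bool {
	return v.size <= maxByValueSize && !v.reassigned
}

func main() {
	vars := []capturedVar{
		{name: "i", size: 8, reassigned: true},     // loop variable, incremented each iteration
		{name: "ii", size: 8, reassigned: false},   // per-iteration copy, never reassigned
		{name: "big", size: 256, reassigned: false}, // too large to copy into the closure
	}
	for _, v := range vars {
		fmt.Printf("%s: by value = %v\n", v.name, captureByValue(v))
	}
}
```

Under these assumptions only ii qualifies for value capturing; i fails the reassignment test and big fails the size test.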

When it runs on example 5, capturevars marks the loop variable i as captured by reference, and adds an addrtaken flag to it. This is visible in the AST:

FOR l(13) tc(1)
.   LT l(13) tc(1) bool
.   .   NAME-main.i a(true) g(1) l(13) x(0) class(PAUTOHEAP) esc(h) tc(1) addrtaken assigned used int

For the loop variable, the heuristic for value capturing doesn't apply because the value is reassigned after the call (recall the quote from the spec which says that loop vars are re-used between iterations). Therefore, i is captured by reference.

In the variation where we have ii := i before the go statement, ii is not reassigned after the goroutine is launched, so it's captured by value [4].

Therefore, we see a fascinating example of two language features interacting in unexpected ways. Instead of defining a new variable for each iteration, Go reuses the same one. This, in turn, leads the capturevars heuristic to tag it for reference capturing, which produces the unexpected output. The Go FAQ admits this behavior may have been a design mistake:

This behavior of the language, not defining a new variable for each iteration, may have been a mistake in retrospect. It may be addressed in a later version but, for compatibility, cannot change in Go version 1.

Once you're aware of the problem, it shouldn't cause you many problems in realistic scenarios - just be wary of free variables captured by closures. Always assume they can be captured by reference, unless you have explicitly checked that the capture is by value. To avoid mistakes, only read-only values should be left to be implicitly captured by closures invoked in goroutines; this makes sense from a concurrency standpoint as well.

[1] Here and elsewhere in this post, I'm using time.Sleep as a quick and hacky way to wait for all the spawned goroutines to finish. Without this, main will return before the other goroutines start running. The correct way to wait for goroutines in real code would be something like a WaitGroup or a done channel.

[2] The disassembly for all the samples in this post is obtained by calling the Go compiler directly with `go tool compile -l -S`. The `-l` flag disables function inlining, which makes the resulting assembly more readable.

[3] `foobyval` is not called directly because it's invoked in a `go` statement. Rather, we pass its address as the second argument (at `16(SP)`) to `runtime.newproc`, and the arguments to `foobyval` (`i` in this case) follow higher on the stack.

[4] As an exercise, add a dummy `ii = 10` assignment as the very last statement in the `for` body (after the `go` closure call). What is the output? Why?
