I am trying to create a simple matchmaking server that pairs two clients with each other. Here's the structure:
.
├── bootstrap-server
├── cert.pem.local
├── client.go
├── config.go
├── go.mod
├── go.sum
├── key.pem.local
└── main.go
1 directory, 8 files
main.go
package main
import (
"context"
"log"
"net/http"
"time"
_ "github.com/joho/godotenv/autoload"
)
var queue = NewClientQueue()
func main() {
cfg := newConfigFromEnv()
mux := http.NewServeMux()
mux.HandleFunc("GET /", handleFindPeer)
log.Println("Bootstrap server running on", cfg.serverAddr)
log.Fatal(http.ListenAndServeTLS(cfg.serverAddr, cfg.tlsCert, cfg.tlsKey, mux))
}
func handleFindPeer(w http.ResponseWriter, r *http.Request) {
const pairTimeout = time.Minute
client := Client{
addr: r.RemoteAddr,
respWriter: w,
peerMatchedCh: make(chan struct{}),
}
// Use pairTimeout (one minute) as the timeout for finding a peer
ctx, cancel := context.WithTimeout(r.Context(), pairTimeout)
defer cancel()
go func() {
if queue.Len() > 0 {
peer := queue.Dequeue()
sendPairingResponse(client, peer)
close(client.peerMatchedCh)
close(peer.peerMatchedCh)
log.Printf("Paired %v and %v\n", client.addr, peer.addr)
} else {
queue.Enqueue(client)
log.Printf("Added %v to waiting queue\n", client.addr)
}
}()
select {
case <-client.peerMatchedCh:
return
case <-ctx.Done():
http.Error(w, "timeout waiting for peer", http.StatusRequestTimeout)
}
}
func sendPairingResponse(c1, c2 Client) {
c1.respWriter.WriteHeader(http.StatusOK)
c2.respWriter.WriteHeader(http.StatusOK)
c1.respWriter.Write([]byte(c2.addr))
c2.respWriter.Write([]byte(c1.addr))
}
client.go
package main
import (
"net/http"
"sync"
)
type Client struct {
addr string
respWriter http.ResponseWriter
peerMatchedCh chan struct{}
}
type ClientQueue struct {
clients []Client
mu sync.Mutex
}
func NewClientQueue() *ClientQueue {
return &ClientQueue{clients: []Client{}, mu: sync.Mutex{}}
}
func (q *ClientQueue) Enqueue(c Client) {
q.mu.Lock()
defer q.mu.Unlock()
q.clients = append(q.clients, c)
}
func (q *ClientQueue) Dequeue() Client {
q.mu.Lock()
defer q.mu.Unlock()
client := q.clients[0] // <-- panicking here
q.clients = q.clients[1:]
return client
}
func (q *ClientQueue) Remove(client Client) {
q.mu.Lock()
defer q.mu.Unlock()
for i, c := range q.clients {
if c.addr == client.addr {
q.clients = append(q.clients[:i], q.clients[i+1:]...)
return
}
}
}
func (q *ClientQueue) Len() int {
q.mu.Lock()
defer q.mu.Unlock()
return len(q.clients)
}
I am opening lots of requests using curl
while true; do curl --cacert cert.pem.local https://localhost:3030 & ; done
I am running the server with go run .
and after a few seconds, I get an index out of range panic
....
2024/10/24 13:54:41 Paired [::1]:57753 and [::1]:57755
2024/10/24 13:54:41 Added 127.0.0.1:57759 to waiting queue
2024/10/24 13:54:41 Paired 127.0.0.1:57760 and 127.0.0.1:57759
2024/10/24 13:54:41 Paired [::1]:57820 and 127.0.0.1:57756
2024/10/24 13:54:41 Added [::1]:57792 to waiting queue
panic: runtime error: index out of range [0] with length 0
goroutine 8864 [running]:
main.(*ClientQueue).Dequeue(0x104833f00?)
/Users/lufy/Developer/pg/go/p2p/bootstrap-server/client.go:32 +0x1a0
main.handleFindPeer.func1()
/Users/lufy/Developer/pg/go/p2p/bootstrap-server/main.go:38 +0x44
created by main.handleFindPeer in goroutine 9330
/Users/lufy/Developer/pg/go/p2p/bootstrap-server/main.go:36 +0x118
exit status 2
I also get this panic
....
2024/10/24 14:07:47 Added [::1]:60624 to waiting queue
2024/10/24 14:07:47 http: TLS handshake error from 127.0.0.1:60680: EOF
panic: WriteHeader called after Handler finished
goroutine 4934 [running]:
net/http.(*http2responseWriter).WriteHeader(0x140000fe380?, 0x104a51620?)
/usr/local/go/src/net/http/h2_bundle.go:6773 +0x44
main.sendPairingResponse({{0x1400032eeb5, 0xb}, {0x104b55bd8, 0x14000502010}, 0x14000100540}, {{0x140003c8585, 0xb}, {0x104b55bd8, 0x140000a8040}, 0x140000fe380})
/Users/lufy/Developer/pg/go/p2p/bootstrap-server/main.go:58 +0x54
main.handleFindPeer.func1()
/Users/lufy/Developer/pg/go/p2p/bootstrap-server/main.go:39 +0xb0
created by main.handleFindPeer in goroutine 4933
/Users/lufy/Developer/pg/go/p2p/bootstrap-server/main.go:36 +0x118
exit status 2
I have no idea how these panics are happening. Please help
There's a plain and simple logical race between calls to Len(), Dequeue() and Enqueue(): once the mutex in Len() is unlocked, and before it's locked again in Enqueue() or Dequeue() (whichever was chosen based on the length of the slice while the mutex was held), any number of other calls to Enqueue() and Dequeue() can happen, because they are executed in goroutines running concurrently.
The fix is to have a single call, such as DequeueOrEnqueue(), which 1) locks the mutex; 2) tests the queue length; 3) either enqueues or dequeues based on that test; and then 4) unlocks the mutex.
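The four steps above could be sketched roughly as follows. This is only an illustration under assumptions: Client is trimmed to its addr field, and the return signature of DequeueOrEnqueue() (a peer plus a bool) is my own choice, not something taken from your code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Client is trimmed to the single field this sketch needs.
type Client struct {
	addr string
}

type ClientQueue struct {
	mu      sync.Mutex
	clients []Client
}

// DequeueOrEnqueue tests the queue length and acts on the result while
// holding the mutex the whole time, so no other goroutine can change the
// queue between the test and the action.
// It returns the waiting peer and true if one was available; otherwise it
// enqueues c and returns false. (Hypothetical signature.)
func (q *ClientQueue) DequeueOrEnqueue(c Client) (Client, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.clients) == 0 {
		q.clients = append(q.clients, c)
		return Client{}, false
	}
	peer := q.clients[0]
	q.clients = q.clients[1:]
	return peer, true
}

func (q *ClientQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.clients)
}

func main() {
	q := &ClientQueue{}
	var wg sync.WaitGroup
	var pairs int32

	// 100 concurrent callers: each one either waits or gets paired.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c := Client{addr: fmt.Sprintf("client-%d", i)}
			if _, ok := q.DequeueOrEnqueue(c); ok {
				atomic.AddInt32(&pairs, 1)
			}
		}(i)
	}
	wg.Wait()
	fmt.Printf("pairs: %d, still waiting: %d\n", pairs, q.Len())
	// prints "pairs: 50, still waiting: 0"
}
```

Because every call either adds one client to an empty queue or removes one from a non-empty queue, the queue length can only alternate between 0 and 1, and Dequeue() on an empty slice can no longer happen.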
I would suggest a simple mental technique for reasoning about code like yours: make yourself think that all the code runs very, very slowly, and that it even pauses for indeterminate amounts of time at line breaks and curly braces.
Taking your particular case, suppose the program pauses for, say, 5 seconds once it has evaluated the if condition. That is, it knows which branch it's going to take, but the executing goroutine is preempted for whatever reason and just sits there.
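To make that pause concrete, here is a small sketch in which the "pause" is modelled with a barrier: no goroutine acts on its Len() check until every goroutine has made it. The queue type, TryDequeue() and staleChecks() are hypothetical names made up for this demonstration. TryDequeue() reports false on an empty queue where your Dequeue() would panic, so that the program can finish and print a result.

```go
package main

import (
	"fmt"
	"sync"
)

type queue struct {
	mu    sync.Mutex
	items []string
}

func (q *queue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.items)
}

// TryDequeue returns false on an empty queue instead of panicking.
func (q *queue) TryDequeue() (string, bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.items) == 0 {
		return "", false
	}
	item := q.items[0]
	q.items = q.items[1:]
	return item, true
}

// staleChecks starts n goroutines that check Len() first and act later.
// It returns how many goroutines passed the check and how many of those
// actually received an item.
func staleChecks(q *queue, n int) (passed, served int) {
	var checked, done sync.WaitGroup
	checked.Add(n)
	done.Add(n)
	var mu sync.Mutex // protects passed and served

	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			nonEmpty := q.Len() > 0 // the decision is taken here...
			checked.Done()
			checked.Wait() // ...then everybody "pauses" until all have decided
			if !nonEmpty {
				return
			}
			_, got := q.TryDequeue() // acting on a possibly stale decision
			mu.Lock()
			passed++
			if got {
				served++
			}
			mu.Unlock()
		}()
	}
	done.Wait()
	return passed, served
}

func main() {
	q := &queue{items: []string{"only-waiting-client"}}
	passed, served := staleChecks(q, 2)
	fmt.Printf("passed the Len() check: %d, actually dequeued: %d\n", passed, served)
	// prints "passed the Len() check: 2, actually dequeued: 1"
}
```

Both goroutines see a queue of length 1, but only one item exists. With a Dequeue() that indexes the slice unconditionally, the goroutine that arrives second is the one that hits "index out of range [0] with length 0".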
This may sound silly but in production code two things do happen, which basically lead to the same observable results:
curl requests can be handled strictly in parallel (that's an oversimplification, but let's not digress).