I am fetching a list of items from an API endpoint. Then for each item I make another API request to get data about the individual item.
I can't make the second API request for every item concurrently, because my API token has a rate limit and I'll get throttled if I make too many requests at the same time.
However the initial API response data can be split into pages, which allows me to process pages of data concurrently.
After doing some research, the code below does exactly what I want:
func main() {
// pretend paginated results from initial API request
page1 := []int{1, 2, 3}
page2 := []int{4, 5, 6}
page3 := []int{7, 8, 9}
pages := [][]int{page1, page2, page3}
results := make(chan string)
var wg sync.WaitGroup
for i := range pages {
wg.Add(1)
go func(i int) {
defer wg.Done()
for j := range pages[i] {
// simulate making additional API request and building the report
time.Sleep(500 * time.Millisecond)
result := fmt.Sprintf("Finished creating report for %d", pages[i][j])
results <- result
}
}(i)
}
go func() {
wg.Wait()
close(results)
}()
for result := range results {
fmt.Println(result)
}
}
I'd like to understand why this is what makes it work:
go func() {
wg.Wait()
close(results)
}()
My first try didn't work -- I thought I could range over the channel after wg.Wait()
and I'd read the results as they were written to the results
channel.
func main() {
// pretend paginated results from initial API request
page1 := []int{1, 2, 3}
page2 := []int{4, 5, 6}
page3 := []int{7, 8, 9}
pages := [][]int{page1, page2, page3}
results := make(chan string)
var wg sync.WaitGroup
for i := range pages {
wg.Add(1)
go func(i int) {
defer wg.Done()
for j := range pages[i] {
// simulate making additional API request and building the report
time.Sleep(500 * time.Millisecond)
result := fmt.Sprintf("Finished creating report for %d", pages[i][j])
results <- result
}
}(i)
}
// does not work
wg.Wait()
close(results)
for result := range results {
fmt.Println(result)
}
}
In your first attempt:
In your second attempt: