Let’s control the number of concurrent goroutines

  Tags: backend, concurrency, golang, php


Original address: Let’s control the number of concurrent goroutines

Problem

package main

import (
    "fmt"
    "math"
    "time"
)

func main() {
    userCount := math.MaxInt64
    for i := 0; i < userCount; i++ {
        go func(i int) {
            // do all kinds of business logic here
            fmt.Printf("go func: %d\n", i)
            time.Sleep(time.Second)
        }(i)
    }
}

Here, suppose userCount is an externally passed-in parameter (unpredictable, and possibly a very large value), and someone throws it all straight into the loop, thinking that having a goroutine handle every item at the same time will be more efficient. Right?

So, do you think there is any problem here?

The nightmare begins

Of course, under the specific scenario here the problem is big, because the value thrown in for this article is an extreme one, so the concurrency is extreme too. Let’s look at the monitoring metrics below together and see just how badly things “collapse”. The following comes from running the code above:

Output results

...
go func: 5839
go func: 5840
go func: 5841
go func: 5842
go func: 5915
go func: 5524
go func: 5916
go func: 8209
go func: 8264
signal: killed

If you run the code yourself, you will hit the following problems, reflected in the “output results”:

  • System resource usage keeps rising.
  • After a certain amount of output, the console stops refreshing with the latest values.
  • A signal kills the process: signal: killed

System load

[image: system load]

CPU

[image: CPU usage]

In a short period of time, the system load increases sharply.

Virtual memory

[image: virtual memory]

The amount of virtual memory occupied explodes in a short period of time.

top

PID    COMMAND      %CPU  TIME     #TH   #WQ  #PORT MEM    PURG   CMPRS  PGRP  PPID  STATE    BOOSTS
...
73414  test         100.2 01:59.50 9/1   0    18    6801M+ 0B     114G+  73403 73403 running  *0[1]

Summary

If you look carefully at the monitoring charts, you can tell that I actually ran the program twice, at an interval, and that the system utilization was very high both times. Once the process was killed, everything returned to normal.

Now, back to the theme: what happens if we do not control the number of concurrent goroutines? Roughly the following:

  • CPU usage fluctuates upward.
  • Memory usage keeps climbing. You can also look at CMPRS, the number of bytes of compressed data for the process: it reached 114G+.
  • The main process crashed (was killed)

Simply put, the cause of the “crash” is that too many system resources are taken up. Common examples include too many open files, excessive memory usage, and so on.

Harm

It has a big impact on the server, affecting both the application itself and the applications associated with it, and is likely to cause unavailability or slow responses. On top of that, a pile of “out-of-control” goroutines has been started, which throws the program flow into chaos.

Solution

I spent a lot of space above portraying the “serious” problems that occur when a huge number of goroutines run concurrently. Now let’s think about the solutions together. Roughly, they are:

  1. Control/limit the number of goroutines running concurrently at the same time
  2. Change the application’s logic (avoid large-scale use of system resources and waiting)
  3. Raise the server’s hardware configuration, maximum number of open files, memory and other thresholds

Controlling the number of concurrent goroutines

Next, we will officially start solving this problem. I hope you read carefully and think along the way, because this problem really is common in actual projects.

The problem has been laid out; what you need to do now is think about what you could do to solve it. I suggest you first work out a technical approach of your own, and then read on :-)

Try chan

func main() {
    userCount := 10
    ch := make(chan bool, 2)
    for i := 0; i < userCount; i++ {
        ch <- true
        go Read(ch, i)
    }
    
    //time.Sleep(time.Second)
}

func Read(ch chan bool, i int) {
    fmt.Printf("go func: %d\n", i)
    <- ch
}

Output results:

go func: 1
go func: 2
go func: 3
go func: 4
go func: 5
go func: 6
go func: 7
go func: 8
go func: 0

Well, we seem to have nicely controlled the goroutines so that they run two at a time. But a problem has appeared: count the output carefully, there are only 9 values?

That is obviously wrong. The reason is that when the main goroutine ends, its child goroutines are terminated with it, so the remaining goroutine was killed on the way before it could print its value. (Uncomment the time.Sleep and check the output count again.)
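
Incidentally, as an aside of my own rather than the article’s next step: instead of re-enabling time.Sleep, main could also wait by reacquiring every slot of the channel before exiting. Each of those sends can only complete after a running Read has released its slot, so once they all succeed, every value has been printed. A minimal sketch:

package main

import "fmt"

func main() {
    userCount := 10
    ch := make(chan bool, 2)
    for i := 0; i < userCount; i++ {
        ch <- true // acquire a slot before starting the goroutine
        go Read(ch, i)
    }

    // Instead of sleeping, refill the channel to capacity. Each send here
    // can only complete after a running Read has done its `<-ch`, so when
    // this loop finishes, every Read has already printed its value.
    for i := 0; i < cap(ch); i++ {
        ch <- true
    }
}

func Read(ch chan bool, i int) {
    fmt.Printf("go func: %d\n", i)
    <-ch
}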

Try sync

...
var wg = sync.WaitGroup{}

func main() {
    userCount := 10
    for i := 0; i < userCount; i++ {
        wg.Add(1)
        go Read(i)
    }

    wg.Wait()
}

func Read(i int) {
    defer wg.Done()
    fmt.Printf("go func: %d\n", i)
}

Well, using sync.WaitGroup alone will not do either: the number of concurrently running goroutines is not limited at all.

Summary

Simply using a channel or sync on its own has obvious defects and cannot get the job done. Let’s see whether combining the two components works.

Try chan+sync

...
var wg = sync.WaitGroup{}

func main() {
    userCount := 10
    ch := make(chan bool, 2)
    for i := 0; i < userCount; i++ {
        wg.Add(1)
        go Read(ch, i)
    }

    wg.Wait()
}

func Read(ch chan bool, i int) {
    defer wg.Done()

    ch <- true
    fmt.Printf("go func: %d, time: %d\n", i, time.Now().Unix())
    time.Sleep(time.Second)
    <-ch
}

Output results:

go func: 9, time: 1547911938
go func: 1, time: 1547911938
go func: 6, time: 1547911939
go func: 7, time: 1547911939
go func: 8, time: 1547911940
go func: 0, time: 1547911940
go func: 3, time: 1547911941
go func: 2, time: 1547911941
go func: 4, time: 1547911942
go func: 5, time: 1547911942

Judging from the output, the goroutines really do execute our “business logic” two at a time. Of course, the results are printed out of order.

Scheme 1: Simple Semaphore

Having established that simply combining chan+sync is feasible, we repackage that flow logic as gsema, and the main program becomes the following:

import (
    "fmt"
    "time"

    "github.com/EDDYCJY/gsema"
)

var sema = gsema.NewSemaphore(3)

func main() {
    userCount := 10
    for i := 0; i < userCount; i++ {
        go Read(i)
    }

    sema.Wait()
}

func Read(i int) {
    defer sema.Done()
    sema.Add(1)

    fmt.Printf("go func: %d, time: %d\n", i, time.Now().Unix())
    time.Sleep(time.Second)
}
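
If you would rather not pull in a dependency, here is a rough idea of what this kind of semaphore might look like. This is a simplified sketch of my own that just packages the chan+sync experiment above behind Add/Done/Wait methods; it is not gsema’s actual source (see github.com/EDDYCJY/gsema for that), and note that here Add is called before the goroutine is started, which differs slightly from the gsema example above.

package main

import (
    "fmt"
    "sync"
    "time"
)

// Semaphore is a hand-rolled stand-in: a buffered channel caps the
// concurrency, and a WaitGroup lets the caller wait for completion.
type Semaphore struct {
    ch chan struct{}
    wg sync.WaitGroup
}

func NewSemaphore(n int) *Semaphore {
    return &Semaphore{ch: make(chan struct{}, n)}
}

// Add acquires delta slots, blocking while the concurrency limit is reached.
func (s *Semaphore) Add(delta int) {
    for i := 0; i < delta; i++ {
        s.ch <- struct{}{}
    }
    s.wg.Add(delta)
}

// Done releases one slot and marks one task as finished.
func (s *Semaphore) Done() {
    <-s.ch
    s.wg.Done()
}

// Wait blocks until every Add has been matched by a Done.
func (s *Semaphore) Wait() { s.wg.Wait() }

func main() {
    sema := NewSemaphore(3)
    for i := 0; i < 10; i++ {
        sema.Add(1) // acquire in main, before starting the goroutine
        go func(i int) {
            defer sema.Done()
            fmt.Printf("go func: %d, time: %d\n", i, time.Now().Unix())
            time.Sleep(time.Second)
        }(i)
    }
    sema.Wait()
}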

Analyzing the scheme

In the above code, the program execution flow is as follows:

  • Set the number of goroutines allowed to run concurrently to 3
  • Loop 10 times, starting one goroutine at a time to perform the task
  • Inside every goroutine, sema decides whether it should block or proceed
  • Goroutines are released gradually according to the allowed concurrency, and eventually all the tasks finish

It looks quite decent, and there is no serious problem. However, there is one “big” pitfall: look carefully at the second point, “starting one goroutine at a time”. Is there any problem with creating that many goroutines up front? Next, let’s weigh the pros and cons:

Pros

  • Suitable for scenarios where the quantity is not large and the complexity is low

    • Hundreds of thousands of goroutines are also acceptable (it depends on the specific business scenario)
    • The actual business logic is already blocked and waiting before it runs (because the concurrency is limited), so the performance spent on the business logic itself is generally far greater than the cost of the goroutines
    • Goroutines themselves are very lightweight, costing only a tiny amount of memory and scheduling; a waiting goroutine simply lies idle until its task wakes it up
  • The semaphore operations are low in complexity, the flow is simple, and it is easy to control

Cons

  • Not suitable for scenarios with a large quantity and high complexity

    • With millions or tens of millions of goroutines, a great deal of goroutine scheduling and memory space is wasted, and your server may simply not be able to accept it
  • The semaphore operations become relatively more complex, and more state has to be managed

Summary

  • Decide which scheme to use based on the business scenario at hand
  • If you have enough time, pursue a better, more refined solution (using a third-party library is fine too)

I think it is enough to weigh things mainly against the two points above. There is no absolute right or wrong; what matters is whether the current business scenario accepts it, and whether your system can accept this number of goroutines started up front.

Of course, common/simple Go applications can basically solve the problem with this kind of scheme, because situations like the “problem” at the start of this article, with such extreme values, are actually rare; most applications do not have those “particularities”. So this scheme is basically fine.

Flexibly controlling the number of concurrent goroutines

Immediately afterwards, Lao Wang next door discovered a new problem. In “Scheme 1”, input and output are coupled together, which in common business scenarios is indeed workable.

However, this new business scenario is rather special, and it requires controlling the input so that the number of goroutines allowed to run concurrently can be changed. Let’s think it through carefully and make the following changes:

  • Input and output can only be controlled separately once they are decoupled
  • The input should be variable and fed in from a for-loop (where the values can be set as needed)
  • The number of concurrently running goroutines is allowed to change, but there must still be a maximum value (because “allowed to change” is relative)

Scheme 2: Flexible chan+sync

package main

import (
    "fmt"
    "sync"
    "time"
)

var wg sync.WaitGroup

func main() {
    userCount := 10
    ch := make(chan int, 5)
    for i := 0; i < userCount; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for d := range ch {
                fmt.Printf("go func: %d, time: %d\n", d, time.Now().Unix())
                time.Sleep(time.Second * time.Duration(d))
            }
        }()
    }

    for i := 0; i < 10; i++ {
        ch <- 1
        ch <- 2
        //time.Sleep(time.Second)
    }

    close(ch)
    wg.Wait()
}

Output results:

...
go func: 1, time: 1547950567
go func: 3, time: 1547950567
go func: 1, time: 1547950567
go func: 2, time: 1547950567
go func: 2, time: 1547950567
go func: 3, time: 1547950567
go func: 1, time: 1547950568
go func: 2, time: 1547950568
go func: 3, time: 1547950568
go func: 1, time: 1547950568
go func: 3, time: 1547950569
go func: 2, time: 1547950569

In “Scheme 2”, we can do the following at any time, according to new business requirements:

  • Change the number of values fed into the channel
  • Change the values sent on the channel in the loop, according to special circumstances
  • Change the maximum number of goroutines allowed to run concurrently

In general, as much room for control as possible has been opened up. Isn’t that more flexible? :-)
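
To make those knobs explicit, here is a variant of Scheme 2 with the tunable values pulled out as parameters (the startWorkers helper and the variable names are my own, purely for illustration):

package main

import (
    "fmt"
    "sync"
    "time"
)

// startWorkers launches a fixed number of consumer goroutines; that number
// is the maximum concurrency, independent of how much input is produced.
func startWorkers(workers int, ch <-chan int, wg *sync.WaitGroup) {
    for i := 0; i < workers; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for d := range ch {
                fmt.Printf("go func: %d, time: %d\n", d, time.Now().Unix())
                time.Sleep(time.Second * time.Duration(d))
            }
        }()
    }
}

func main() {
    maxConcurrency := 5 // at most this many tasks run at once
    buffer := 5         // how far the producer may run ahead of the consumers
    ch := make(chan int, buffer)

    var wg sync.WaitGroup
    startWorkers(maxConcurrency, ch, &wg)

    for i := 0; i < 10; i++ { // the input side can be changed freely
        ch <- 1
        ch <- 2
    }

    close(ch)
    wg.Wait()
}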

Scheme 3: Third-party libraries

There are also many mature third-party libraries out there, most of them pool tools for creating and managing goroutines. I made a simple list of a few; I particularly suggest that you read their source code, or go search for more yourself, since the principle is much the same.
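
As an illustration of that shared principle only, here is a toy pool: a fixed set of worker goroutines pulling submitted tasks off a channel. The Pool type and its Submit/Release methods below are my own sketch, not any particular library’s API; real libraries add goroutine reuse, dynamic resizing, panic recovery, timeouts and so on.

package main

import (
    "fmt"
    "sync"
)

// Pool runs submitted tasks on a fixed number of worker goroutines.
type Pool struct {
    tasks chan func()
    wg    sync.WaitGroup
}

// NewPool starts size workers; size is therefore the maximum concurrency.
func NewPool(size int) *Pool {
    p := &Pool{tasks: make(chan func())}
    p.wg.Add(size)
    for i := 0; i < size; i++ {
        go func() {
            defer p.wg.Done()
            for task := range p.tasks {
                task()
            }
        }()
    }
    return p
}

// Submit hands a task to the pool, blocking while all workers are busy.
func (p *Pool) Submit(task func()) { p.tasks <- task }

// Release stops accepting tasks and waits for the workers to drain and exit.
func (p *Pool) Release() {
    close(p.tasks)
    p.wg.Wait()
}

func main() {
    pool := NewPool(3) // at most 3 tasks run concurrently
    for i := 0; i < 10; i++ {
        i := i
        pool.Submit(func() { fmt.Printf("go func: %d\n", i) })
    }
    pool.Release()
}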

Summary

At the start of this article, I went to great lengths (with an extreme quantity) to show you that too many goroutines running at the same time makes the system consume more and more resources, until, in the extreme case, the service collapses. I hope it leaves a deep impression, so that you can avoid this kind of problem in the future.

Then we analyzed the theme of “controlling the number of concurrent goroutines” and presented three schemes. In my opinion, each has its own pros and cons. I suggest you select the technical scheme appropriate for your own scenario and go with that.

After all, many different kinds of technical solutions can solve this problem too, and every team does things its own way. This article recommends the more common ones, and you are welcome to keep adding more in the comment section :-)