We have many servers and many teams. What architecture would let teams submit their jobs through a web page? Most of the jobs will probably be Python jobs. When submitting a job, a team should be able to choose the resources it needs (how much memory, how much CPU, and how many GPUs). The web page then shows the results to them (or stores them in a designated location), so they never need to log on to a server.
Our current environment includes Linux, Kubernetes, and Docker. What software architecture would meet the requirements above?
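To make the requirement concrete, here is a rough sketch of what a submission from the web form might carry and how the backend could reject bad requests before anything is scheduled. All field names and the per-job ceilings are assumptions made up for illustration, not part of any existing system.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical per-job ceilings; real values would come from cluster policy.
MAX_CPU_CORES = 32
MAX_MEMORY_GIB = 256
MAX_GPUS = 8


@dataclass(frozen=True)
class JobRequest:
    """What a team fills in on the web form (illustrative field names)."""
    team: str
    script_path: str       # Python entry point of the job
    cpu_cores: float
    memory_gib: int
    gpus: int = 0
    output_dir: str = ""   # empty means "only show results on the web page"


def validate(req: JobRequest) -> List[str]:
    """Return a list of problems; an empty list means the request is acceptable."""
    errors = []
    if not req.team:
        errors.append("team is required")
    if not req.script_path.endswith(".py"):
        errors.append("script_path must point to a .py file")
    if not 0 < req.cpu_cores <= MAX_CPU_CORES:
        errors.append(f"cpu_cores must be in (0, {MAX_CPU_CORES}]")
    if not 0 < req.memory_gib <= MAX_MEMORY_GIB:
        errors.append(f"memory_gib must be in (0, {MAX_MEMORY_GIB}]")
    if not 0 <= req.gpus <= MAX_GPUS:
        errors.append(f"gpus must be in [0, {MAX_GPUS}]")
    return errors
```

The web backend would call `validate` on each submission and only hand the request to the scheduler when the returned list is empty.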
Should we run each job as a container and limit that container's resources?
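Since Kubernetes is already available, one way the container-per-job idea could look is a Kubernetes Job whose container carries resource requests and limits. The sketch below only builds the manifest as a Python dictionary; the function name, job name, and image are hypothetical, and the `nvidia.com/gpu` resource name assumes the NVIDIA device plugin is installed on the GPU nodes.

```python
import json


def build_job_manifest(name, image, script, cpu, memory, gpus=0):
    """Build a Kubernetes batch/v1 Job manifest with per-container resource limits."""
    limits = {"cpu": cpu, "memory": memory}
    if gpus > 0:
        # Assumption: GPUs are exposed through the NVIDIA device plugin.
        limits["nvidia.com/gpu"] = gpus
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed job automatically
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "job",
                        "image": image,
                        "command": ["python", script],
                        # Requests equal to limits, so the job gets exactly what it asked for.
                        "resources": {"requests": dict(limits), "limits": limits},
                    }],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        "team-a-train", "python:3.11-slim", "train.py", cpu="4", memory="16Gi", gpus=1
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and submitted with `kubectl apply -f`, or the same dictionary could be sent to the cluster by the web backend through a Kubernetes client library.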
Scheduled jobs could be configured in Jenkins (or triggered manually), so that a container starts only when it is needed and does not hold resources while idle.
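As a point of comparison with the Jenkins approach, scheduling could also be handled inside Kubernetes itself with a CronJob, which creates a fresh Job (and container) on each tick. This is a swapped-in alternative, not a Jenkins configuration. The sketch below is a minimal example with a hypothetical function name, job name, and image.

```python
def build_cronjob_manifest(name, schedule, job_spec):
    """Wrap a Job spec in a Kubernetes batch/v1 CronJob that runs on a cron schedule."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,           # standard five-field cron expression
            "concurrencyPolicy": "Forbid",  # skip a run if the previous one is still going
            "jobTemplate": {"spec": job_spec},
        },
    }


# Minimal Job spec for illustration; resource limits would go on the container.
example_job_spec = {
    "template": {
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": "job",
                "image": "python:3.11-slim",
                "command": ["python", "report.py"],
            }],
        }
    }
}

nightly = build_cronjob_manifest("team-a-nightly-report", "0 2 * * *", example_job_spec)
```

With this shape nothing runs between ticks, which matches the goal of not occupying resources when no job is due.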
Each run's output goes to the container log, which could be printed to the Jenkins console or collected with ELK.