In the era of AI and machine learning, businesses are tempted to buy ever more hardware to train new models and keep pace with the field. This can be particularly challenging for small and medium-sized companies, as hardware for AI applications is often very expensive.
In many cases, however, it is not necessary to buy new hardware. Companies often have hundreds of computers for their employees, most of which are only used during the day. These resources can be used at night to train AI models.
However, these resources are not easy to access and use. This is where the pyremto package comes into play. With pyremto, you can create AI jobs that the various workers then process at night. This happens adaptively and automatically, so you don't have to plan the use of resources in advance. You can deploy the worker software via your software management system and thus make unused resources available for an important purpose.
For which tasks can I use existing hardware in my company?
Companies often have a large pool of computers. Together, these provide many computing cores that can be used for computationally intensive tasks. Because the setup is decentralized, however, the memory available for a model is limited to what a single computer offers. Very large models, such as large language models, will be difficult to train in this way.
However, most models require relatively little memory. This applies, for example, to models from the fields of image classification, object detection, and regression. Training them mainly requires a lot of computing power in the form of processor cores. These tasks are therefore ideally suited to utilizing unused resources in the company.
To summarize, it is particularly worthwhile to use your unused resources if your model fits into the RAM of a normal computer and the training is very computationally intensive. This is the case, for example, if you have a lot of training data and therefore have to perform many training steps. Then it is worth switching to a federated learning architecture and including all the resources that have gone unused until now.
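The "fits into RAM" criterion can be checked with a back-of-the-envelope estimate. The following is a minimal sketch, not part of pyremto: the function names, the factor of four copies per parameter (weights, gradients, and two optimizer moments, as with an Adam-style optimizer), and the 50% usable-RAM share are all assumptions chosen for illustration. Activations and data batches are not counted, so treat the result as a lower bound.

```python
def training_memory_bytes(num_parameters, bytes_per_value=4, copies=4):
    """Rough training footprint for weights, gradients and optimizer state.

    Assumption: copies=4 models an Adam-style optimizer (weights, gradients,
    two moment estimates). Activations and data batches are NOT included.
    """
    return num_parameters * bytes_per_value * copies


def fits_on_worker(num_parameters, worker_ram_bytes, usable_fraction=0.5):
    """True if the estimate fits in the share of RAM the job may use."""
    budget = worker_ram_bytes * usable_fraction
    return training_memory_bytes(num_parameters) <= budget


GIB = 1024 ** 3

# About 25 million parameters, roughly the size of a ResNet-50 classifier.
print(fits_on_worker(25_000_000, 16 * GIB))      # True

# 7 billion parameters, in the range of a large language model.
print(fits_on_worker(7_000_000_000, 16 * GIB))   # False
```

On a 16 GiB office computer, the image classifier needs well under 1 GiB by this estimate, while the language model exceeds the budget by more than a factor of ten.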
How does that work?
To make use of all your resources, you need to change your training scheme from centralized, server-based training to a federated learning architecture. With federated learning, you deploy many learning workers, which optimize the overall model in parallel. You can use the pyremto package to orchestrate the federated learning process.
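To make the idea of workers optimizing one model in parallel concrete, here is a minimal sketch of federated averaging (FedAvg), one common federated learning scheme. It is written in plain Python and does not use the pyremto API; the function names, the toy linear model, and the data shards are assumptions made for illustration. Each simulated worker takes one gradient step on its own data, and the server merges the results weighted by the number of samples.

```python
def local_update(weights, samples, learning_rate=0.1):
    """One worker's step: gradient descent on y = w * x + b for its own data."""
    w, b = weights
    grad_w = grad_b = 0.0
    for x, y in samples:
        error = (w * x + b) - y
        grad_w += 2 * error * x / len(samples)
        grad_b += 2 * error / len(samples)
    return [w - learning_rate * grad_w, b - learning_rate * grad_b]


def federated_average(worker_weights, sample_counts):
    """Merge worker models, weighting each by how much data it trained on."""
    total = sum(sample_counts)
    merged = [0.0] * len(worker_weights[0])
    for weights, count in zip(worker_weights, sample_counts):
        for i, value in enumerate(weights):
            merged[i] += value * count / total
    return merged


# Hypothetical data split across three workers, all drawn from y = 2x + 1.
shards = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]

global_weights = [0.0, 0.0]
for round_number in range(200):
    # In a real deployment each update would run on a different computer.
    updates = [local_update(global_weights, shard) for shard in shards]
    global_weights = federated_average(updates, [len(s) for s in shards])

print(global_weights)  # approaches [2.0, 1.0]
```

In this sketch the workers run one after another in a loop; in practice they would run on separate machines and only the small weight vectors would travel over the network.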
The overall process consists of the following steps:
1. Define the training steps which have to be performed during each training epoch.
2. Deploy the worker code to all computers in your company that have long idle times (for example during the night).
3. Start the training and progressively obtain a better model by training on all available worker nodes in parallel.
Note that you still need a central server on your intranet that hosts the training data needed by the workers.
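Step 2 relies on workers running only while their computer is idle. How pyremto decides this is not described here; as a simple stand-in, the sketch below checks whether the current time falls inside a fixed nightly window. The function name and the 20:00 to 06:00 window are assumptions for illustration.

```python
from datetime import datetime, time


def in_idle_window(now, start=time(20, 0), end=time(6, 0)):
    """Return True if `now` falls inside the nightly idle window.

    The window may wrap past midnight (for example 20:00 to 06:00).
    """
    current = now.time()
    if start <= end:
        # Window within a single day, e.g. 01:00 to 05:00.
        return start <= current < end
    # Window wraps around midnight.
    return current >= start or current < end


print(in_idle_window(datetime(2024, 1, 15, 23, 30)))  # True
print(in_idle_window(datetime(2024, 1, 15, 12, 0)))   # False
```

A worker could call such a check before fetching the next training job and pause when it returns False, so that employees are not slowed down during the day.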
Use unused resources for your machine learning training tasks
If you are currently deciding whether to buy AI hardware for your company, you should first evaluate whether you can handle the tasks with the hardware you already have. This is possible in many cases and can save a lot of money.
The implementation of this approach depends heavily on your requirements and your setup. The following questions play a role here:
- What is the aim of the training? Based on this question, a suitable federated learning training scheme must be designed.
- Is there an existing software management system? This can be used to distribute the worker software to the computers; otherwise a separate solution must be designed for this.
- Is a central server available for data distribution and the merging of federated learning results? In contrast to a training server, this server requires far less computing power and is therefore much cheaper (it can be one of the existing computers in your network).
Since there is no standard solution for the entire process, please contact us if you are interested in such a solution. We will then be happy to advise you on how you can implement the project in your company.