Let’s start with the assumption that AI advancements are good for the planet. Setting popular controversy aside, AI has already facilitated more discoveries than we can list here, and it will continue to help solve currently intractable problems and drive significant progress in science.
However, AI has a dark side that troubles researchers: the energy consumed in developing, training, and running AI models has an undeniable environmental impact.
This is where the innovations developed by Pryon scientists come into play: they strive to achieve the highest possible accuracy for Pryon users while minimizing power consumption during both model training and system deployment.
Eco-friendly model training
One such innovation reduces the energy required to train natural language processing (NLP) models that are used for document search and ranking.
Search and ranking models are known to train best in a “list-wise” setting, in which every query is processed together with its positive examples and a large number of negative examples. This requires GPUs with both large memory and substantial compute power. With innovative methods for selecting negative examples, together with stochastic approximations to the objective function, we significantly reduce the memory and compute requirements of such training, as demonstrated in the figure below.
Standard vs. eco-friendly training
The blue curve depicts the standard training procedure, which in this example reaches about 76% accuracy after 20 hours of compute on an 8-way NVIDIA V100 GPU machine. The orange curve depicts Pryon’s novel eco-friendly training algorithm, which reaches 76% accuracy in about 5 hours and eventually exceeds 77% accuracy thanks to better utilization of a larger pool of negative samples.
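To make the general idea concrete, here is a minimal, hypothetical sketch (not Pryon’s actual algorithm) of how stochastically sampling a subset of negatives approximates a list-wise softmax ranking loss while scoring far fewer documents per step, which is where the memory and compute savings come from:

```python
import numpy as np

def listwise_loss(query, positive, negatives):
    """Softmax cross-entropy over [positive, negatives]: the standard
    list-wise ranking objective, scoring every negative example."""
    scores = np.concatenate(([query @ positive], negatives @ query))
    scores -= scores.max()                       # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[0]                         # positive sits at index 0

def sampled_listwise_loss(query, positive, negative_pool, k, rng):
    """Stochastic approximation: score only k sampled negatives per step,
    cutting per-step memory/compute by roughly a factor of |pool| / k."""
    idx = rng.choice(len(negative_pool), size=k, replace=False)
    return listwise_loss(query, positive, negative_pool[idx])

# Toy example with random embeddings (dimensions and pool size are illustrative).
rng = np.random.default_rng(0)
d = 16
query = rng.standard_normal(d)
positive = query + 0.1 * rng.standard_normal(d)  # positive doc resembles the query
pool = rng.standard_normal((1024, d))            # large pool of negatives

full = listwise_loss(query, positive, pool)      # scores all 1024 negatives
approx = np.mean([sampled_listwise_loss(query, positive, pool, k=64, rng=rng)
                  for _ in range(50)])           # scores only 64 per step
```

Averaged over steps, the sampled loss tracks the full list-wise objective while each step touches only a fraction of the negative pool, so the same GPU can train with a much larger effective pool of negatives.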
Energy-efficient resource management at run-time
Another environmentally friendly innovation at Pryon is our management of compute resources at system run-time.
The process continually senses the system’s computing load and resource consumption. Rather than “autoscaling” (the common practice of allocating more resources in response to high load), it reduces the amount of work the system performs per request, thereby reducing or even eliminating the need for extra resources.
Accuracy is only minimally affected during these periods of energy conservation, because the system dynamically allocates resources based on availability, the inferred difficulty of each request, and the request volume. Once the load spike resolves, the system automatically reverts to normal energy and accuracy levels.
While debate swirls around the benchmarks used to measure the energy consumption and carbon footprint of AI systems, the AI industry continues to innovate with efficiency in mind. The projects described here are just two examples of how Pryon is pushing the boundaries of interactive AI while ensuring a greener tomorrow.