This article was taken from the paper “AI & Water management — What utilities need to know now.” You’ll find the full paper here.
It’s time to ring the bell and rally the water industry around game-changing AI models and tools, or risk being left behind. GPT-4 (Generative Pre-trained Transformer 4), the newest iteration of OpenAI’s large language model (LLM) and the successor to GPT-3, has ushered in a generative AI revolution. GPT-4 is multimodal: it can receive both text and image prompts as input (image input is in limited release only), and for image inputs it tries to understand the contents of the image. It is uncanny to see the power of this multimodal LLM. The number of parameters in an LLM is often used as a measure of its size and complexity, and models like this one use hundreds of billions to trillions of parameters. They are also trained with a combination of human-supervised and unsupervised techniques to make them better and more accurate.
These LLMs are different from the standard language abilities you’re used to getting from something like Alexa or Siri. Their advanced ability to work with natural language and images makes them extremely flexible and adaptable. Not only can they be prompted in plain language to produce human-like written text, they can also learn new behaviors when they are given new training data and built upon for new applications. This is sometimes referred to as last-mile training: the model is trained further on data from a specific vertical and then fine-tuned.
While ChatGPT, a chatbot built on OpenAI’s GPT models, can generate human-like text from natural-language prompts (and, with GPT-4, from image prompts), DALL-E-2 generates images from natural-language prompts. DALL-E-2 is a separate neural network-based AI model developed by OpenAI that is capable of generating high-quality images from textual descriptions. Rather than retrieving images from a pre-existing dataset, DALL-E-2 combines deep learning techniques and natural language processing (NLP) to generate original images from textual inputs.
DALL-E-2 generated this original image in a few seconds from the simple plain-language prompt “a 3D pipe network in the shape of a robot.”
This is just the beginning. While these models are capable of only so much complexity at the moment, more powerful and sophisticated iterations will eventually be released. GPT-3 has 175 billion parameters, for example, and while it’s not clear at the time of publication how many parameters GPT-4 has, some predict other LLMs could use up to a trillion parameters. So while these language models are powerful now, they will only get better, which makes one wonder: what else could they do?
And that’s why the water industry should jump in now, because AI tools will have different impacts depending on the domain they’re developed for and the data they are trained on. In order for these models to have deep, true application in the water industry, they need industry-specific data and intelligence.
Driving the last mile of data in the water sector
We need to start looking at what we might call “the last mile of data” in the water industry. It’s where we get very specific about data and concepts in our industry, so we can start training these models, and the people using them, to help address the water sector’s unique challenges. Large, diverse, representative datasets that accurately reflect the water industry are key for these AI models to be useful to the industry. We need to teach these models the specialized vocabulary, jargon, relevant data, and concepts that are unique to the water sector.
As discussed here, one of the challenges with the accuracy of these newer LLMs is that they’re influenced by biases in their data and training. The GPT-4 model, for example, learns from both text data and reinforcement from humans, using a technique called reinforcement learning from human feedback (RLHF). The technique fine-tunes the baseline model by using human feedback to guide the AI’s learning process. However, the data and the people taking part in the training of the model may not be representative of potential end users, and that can affect the kinds of results these models produce.
There are other problems with these models. For example, ChatGPT’s model can hallucinate, giving you wrong answers that seem logically sound. It’s also not explainable AI, making it something of a black box where we don’t understand how it comes to its answers. We need people with industry knowledge to test these models for accuracy and supervise them to ensure the models are attuned to our sector.
Imagine what could happen if utilities around the world pooled their anonymized hourly meter data to help train and leverage these new AI models for our sector. With an enormous global meter dataset, for example, it may be possible to offer virtual meters to utilities that may not have the resources to install and deploy AMI meters. Virtual meters could serve as a powerful data input for all types of operational analytics, including digital twins. By collaborating on the water sector’s last-mile data, utilities could use the power of AI to build a more comprehensive understanding of water usage and patterns, which could inform more effective water management strategies and lead to better outcomes for customers and the environment.
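To make the virtual-meter idea a little more concrete, here is a minimal sketch of a naive profile-scaling baseline. It is not an AI model and not a method described in the paper; it only illustrates the kind of estimate pooled hourly data could support. The function names, the four-hour "day," and the sample volumes are all hypothetical, and it assumes the unmetered customer’s daily total is known from another source such as billing.

```python
def pooled_profile(meters):
    """Average the normalized hourly shape of each metered customer.

    `meters` is a list of equal-length lists of hourly volumes.
    Series with zero total volume are skipped because they have no shape.
    """
    shapes = []
    for series in meters:
        total = sum(series)
        if total > 0:
            shapes.append([v / total for v in series])  # each shape sums to 1
    hours = len(shapes[0])
    return [sum(shape[h] for shape in shapes) / len(shapes) for h in range(hours)]


def virtual_meter(profile, daily_total):
    """Spread a known daily total across the hours using the pooled shape."""
    return [share * daily_total for share in profile]


# Hypothetical pooled data: two metered customers, shortened to four "hours".
pooled = [[1.0, 3.0, 4.0, 2.0], [2.0, 6.0, 8.0, 4.0]]
profile = pooled_profile(pooled)
estimate = virtual_meter(profile, daily_total=50.0)
print(estimate)
```

A real virtual meter would need far richer inputs (customer type, climate, season, network pressure) and a trained model, which is exactly where a large shared dataset would matter.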
This is an opportunity for everyone in the water sector, from innovative water businesses to utility operators with decades of tacit knowledge (including some who are about to retire and take that knowledge with them), to collaborate globally and introduce water-industry context, supervision, and data to these models. Innovators in our sector can then develop and combine AI solutions customized to the water industry’s specific knowledge and goals, for the benefit of all utilities, big and small.
Democratizing AI opportunities for utilities of all sizes; not a replacement of work
For smaller utilities, or utilities with limited resources, these new developments in AI may feel daunting or out of reach. But actually, these new language models democratize AI technology because people don’t have to change their habits as significantly or train on an entirely new program to see benefits. They allow people access to the power of AI without the need for specialized knowledge and skills—all you need is plain language and a bit of creativity to get the AI to produce what you need (prompt engineering could be an interesting focus for the water sector).
The summarization capabilities of these tools alone have helpful applications in day-to-day work. Imagine feeding all types of utility data into a helpful AI assistant that could then give you plain-language insights about the data in seconds. Even before analyzing data, these models can help non-experts write a good sample of code to extract relevant information, like average water consumption, different types of pressure readings, or water quality data. They may be able to help our sector pre-process data that hasn’t been cleansed yet, or even tell us what needs to happen in order to pre-process data.
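As an illustration of the kind of short script a non-expert might ask such a model to draft, here is a minimal sketch that computes average hourly consumption per meter and skips blank readings as a very simple pre-processing step. The column names, meter IDs, and values are hypothetical sample data, not taken from any utility.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample: hourly readings in litres for two meters.
SAMPLE_CSV = """meter_id,timestamp,litres
M-001,2023-03-01T00:00,12.0
M-001,2023-03-01T01:00,8.0
M-001,2023-03-01T02:00,
M-002,2023-03-01T00:00,30.0
M-002,2023-03-01T01:00,34.0
"""


def average_consumption(csv_text):
    """Return mean hourly consumption per meter, ignoring blank readings."""
    readings = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = row["litres"].strip()
        if value:  # simple pre-processing: drop missing values
            readings[row["meter_id"]].append(float(value))
    return {meter: mean(values) for meter, values in readings.items()}


averages = average_consumption(SAMPLE_CSV)
print(averages)
```

Code produced by a language model should still be reviewed by someone who knows the data, for example to decide whether a blank reading means zero flow or a communication gap.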
You could be an experienced operator in a very small utility, and you know you have a leak because you heard it, but you need more information about it. These new language models open the door to utility operators using natural language to communicate with the AI model and experiment with the kinds of insights it can provide. Maybe it could suggest data you need in order to understand the problem better, almost like having a data assistant. In that sense, these AI models aren’t a replacement for work but a partner in it, because they simply don’t have the real-world experience and tacit expertise that human beings in the water sector have.
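The leak scenario above can be sketched as a small prompt template. This is only one possible way to structure such a question, under the assumption that the operator pastes the resulting text into a chat-based model; the function name, wording, and example inputs are hypothetical, and no particular AI service or API is called.

```python
def build_leak_prompt(observation, available_data):
    """Assemble a plain-language prompt from an operator's field observation."""
    data_list = "\n".join(f"- {item}" for item in available_data)
    return (
        "You are assisting a water utility operator.\n"
        f"Observation: {observation}\n"
        "Data currently available:\n"
        f"{data_list}\n"
        "Question: What additional data would help locate and size this leak, "
        "and what should I check first?"
    )


# Hypothetical example inputs.
prompt = build_leak_prompt(
    "Audible leak noise near a valve chamber on Main Street",
    ["Monthly billing totals", "Pressure logger at the pump station"],
)
print(prompt)
```

Stating what was observed and what data already exists gives the model the context it lacks, while the operator’s own judgment remains the check on whatever it suggests.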
While there’s already a lot of data to get started with experimenting and training these models for the water sector, “data as a service” (DaaS; see SWAN Forum for more on that topic) models are also starting to emerge to bridge the gap for small utilities that may lack the data, resources, capacity, and expertise to collect what they need to get quality answers from AI. These new business models take on the responsibility of acquiring the right sensors for the right problems and locations, as well as setting up data communications, data cleansing, and analytics, so the utility can focus on benefiting from the data. This combination of software and services can bring any small utility on board with the benefits of these technologies very quickly.
The water sector shouldn’t be afraid to test and play and experiment with these new AI tools to see how they can work for them. From small towns to megacities, from field staff to executives, utilities can jump in and start playing, innovating, and learning, even with minimal data, how AI can make their jobs easier.
Finally, this is a call to all the sources of deep expertise and knowledge in the water industry (utility operators and managers, engineers, scientists, academics) to contribute their specialized knowledge to machine learning, in order to improve the quality of its outputs and explore how these LLMs can make decision-making in the water sector better.
Qatium is co-created with experts and thought leaders from the water industry. We create content to help utilities of all sizes to face current & future challenges.
Gigi Karmous-Edwards, Water Sector Digital Twin Expert & Consultant at Karmous Edwards Consulting LLC, is a member of Qatium’s community.