Should I Be Using AI for This?

October 9, 2024

The Nonprofit AI Treasure Map

The current hype cycle around Artificial Intelligence is commanding tremendous attention in nonprofits and in society at large. There is a great deal of pressure on leaders to “do AI,” with the strong implication that those who don’t are missing out on the “Biggest Thing Ever.” What’s the non-technical nonprofit leader to do?

Wait and See for AI Products

As a longtime developer of AI products for the social sector, my main advice to you is to wait and see! While I love what AI can enable, it is still just technology. Adopting new technology at a nonprofit is not easy. We shouldn’t forget everything we already know about adopting technology for our organizations just because it’s the latest, whizzy innovation. In fact, it’s a good time to remember all of the basics: design for the people in the system, test out the new tech before committing hundreds of thousands (or millions) to it, and make sure that the money invested will be well spent in terms of cost savings and/or better mission impacts. The latest generation of AI is a bigger-than-average leap forward for technology, but it is far from a magic wand that will solve all of your problems.

I’ve created a Nonprofit AI Treasure Map for the non-technical nonprofit leader. It’s a roadmap to help you make sensible decisions when it comes to adopting AI technology. One of my top pieces of advice is to wait and see: the technology is moving quite fast, and something that makes no sense today might work pretty well in a year or two (but it also might take a decade or two, if it ever works at all!).

[Image: Should I be using AI for this? – the Nonprofit AI Treasure Map]

This is a short piece, so it’s not going to cover everything good and bad about AI technology, of which there is plenty to discuss. It is not a critique of the biased data used to create today’s dominant AI products, which don’t reflect most of humanity’s languages, cultures, or realities. It’s not a detailed discussion of the many mistakes AI makes today (although I did write an article on that!). It doesn’t cover how many AI products are simply bogus, though a new book on the topic, AI Snake Oil, points out that the field of predictive AI is full of products which basically don’t, and can’t, measure up to the claims being made about them. This piece is also not aimed at nonprofits with their own data teams: if you have the internal capacity to develop AI products, you are going to be more realistic about what AI can and can’t do.

My biggest piece of advice, as conveyed by the Treasure Map, is to wait for a product. Just as the average dentist or restaurant has no business writing software, neither does the average nonprofit. Much less AI software! It doesn’t go well 99% of the time, and it wastes a lot of money.

The great thing about products is that they spread the cost of developing technology across a large number of users. AI product developers know about human-centered design, cybersecurity, and guardrails to minimize problems. In addition, you can ask your peers about their experience with products: did that AI-driven fundraising tool really raise additional money far in excess of the product cost, or not? Did it end up driving away more donors than it attracted? Plus, it’s usually quite easy to test out a product and see if it measures up against your actual use cases.

The Admin Zone: Best Place to Start with AI

The Treasure Map is divided into two zones. The admin zone starts in the upper left quadrant of the map, and focuses on internal activities like writing, editing, and fundraising. Because so many organizations do these activities, there are already affordable AI products out there. Just as everyone uses spell and grammar checkers (AI tools of an earlier generation), many writers and editors will get value out of generative AI tools that work with text, like ChatGPT and its competitors. In a similar way, there are also products for fundraisers and software developers which are accessible and easy to try out.

The most important point about these uses is having human beings who know their stuff review the output of these tools. Generative AI, by definition, makes stuff up based on what it’s been trained on. The AI tools don’t actually think about, or understand, their output, but they do an amazing job of seeming like they do. That simulation of thinking trips up many people, who believe that what comes out of an AI tool is true even when it is not. The people who get in the most trouble with generative AI are those who don’t know the subject area well enough to spot when the AI gets it wrong. Most nonprofit leaders should be cautious about introducing a technology with the ability to create errors and chaos, unless there’s an active plan to manage those mistakes.

These limitations were highlighted in a 2021 paper by AI experts, “On the Dangers of Stochastic Parrots,” which argued that the large language models behind today’s generative AI products are like parrots: they have an uncanny ability to mimic human speech without actually understanding it. Our treasure map features a pair of robot parrots in honor of this insightful technical paper.

[Image: AI parrots of knowledge]

The Program Zone: Be Careful with AI

Today, there are few existing AI products that meet the needs of nonprofit programs, which are the subject of the bottom half of the treasure map. This makes applying AI much more challenging. First, nonprofits take their responsibilities to the communities they serve very seriously. A mistake in a fundraising message may lead to less money raised, but a mistake in a program might endanger the life of a client. The literature is full of stories of government agencies (and insurance companies) recklessly deploying AI technology which did tremendous damage to disadvantaged people, far exceeding what would have happened if people had been left in charge of these tasks. We already have examples of generative AI chatbots telling people to do the opposite of what they should do, including one chatbot which told someone to kill themselves (and they reportedly did). Nonprofit leaders owe it to their clients to evaluate the application of AI to program activities very carefully, to ensure that the benefits are strong and the downsides are managed. Nonprofits have an ethical commitment to those they serve.

Most organizations are not ready to embark on an expensive effort to deploy AI technology in their programs. First, AI is only worth considering if your organization already has large quantities of data and is skilled at using that data to better manage programs. Applying AI is an advanced skill: without the prerequisites, you are highly likely to fail and not understand why. Next, like any other tech investment, you need to be able to make the financial case. A custom AI project typically requires a six- or seven-figure investment, which needs to deliver outstanding value in terms of cost savings, a lower cost to serve clients, or greatly improved mission impact.

Of course, if there is already a product in your field which you can trial, or if a handful or more of organizations are coming together to build an open source platform, your risks become much more manageable. For example, my nonprofit organization works with more than fifteen national child helplines that use our open source Aselo contact center platform, and we are building AI tools in partnership with a handful of these organizations to better classify and summarize crisis chat conversations, while protecting the sensitive data of the children reaching out for help. This project was beyond the means of most of these helplines individually, but it could be justified to donors because it would be available to many.
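For the technically curious, here is a minimal sketch of that data-protection pattern in Python. To be clear, this is not the Aselo codebase: the function names and patterns here are illustrative assumptions. The point is simply the ordering, in which identifying details are stripped out before any AI classification or summarization step ever sees the transcript.

```python
import re

# Hypothetical sketch only -- not the Aselo codebase. It illustrates the
# general pattern of stripping identifying details out of a transcript
# before any AI classification or summarization step sees it.

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b(?:\+?\d[\s.-]?){7,15}\b"),
}

def redact(transcript: str) -> str:
    """Replace obviously identifying strings with placeholder tags."""
    for label, pattern in PATTERNS.items():
        transcript = pattern.sub(f"[{label}]", transcript)
    return transcript

def classify(transcript: str) -> str:
    """Toy stand-in for a real model call; assigns a rough topic label."""
    text = redact(transcript).lower()
    if "bully" in text:
        return "bullying"
    return "other"

print(classify("They keep bullying me. My number is 555-123-4567."))
# -> "bullying"; the phone number never reaches the classification step
```

A real system would use far more robust de-identification than two regular expressions, but redact-first, classify-second is the design choice that keeps the children’s data protected.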

Finally, if you are still moving forward with developing your own custom AI project, you should expect to hire at least two or three expensive technologists and maintain that team for years. Custom AI technology is never “one and done.” The cost to train, update, and maintain a custom AI tool generally exceeds the initial development cost. This is also true when you outsource the development to an external tech agency. There are probably fewer than one hundred nonprofits building custom AI products today which are likely to succeed, and they are all spending significant sums of money (and targeting large-scale impact to justify those investments).

One last item: the deployment of an AI tool that doesn’t work can be damaging to your organization. The National Eating Disorders Association in the U.S. fired its human counseling staff and deployed a generative AI chatbot, Tessa. Tessa was caught giving the wrong advice (the opposite of what you are supposed to tell someone with an eating disorder) within a week and ended up being shut down. Beyond the money wasted on technology development, you don’t want to be in the news for having deployed technology which did wrong by your stakeholders.

Pilot First!

The upper right quadrant covers the actual deployment of AI in your organization. The first thing to consider is piloting the technology. If there’s already a product, this should be easy to do. Are your grant proposals better written and done in less time using generative AI text tools? Or do they seem bland and contain errors not being caught by your team? As above, are you raising more money while keeping donors happy?

If there is no existing product to test, you might try to simulate it first with humans. Sometimes this is straightforward. For example, if you operate an advice line and you believe you can handle the most routine 40% of questions with AI delivering approved canned answers, AI will seem plausible as long as its error rate (giving the wrong answer) is low. In more challenging cases, I have seen teams test a planned automated solution by having humans pretend to be the automated system. If users hate it when humans are simulating the answers from an AI platform, it’s probably not going to become more loved once the AI solution is built!
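To make the advice-line example concrete, here is a minimal Python sketch of the “routine questions only” pattern. The toy intent classifier and the 0.90 confidence threshold are assumptions for illustration; the part worth noticing is the routing rule, where anything the system is not highly confident about goes straight to a human.

```python
# Hypothetical sketch of the "routine questions only" pattern. The toy
# classifier stands in for a real trained model; the routing rule is the
# part worth copying: low confidence means a human takes over.

CANNED_ANSWERS = {
    "opening_hours": "We are open Monday to Friday, 9am to 5pm.",
    "donation_receipt": "Receipts are emailed within three business days.",
}

CONFIDENCE_THRESHOLD = 0.90  # assumed value; tune it against pilot data

def classify_intent(question: str) -> tuple[str, float]:
    """Toy stand-in returning (intent, confidence)."""
    q = question.lower()
    if "hours" in q or "open" in q:
        return "opening_hours", 0.95
    if "receipt" in q:
        return "donation_receipt", 0.92
    return "unknown", 0.30

def answer(question: str) -> str:
    intent, confidence = classify_intent(question)
    if confidence >= CONFIDENCE_THRESHOLD and intent in CANNED_ANSWERS:
        return CANNED_ANSWERS[intent]           # routine: approved canned text
    return "Connecting you to a staff member."  # everything else: a human

print(answer("What are your opening hours?"))     # canned answer
print(answer("I feel really overwhelmed today"))  # escalates to a human
```

Simulating exactly this logic with humans first (the canned answers, the escalations) tells you whether users will accept it before any AI is built.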

Piloting can also be eye-opening about what needs to be true, in terms of capabilities and performance, to make a viable project. Just because a pilot fails today doesn’t mean the technology won’t improve over the next couple of years and become successful then.

Quality Control, or Why Not a Human in the Loop?

AI tools make a lot of mistakes. Does your phone (or home speaker) understand what you say perfectly every time? We have become used to working around these errors in our day-to-day interactions with consumer-grade AI tools like Siri, Alexa, and Google Assistant. What are the workarounds going to be for your staff, or even more concerning, your donors, clients, and stakeholders?

For high-stakes applications, you need to invest in quality control. Are your expert staff members going over the outputs and reviewing them to ensure major errors are caught and corrected? When your clients report that they didn’t get the answer they were seeking, are you actively responding to them and improving the tool? The more human hours that go into quality control, the less dramatic the payoff of the AI project. It can easily be the case that the AI tool does not save money while delivering the same or better quality. You should be sure to understand these dynamics from the beginning, so you can stop a project mid-way if it is unlikely to be a success. Quite a number of today’s successful AI projects had to go through another round of development to get performance into the acceptable range.
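If you want a sense of what “investing in quality control” can look like in practice, here is a minimal Python sketch of one common pattern: routing a random sample of the AI tool’s outputs to expert review and watching the running error rate. The 10% sample rate and 5% error ceiling are assumed values, not standards; every program would need to set its own.

```python
import random

class QualityMonitor:
    """Send a random sample of AI outputs to expert review, track errors."""

    def __init__(self, sample_rate: float = 0.10, max_error_rate: float = 0.05):
        # Both rates are assumptions to tune against your own program data.
        self.sample_rate = sample_rate
        self.max_error_rate = max_error_rate
        self.reviewed = 0
        self.errors = 0

    def needs_review(self) -> bool:
        """Decide whether the next output goes to a human expert."""
        return random.random() < self.sample_rate

    def record_review(self, was_correct: bool) -> None:
        self.reviewed += 1
        if not was_correct:
            self.errors += 1

    def should_pause(self, min_reviews: int = 100) -> bool:
        """Enough reviews plus too many errors is the signal to stop
        and go through another round of development."""
        if self.reviewed < min_reviews:
            return False
        return self.errors / self.reviewed > self.max_error_rate

# Toy demonstration: 500 outputs, about 4% of which an expert would reject.
monitor = QualityMonitor()
for _ in range(500):
    if monitor.needs_review():
        monitor.record_review(was_correct=random.random() > 0.04)
print("Pause the rollout?", monitor.should_pause())
```

The hours your experts spend recording those reviews are exactly the human cost that erodes the project’s payoff, which is why this loop belongs in the financial case from day one.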

There are also applications where you can make an informed decision about deploying an AI tool in spite of its limitations. This takes a lot of data savvy, but it could work out where the stakes are not life or death. For example, Khan Academy’s Khanmigo personal tutor doesn’t have to be perfect to be useful. If a student gets an AI tutor that is close to as good as a human tutor (and human tutors are never perfect either) and ends up learning more effectively, that might be a good tradeoff. Khan Academy’s team is full of tech people who can make these decisions, unlike the typical community-based organization.

Conclusion

I hope you find the AI Treasure Map to be a useful tool as you chart your course into the exciting AI future. The current generation of AI tools is going to find its niches in the social sector over the coming years. The best source of unvarnished input on what does and doesn’t work will continue to be your peers. A trusted peer who is getting outstanding value out of new technology is worth emulating. Another peer who didn’t find something worthwhile is a source of insight and increased caution.

Best wishes in finding your bits of AI treasure in the coming years!