Episode 7 (S3): When AI Fails Silently: What Healthcare Leaders and Accreditors Must Confront with Rich Greenhill

You Never Step in the Same River Twice: What AI Is Getting Wrong (and Why It Matters for Higher Ed)

Another great episode of EdUp Accreditation Insights that I had the chance to be part of, and one idea from the conversation that I haven’t been able to shake.

Our guest, Richard Greenhill, shared an old philosophical line, attributed to Heraclitus: you never step in the same river twice. The water is always moving. At first, it feels like a simple metaphor. But the more I have sat with it, the more I have come to believe it may be one of the most important ways to understand what is happening with artificial intelligence right now.

Because we are treating AI like it exists in a fixed environment. It does not. It operates in a stream of data that is constantly shifting, evolving, and changing shape. That has implications far beyond the technical side of AI. It gets to the heart of how we are using it, how we are teaching it, and how much we should trust it.

Before going further, it is worth naming what this conversation is not about. This is not about AI and academic integrity. It is not about students using tools to write papers or concerns about cheating. Those conversations are important, but they are also the most visible and, in many ways, the most obvious. What we are talking about here is something different and, arguably, more consequential: the increasing use of AI by institutions themselves. The systems we are adopting to make decisions, guide strategy, and shape the student experience.

One of the clearest points from the conversation was this: AI models are not static tools. They are built on data at a given moment in time, validated against that moment, and then deployed into an environment that immediately begins to change. What worked at 90 or 95 percent accuracy when the model was built will not stay there indefinitely. The data shifts. The patterns shift. The underlying conditions shift. And unless someone is actively monitoring and recalibrating that system, its performance will drift.

That is where the real risk begins.

In higher education, we are particularly prone to thinking in terms of implementation rather than maintenance. We launch initiatives. We adopt platforms. We integrate tools. We celebrate progress. Then we move on to the next priority. That approach works reasonably well for many traditional systems. It does not work well for AI.

AI is not a “set it and forget it” technology. It requires continuous oversight. It requires asking whether the model still reflects reality. It requires a willingness to revisit assumptions that felt solid just a few months earlier. Without that, we are not just using outdated tools. We are making decisions based on outdated representations of reality.
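What might that oversight look like in practice? At its simplest, it means routinely re-scoring the model against recently observed, labeled outcomes and comparing that to how it performed when it was validated. The sketch below is purely illustrative, not something discussed in the episode; the model object, the baseline number, and the alert threshold are all assumptions chosen for the example.

```python
# Illustrative sketch: flag performance drift by comparing recent accuracy
# against the accuracy recorded when the model was validated.
# The model, baseline, and threshold here are hypothetical.

from sklearn.metrics import accuracy_score

BASELINE_ACCURACY = 0.92   # accuracy measured at validation time
ALERT_THRESHOLD = 0.05     # how much of a drop we tolerate before flagging

def check_for_drift(model, recent_features, recent_outcomes):
    """Score the deployed model on recently observed, labeled data."""
    predictions = model.predict(recent_features)
    current_accuracy = accuracy_score(recent_outcomes, predictions)
    drift = BASELINE_ACCURACY - current_accuracy

    if drift > ALERT_THRESHOLD:
        print(f"Drift alert: accuracy fell from {BASELINE_ACCURACY:.2f} "
              f"to {current_accuracy:.2f}; recalibration needed.")
    else:
        print(f"Accuracy {current_accuracy:.2f} is within tolerance.")
    return current_accuracy
```

Nothing about that check is sophisticated. The point is that it has to exist, has to run on a schedule, and has to be someone's job to read.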

Dr. Greenhill described this problem using a term that I have not been able to shake: silent performance failure. It refers to what happens when an AI system continues to produce outputs that appear normal, while its actual performance has already drifted from the conditions under which it was built. There are no alarms. No obvious signs that something is wrong. The system looks like it is functioning as expected. But under the surface, it is no longer aligned with the data environment it is operating in.

That is what makes it dangerous. Not failure that is visible, but failure that is invisible.

It is easy to hear that and assume this is primarily a healthcare issue, where the stakes are life and death. And certainly, that is where many of the examples originate. But it would be a mistake to think this stops there. Higher education is already embedding AI into enrollment modeling, student success initiatives, advising systems, financial aid optimization, and online learning environments. In each of those areas, we are relying on models to interpret patterns and guide decisions.

Consider a simple, but very real, example. Imagine a university implements an AI-driven student success platform designed to identify students at risk of not returning for their second year. The model is trained on several years of historical data, including GPA trends, course engagement, financial indicators, and demographic patterns. At launch, it performs well. Advisors begin using it to prioritize outreach. Resources are allocated based on its predictions.

But over the next two years, several things change. The institution expands online offerings. A new population of adult learners enrolls. Financial aid policies shift. Course delivery formats evolve. Student behavior, particularly around engagement in LMS platforms, begins to look very different than it did in the pre-model data.

The model, however, has not been meaningfully recalibrated.

It continues to flag students based on patterns that no longer fully reflect the current student body. Some students who are genuinely at risk are missed because their behaviors do not match historical signals. Others are flagged unnecessarily because the model interprets new forms of engagement as disengagement. Advisors trust the system because it has “worked” before. The dashboard still looks clean. The outputs still look reasonable.

Meanwhile, retention efforts are being misdirected.

No alarms go off. There is no clear moment of failure. But over time, the institution begins to see uneven outcomes. Certain student populations are underserved. Resources are not reaching the students who need them most. Leadership may even question the effectiveness of advising or student success initiatives, without realizing that the underlying model guiding those efforts has quietly drifted.

That is a silent performance failure in a higher education context.
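There are ways to catch this kind of drift before retention numbers move, and they do not require waiting for the model to be visibly wrong. One common approach, offered here as an illustration rather than anything prescribed in the episode, is to compare the distribution of a key input, say weekly LMS logins, between the data the model was trained on and the students enrolled today. The population stability index is a standard statistic for that comparison; the bins, thresholds, and sample data below are all made up for the sketch.

```python
# Illustrative sketch: population stability index (PSI) for one input feature,
# comparing the training-era distribution against the current student population.
# The data and bin count are hypothetical.

import numpy as np

def population_stability_index(expected, actual, bins=10):
    """PSI between a feature at training time (expected) and today (actual)."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)

    # Avoid division by zero in sparse bins.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)

    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Hypothetical weekly LMS logins: historical cohorts vs. today's mix of learners.
training_logins = np.random.default_rng(0).poisson(5, size=5000)
current_logins = np.random.default_rng(1).poisson(3, size=5000)

psi = population_stability_index(training_logins, current_logins)
print(f"PSI = {psi:.3f}")  # common rule of thumb: above ~0.25 suggests a major shift
```

A check like this does not tell you the model is wrong. It tells you the river has moved, and that someone needs to look.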

This raises a deeper concern, which is not about whether AI will fail, but about how convincingly it can fail. These systems are persuasive because they reflect patterns. They produce outputs that look reasonable, structured, and confident. But they are not thinking systems. They are statistical ones. If the underlying data shifts, the outputs may still look right even when they are no longer accurate.

This is where higher education has a responsibility that goes beyond simply adopting AI tools or creating new programs. We need to think more carefully about what we are actually teaching students to do in an AI-enabled world. Right now, much of the conversation focuses on access and usage. Can students use these tools? Are they familiar with them? Do they understand prompt engineering?

Those are useful skills, but they are not sufficient.

What we need to develop more intentionally are the habits of mind that allow someone to question, interpret, and challenge what AI produces. That starts with a basic understanding of how data works. Not at the level of a data science degree, but at the level of recognizing that data is not static, that distributions shift, and that models degrade over time. It also requires a return to something that higher education has always claimed as part of its core mission: teaching people how to think critically, how to ask good questions, and how to evaluate evidence.

In many ways, this is less about adding something new and more about reclaiming something we have allowed to erode.

There is also a growing concern around domain knowledge. If students rely too heavily on AI to generate work without developing the underlying expertise, they lose the ability to evaluate whether what they are seeing is correct. Without that grounding, it becomes very difficult to identify when a system is hallucinating, when it is oversimplifying, or when it is simply wrong. That is not an argument against AI. It is an argument for ensuring that AI does not replace the process of learning itself.

As if that were not enough, we are now beginning to see the rise of agentic AI—systems that do not just generate outputs, but act, adapt, and learn from their environment. These systems introduce another layer of complexity. If they are learning from outputs that have already drifted, they do not just replicate errors. They amplify them. What begins as a small misalignment can scale quickly into a much larger problem.

All of this brings us to a place that feels, at times, unsettled. Accreditation bodies are beginning to engage with these questions. Frameworks are emerging. There are conversations about governance, oversight, and standards. But we are still early in that process, and there is not yet a consistent approach across institutions or sectors.

So for now, much of the responsibility sits with institutional leaders, faculty, and those designing and implementing these systems. We have to approach AI with a level of humility and discipline that matches its complexity. That means recognizing that it is a tool, not a solution. It means keeping humans actively involved in interpreting and validating outputs. And it means building processes that assume change rather than stability.

The river is moving. It always has been. AI has simply made that movement more visible and more consequential.

If we continue to treat our systems, our data, and our decisions as if they exist in a fixed state, we are going to find ourselves making choices based on where things used to be rather than where they are. And in a moment when higher education is already navigating significant change, that is a risk we cannot afford to ignore.

The challenge ahead is not just learning how to use AI. It is learning how to live with it in a way that is thoughtful, critical, and grounded in reality as it exists now, not as it existed when the model was first built.