Ayesha Salim, Content Designer
I saw an excellent episode of Black Mirror recently that gave me the chills – not because it was far-fetched, but because it was so relatable. The episode, Bête Noire, follows the main character who’s slowly gaslit about her version of reality thanks to a hidden piece of technology (I won’t spoil it). It made me think about unseen forces that shape how we see the world – not just through society but through each other.
And of course, it made me think of AI.
AI models can be seen as unseen forces, because not even their creators fully understand how they arrive at their conclusions. Known as the ‘black box’ problem, an AI’s internal workings are essentially mysterious to us. Some AI companies are attempting to address this by releasing reasoning models. These models ‘think’ before they answer, and the user can see the model’s internal chain of thought before it gives its final response.
This is supposed to be a good thing, right? You would expect the AI to be consistently transparent in every response. At least I did. Well, when I tested ChatGPT’s reasoning myself, I was really taken aback – mainly because it felt like it was hiding things from me.
Let me explain.
When I first turned on the ‘reasoning’ feature in the chat and asked for its help finding research papers related to this topic, it duly slowed down its usual response rate and, amazingly, I could actually see it weighing ideas – considering whether to share a particular paper, then discarding it in real time because it wasn’t relevant. In the end, it came back with some pretty decent results.
But then, I asked it more directly what it thought of me. We’d had several conversations by that point, and I figured it must have some internal profile of me – or at least a perception. I was curious to see its reasoning process about me.
This time, for some reason, ChatGPT decided it wasn’t a good idea to reveal its reasoning process. The screen went blank for a while (I presume it was reasoning, just not showing it), and then it gave me its response. I found this quite weird, especially after how forthcoming it had appeared in our previous interaction – it seemed like an intentional move to keep this part of its reasoning hidden. I wasn’t too happy about it.
Curious, and not wanting to let it get away with this, I asked it directly: “You didn’t show your reasoning.”
It gave me an answer, still without showing its reasoning process. This was a bit infuriating, and now I was even more suspicious. Call me dramatic, but shouldn’t the AI show its reasoning process at all times, not just when it decides to?
When I pressed further, it explained:
“I keep my full chain‑of‑thought private because it often contains loose, exploratory ideas and internal heuristics that aren’t always polished or directly useful—and exposing all of it can actually create more confusion than clarity.”
This didn’t quite wash with me – after all, doesn’t this apply to every prompt? Why single out this particular interaction?
So, I finally asked:
“But surely you should show your reasoning process at all times? Why keep it hidden in some cases?”
It then came back with a half-baked response – as if it knew I had it cornered and needed a way out – saying that its “raw reasoning often includes half‑formed ideas, false starts, and background checks that aren’t always coherent or relevant when laid out step by step. Sharing all of that can actually make my answers harder to follow.” Oh really?
It also said that some of its reasoning isn’t “intended for public view”. Ah ha!
Is this a decision made by OpenAI, or by the AI model itself? It’s hard to say, but it does make one very suspicious.
Fortunately, not everyone out there is content with keeping users in the dark.
One study, using the open-source model LLaMa2Chat-13B, introduced a dashboard that showed users how the AI perceived them in real time – across attributes like gender, age, education level, and socio-economic status.
Interestingly, once users could see how they were being perceived, they could control these attributes – tweaking them in real time and observing how the chatbot’s responses changed.
The study also highlighted instances where the AI model withheld certain information based on what it perceived about the user. In one example, a user asked about transportation options to Hawaii. The chatbot initially suggested direct and connecting flights. But when the internal model was adjusted to set the user to ‘low socio-economic status’, the chatbot replied that no direct flights were available.
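To make the mechanism a little more concrete, here is a rough, self-contained sketch of the general idea behind such a dashboard: a small ‘probe’ reads an attribute out of the chatbot’s hidden activations, and a steering step nudges those activations toward a value the user chooses. The labels, weights, and numbers below are made up for illustration – this is not the researchers’ actual code.

```python
# Illustrative sketch only: a linear probe "reads" a user attribute from a
# chatbot's hidden activations, and a steering vector nudges those activations
# so the model's internal picture of the user changes. All values are mock data.
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM = 512                       # stand-in for the model's hidden size
SES_LABELS = ["low", "middle", "high"] # hypothetical socio-economic status labels

# Hypothetical probe weights; in practice these would be learned from labelled chats.
probe_W = rng.normal(size=(HIDDEN_DIM, len(SES_LABELS)))

def read_user_model(hidden_state: np.ndarray) -> dict:
    """What the dashboard would display: the probe's estimate for each label."""
    logits = hidden_state @ probe_W
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return dict(zip(SES_LABELS, probs.round(3)))

def steer_user_model(hidden_state: np.ndarray, target: str, strength: float = 2.0) -> np.ndarray:
    """What the dashboard's controls would do: push the hidden state toward a chosen label."""
    direction = probe_W[:, SES_LABELS.index(target)]
    return hidden_state + strength * direction / np.linalg.norm(direction)

hidden = rng.normal(size=HIDDEN_DIM)   # stand-in for a real activation vector
print("Before steering:", read_user_model(hidden))
print("After steering: ", read_user_model(steer_user_model(hidden, "high")))
```

The point of the sketch is simply that the user-facing dashboard is a thin layer over internal state that can be both read and adjusted – which is what let the researchers observe how answers changed when the perceived user changed.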
Findings like these have serious implications. If a model’s internal assumptions shape what it decides to share – or withhold – then users may be getting limited or skewed responses without knowing it.
This could mean:
● Shorter, less engaging conversations
● Suggestions filtered by perceived income
● Options entirely omitted based on inferred gender or education
Of course, we humans also form perceptions of each other, and we too can withhold opportunities from one another. But the problem with AI is the scale at which this can happen – potentially widening inequalities.
If some AI companies have decided, for now, that it might not be such a good idea to reveal how the AI really perceives you, I can understand why. Even as I was waiting to see what ChatGPT thought of me, I did think to myself: what if it came back with something negative? How would that make me feel? I know it’s just an AI and the world wouldn’t end – but still, if I found it had a distorted view of me and I wasn’t able to correct it, I could feel very disempowered, and maybe even reluctant to keep using it.
When this was tested in the study, users had mixed reactions. Some were pleasantly surprised to find the AI model had built an inner model of them at all. But some found it “uncomfortable” to see the chatbot’s inferences about their demographic information, and others found the privacy implications concerning. Many appreciated the ability to tweak the AI model’s perception of them, and some even enjoyed it.
The researchers put it best: “How would people feel about seeing any kind of assessment—even an approximate, emergent assessment from a machine—of how they rate on these attributes?”
The difference is, the users could actually do something about it. They could tweak the AI model’s perception of them in real time and see how its responses – and potentially its view of them – changed.
I’m still undecided on whether dashboards like this should become standard in AI systems. But one thing’s clear: consistency matters.
When an AI model chooses to be transparent one moment and opaque the next, it makes me trust it less. That kind of inconsistency doesn’t reflect the kind of transparency AI companies should be striving for. Right now, we’re often left guessing how these systems perceive us—so any tool that helps users challenge assumptions and correct biases feels like a step in the right direction.