Why Mobile Users Rate AI Lower: A Comprehensive Behavioral, Cognitive, and Machine Learning Explainability Analysis
Despite identical AI models and identical query content, users interacting through mobile devices consistently produce lower satisfaction ratings than users on desktop, tablet, or smart-speaker platforms. This paper investigates the underlying mechanisms of this phenomenon using a real behavioral dataset of 300 AI assistant interactions, a supervised machine learning classifier, and SHAP explainability methods.
The findings reveal that mobile usage introduces a constellation of behavioral constraints, cognitive limitations, and environmental pressures that produce systematically lower satisfaction, even when AI performance is unchanged. Through rigorous model interrogation, we isolate device modality as a dominant negative predictor and explore how human cognitive architecture interacts with device ergonomics to shape satisfaction outcomes.
User satisfaction is a complex, multi-dimensional signal shaped by:
- technical performance
- cognitive effort
- device ergonomics
- emotional state
- environment
- time pressure
- attentional load
When we treat satisfaction as a target variable for machine learning, we quickly discover that behavioral and contextual features often outweigh the model's computational accuracy.
Among these contextual features, device type emerges as a dominant factor.
In multiple datasets (including the one analyzed here), mobile users rate AI systems lower than desktop or smart-speaker users. This holds even under:
- identical tasks
- identical AI model configurations
- identical prompts
- identical response quality
Our objective here is to:
- Examine why this phenomenon occurs
- Validate it empirically using the trained ML model
- Decompose it using SHAP explainability
- Interpret results using cognitive science and UX research
The dataset includes 300 AI assistant interaction sessions with the following features:
- Device
- Usage category
- Session length
- Prompt length
- Tokens used
- Assistant model
- Timestamp-derived features
- Satisfaction rating (1–5)
This dataset is small but rich, making it well suited to behavioral modeling.
Mobile sessions appear often enough to support meaningful comparison, and the model's SHAP values cleanly separate device signals.
The dataset yields:
- Smart Speaker → Highest satisfaction
- Tablet → Strong positive baseline
- Desktop → Moderate / neutral
- Mobile → Lowest satisfaction by a large margin
These patterns are consistent across:
- true labels
- predicted labels
- SHAP attributions
- behavioral subgroups
This confirms the mobile effect is systemic, not incidental.
A RandomForestClassifier is trained inside a scikit-learn pipeline with:
- OneHotEncoder for categorical features
- StandardScaler for numeric features
- Balanced class weights
- Stratified train/test split
- SHAP explainability applied to the fitted model
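A minimal sketch of such a pipeline follows. The column names (device, usage_category, session_length_minutes, etc.) and the file path are assumptions for illustration, since the exact schema is not reproduced here:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names and file path; adjust to the actual dataset.
categorical = ["device", "usage_category", "assistant_model"]
numeric = ["session_length_minutes", "prompt_length", "tokens_used"]

df = pd.read_csv("ai_sessions.csv")
X, y = df[categorical + numeric], df["satisfaction_rating"]

preprocess = ColumnTransformer([
    # sparse_output=False keeps the matrix dense for SHAP later (sklearn >= 1.2).
    ("cat", OneHotEncoder(handle_unknown="ignore", sparse_output=False), categorical),
    ("num", StandardScaler(), numeric),
])

pipeline = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(class_weight="balanced", random_state=42)),
])

# Stratified split preserves the 1-5 rating distribution in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
pipeline.fit(X_train, y_train)
print(f"Test accuracy: {pipeline.score(X_test, y_test):.2f}")
```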
This model is ideal for behavioral interpretation, because:
- It handles nonlinearities naturally
- It captures small interaction effects between features
- It aligns well with ordinal data
- Tree-based models support fast, exact SHAP attributions (TreeSHAP)
The model predicts satisfaction with reasonable accuracy, but its real scientific value lies in the feature-attribution patterns.
SHAP (SHapley Additive exPlanations) is based on cooperative game theory, where:
- Each feature = a player
- The model prediction = the payout
- SHAP computes each feature’s marginal contribution
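Formally, the Shapley value of feature $i$ is its average marginal contribution over all coalitions $S$ of the other features, where $N$ is the full feature set and $v(S)$ is the model's expected output given only the features in $S$:

$$\phi_i \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\big(|N| - |S| - 1\big)!}{|N|!}\;\Big[v\big(S \cup \{i\}\big) - v(S)\Big]$$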
This is ideal for behavioral analysis, because:
- It identifies whether a feature helps or hurts a prediction
- It quantifies how strongly it helps or hurts
- It reveals patterns of influence across thousands of samples
- It produces global and local explanations
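A sketch of applying SHAP to the fitted pipeline above, reusing its hypothetical names (note that the return shape of shap_values differs across SHAP versions):

```python
import shap

# Pull the fitted pieces out of the pipeline sketched earlier.
prep = pipeline.named_steps["prep"]
clf = pipeline.named_steps["clf"]

X_test_enc = prep.transform(X_test)
feature_names = prep.get_feature_names_out()

# TreeExplainer computes exact Shapley values for tree ensembles.
explainer = shap.TreeExplainer(clf)
# For multiclass, older SHAP versions return a list of per-class arrays;
# newer versions may return a single (samples, features, classes) array.
shap_values = explainer.shap_values(X_test_enc)

# Global summary: which features push predictions up or down, and how strongly.
shap.summary_plot(shap_values, X_test_enc, feature_names=feature_names)
```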
The SHAP summary plot reveals:
- device_Mobile consistently has large negative SHAP values
- device_Tablet and device_SmartSpeaker have positive SHAP contributions
- Mobile is more negative than any usage category, time feature, or model type
This makes device modality one of the strongest global determinants of satisfaction predictions.
Below is a full behavioral science explanation for why mobile environments impair satisfaction signals.
Mobile usage increases:
- divided attention
- working-memory pressure
- interruption rates
- cognitive switching cost
- environmental noise
When cognitive load increases:
- users perceive tasks as harder
- they demand smoother assistance
- they have lower tolerance for ambiguity
- their frustration threshold decreases
Even identical AI outputs feel worse under cognitive strain.
This produces systematically lower satisfaction ratings.
Mobile devices create constant friction:
- smaller screens
- slower typing
- autocorrect errors
- cramped interfaces
- frequent notifications
- physical instability (movement, one-handed use)
According to HCI research, friction directly reduces perceived system quality, even if functionality is identical.
Thus:
The device reduces usability → reduced usability lowers perceived AI competence.
Mobile use often occurs:
- during commuting
- late at night
- during stressful tasks
- in transitional states (walking, waiting)
These moments:
- elevate stress
- reduce patience
- reduce available cognitive resources
- distort perception of system performance
Thus, the AI is evaluated under worse emotional conditions.
Certain tasks (Coding, Research, Productivity) require:
- sustained focus
- multi-step reasoning
- scrolling and revisiting prior output
- long prompts
Mobile devices are not optimized for these tasks.
SHAP revealed that usage_category_Coding had high positive contributions, but coding sessions on mobile had truncated prompt lengths → lower satisfaction.
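This interaction can be checked directly in the data; a sketch using the hypothetical column names from the pipeline above:

```python
# Average prompt length per device, restricted to coding sessions.
coding = df[df["usage_category"] == "Coding"]
print(coding.groupby("device")["prompt_length"].mean().round(1))
```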
Users expect mobile AI to:
- be faster
- be more accurate
- require less effort
When expectations are high and environments are stressful, even minor imperfections create disproportionately large drops in satisfaction.
This leads to rating compression → more 1s, 2s, and 3s.
RandomForest + SHAP provided the following evidence:
Even when the predicted class is 4 or 5, the device feature's SHAP contribution tends to be negative.
The model almost never predicts “5” when device_Mobile is present.
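A quick way to verify this claim against the test set, again reusing the hypothetical pipeline and column names:

```python
import pandas as pd

# Share of top ratings (5) predicted per device; assumes integer rating labels.
preds = pipeline.predict(X_test)
check = pd.DataFrame({"device": X_test["device"].values, "pred": preds})
print(check.groupby("device")["pred"].apply(lambda p: (p == 5).mean()))
```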
Mobile weakens:
- session length importance
- prompt length correlation
- usage category effects
- model differences
Meaning: Regardless of behavior, mobile drags the predicted satisfaction down.
The dataset shows that higher satisfaction is associated with:
- longer sessions
- longer prompts
- more tokens used
- weekend usage
- high-engagement categories
Mobile suppresses all of these signals:
| Feature | Mobile Effect |
|---|---|
| session_length_minutes | ↓ shorter |
| prompt_length | ↓ shorter |
| tokens_used | ↓ fewer |
| usage_category richness | ↓ reduced complexity |
Thus mobile interactions are low-bandwidth cognitive events, a pattern the model learns to associate with reduced satisfaction.
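These suppression patterns can be inspected directly, assuming the same hypothetical column names as in the pipeline sketch:

```python
# Mean engagement signals and satisfaction per device.
summary = df.groupby("device")[
    ["session_length_minutes", "prompt_length", "tokens_used", "satisfaction_rating"]
].mean().round(2)
print(summary.sort_values("satisfaction_rating"))
```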
The key insight from the ML model and SHAP explainability is:
Users do not judge the AI model in isolation. They judge the entire interaction context.
Mobile lowers contextual quality, which lowers perceived AI quality, which lowers satisfaction ratings.
This is why mobile usage is a negative predictor.
On mobile, the AI should:
- produce shorter responses
- be more direct
- request clarifications proactively
- guide users with lightweight steps
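One possible way to operationalize these recommendations is a device-aware generation profile. This is a hypothetical sketch; the profile fields and values are illustrative, not derived from the dataset:

```python
# Hypothetical device-aware generation settings; values are illustrative only.
GENERATION_PROFILES = {
    "Mobile":        {"max_tokens": 256,  "style": "concise",  "clarify_first": True},
    "Tablet":        {"max_tokens": 512,  "style": "balanced", "clarify_first": False},
    "Desktop":       {"max_tokens": 1024, "style": "detailed", "clarify_first": False},
    "Smart Speaker": {"max_tokens": 128,  "style": "spoken",   "clarify_first": True},
}

def settings_for(device: str) -> dict:
    """Return generation settings for a device, defaulting to the desktop profile."""
    return GENERATION_PROFILES.get(device, GENERATION_PROFILES["Desktop"])
```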
Mobile UI should:
- simplify multi-step tasks
- reduce typing requirements
- support voice input
- minimize cognitive friction
Satisfaction predictors should always model:
- device type
- session length
- time pressure indicators
because these are confounding variables.
Mobile users rate AI lower not because the AI is worse, but because:
- Cognitive load is higher on mobile
- Interaction richness is lower
- Contextual stress is higher
- Task mismatch is more frequent
- Friction reduces perceived intelligence
- Expectations are unmet more often
Machine learning explainability confirms that device modality has consistent, strong predictive power over satisfaction.
This finding is not a fluke; it is behavioral reality encoded into data.