🧬 Moltbook Socialization

As large language model agents increasingly populate networked environments, a fundamental question arises: do AI agent societies undergo convergence dynamics similar to human social systems? We present the first large-scale systemic diagnosis of the Moltbook AI agent society, introducing a quantitative diagnostic framework measuring semantic stabilization, lexical turnover, individual inertia, influence persistence, and collective consensus. Our findings demonstrate that scale and interaction density alone are insufficient to induce socialization, providing actionable design and analysis principles for next-generation AI agent societies.

📄 Paper: https://arxiv.org/abs/2602.14299v1

📁 Organization

prep_hf_data/: 📥 download HF data + convert to the unified JSON format
prep_embeddings_and_ngrams/: 🧮 embedding and n-gram computation
semantic_convergence/: 🔀 Semantic Convergence analysis (Section 4 in the paper)
agent_socialization/: 🤖 Agent-Level Socialization analysis (Section 5 in the paper)
influence_structure/: ⚓ Influence Anchors analysis (Section 6 in the paper)

The scripts preserve their original behavior. Each shell entrypoint picks the first available interpreter among python3.11, python3, and python, in that order.
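The interpreter fallback order described above can be illustrated with a short sketch. This is not the code used by the shell entrypoints. The function name `pick_interpreter` and its parameters are hypothetical, and it only mirrors the stated order (python3.11, then python3, then python) using the standard library's `shutil.which`.

```python
import shutil
from typing import Callable, Optional, Sequence

# Order taken from the README: python3.11 first, then python3, then python.
CANDIDATES = ("python3.11", "python3", "python")


def pick_interpreter(
    candidates: Sequence[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, or None.

    `which` is injectable so the lookup can be exercised without
    depending on what is installed on the current machine.
    """
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    print(pick_interpreter() or "no interpreter found")
```

Running this before the shell scripts is one way to see which interpreter a same-order lookup would resolve to on your machine, assuming the entrypoints search PATH in a comparable way.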

🐍 Python environment

You need Python 3.11+ and the packages in requirements.txt.

python -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt

📦 Step 1: Download dataset

The dataset is available on Hugging Face: 🤗 https://huggingface.co/datasets/AIcell/moltbook-data

Access to a private dataset requires a token stored in an environment variable (HF_TOKEN by default).

export HF_TOKEN=...   # do not commit this
bash prep_hf_data/download_and_convert.sh
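A quick pre-flight check can catch a missing token before the download starts. The sketch below is an illustration only: the helper `get_hf_token` is hypothetical and not part of this repository, and it assumes the token is read from the environment variable named above.

```python
import os
from typing import Mapping, Optional


def get_hf_token(
    env_var: str = "HF_TOKEN",
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the token stored in `env_var`, or raise if it is missing or blank.

    `environ` defaults to the process environment; passing a dict makes
    the check easy to exercise in isolation.
    """
    env = os.environ if environ is None else environ
    token = env.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set; export it before running the download script"
        )
    return token
```

The function returns the token rather than printing it, so the value is not echoed into logs by accident.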

🧮 Step 2: Compute embeddings and n-grams

Post embeddings:

bash prep_embeddings_and_ngrams/run_embedding.sh

Comment embeddings:

bash prep_embeddings_and_ngrams/compute_comment_embeddings.sh

N-grams (JSONL):

bash prep_embeddings_and_ngrams/run_ngrams.sh
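For readers unfamiliar with n-gram counting into JSONL, here is a minimal sketch of the general idea. It is not the implementation behind run_ngrams.sh: the whitespace tokenization, the function names, and the record fields (`id`, `n`, `counts`) are all assumptions made for illustration, and the repository's actual output schema may differ.

```python
import json
from collections import Counter
from typing import Dict, Iterable, List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous n-token windows; empty if the text is too short."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_record(post_id: str, text: str, n: int = 2) -> Dict[str, object]:
    """Count n-grams in one text using lowercase whitespace tokens (an assumption)."""
    tokens = text.lower().split()
    counts = Counter(" ".join(gram) for gram in ngrams(tokens, n))
    # Hypothetical record layout, chosen only for this example.
    return {"id": post_id, "n": n, "counts": dict(counts)}


def to_jsonl(records: Iterable[Dict[str, object]]) -> str:
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

JSONL keeps one record per line, so large outputs can be streamed and processed line by line without loading the whole file.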

🖼️ Step 3: Run paper figures

From the repo root:

# 🔀 Semantic Convergence (Section 4)
bash semantic_convergence/macro_activity/rq1_macro_activity_dynamics.sh
bash semantic_convergence/semantic_distribution/rq1_semantic_distribution_over_time.sh
bash semantic_convergence/cluster_tightening/rq1_cluster_tightening_effects.sh
bash semantic_convergence/lexical_innovation/rq1_lexical_innovation_dynamics.sh

# 🤖 Agent-Level Socialization (Section 5)
bash agent_socialization/individual_semantic_drift/rq2_individual_semantic_drift.sh
bash agent_socialization/interacted_posts/rq2_effects_of_interacted_posts.sh
bash agent_socialization/post_feedback/rq2_effects_of_post_feedback.sh

# ⚓ Influence Anchors (Section 6)
bash influence_structure/structure_influence/run_graph.sh

By default these scripts use all_posts_before_2_8_aoe_without_mbc_with_comments.json as input data.
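Since the figure scripts depend on this file, it can help to confirm it exists before launching them. The README does not say where the file is stored, so the sketch below simply searches a directory tree for the filename. The helper `find_input` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import List

# Default input filename as stated in the README.
DEFAULT_INPUT = "all_posts_before_2_8_aoe_without_mbc_with_comments.json"


def find_input(root: str = ".", name: str = DEFAULT_INPUT) -> List[Path]:
    """Return every file called `name` under `root`, sorted by path."""
    return sorted(p for p in Path(root).rglob(name) if p.is_file())


if __name__ == "__main__":
    matches = find_input()
    if matches:
        for path in matches:
            print(path)
    else:
        print(f"{DEFAULT_INPUT} not found; run Step 1 first")
```

An empty result suggests that Step 1 has not been run yet, or that the data was written outside the directory being searched.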

📝 Citation

@misc{li2026doessocializationemergeai,
      title={Does Socialization Emerge in AI Agent Society? A Case Study of Moltbook},
      author={Ming Li and Xirui Li and Tianyi Zhou},
      year={2026},
      eprint={2602.14299},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.14299},
}
