Large AI Model Empowered Multimodal Semantic Communications

Authors

Li Dong, Feibo Jiang, Yubo Peng, Kezhi Wang, Kun Yang, Cunhua Pan, Xiaohu You

Paper

https://arxiv.org/abs/2309.01249

Code

https://github.com/jiangfeibo/LAMMSC.git

Abstract

Multimodal signals, including text, audio, image, and video, can be integrated into a Semantic Communication (SC) system to provide an immersive experience with low latency and high quality at the semantic level. However, multimodal SC faces several challenges, including data heterogeneity, semantic ambiguity, and signal distortion during transmission. Recent advancements in large AI models, particularly the Multimodal Language Model (MLM) and the Large Language Model (LLM), offer potential solutions to these issues. To this end, we propose a Large AI Model-based Multimodal SC (LAM-MSC) framework. We first present the MLM-based Multimodal Alignment (MMA), which uses the MLM to transform between multimodal and unimodal data while preserving semantic consistency. Then, we propose a personalized LLM-based Knowledge Base (LKB), which allows users to perform personalized semantic extraction or recovery through the LLM and thereby addresses semantic ambiguity. Finally, we apply Conditional Generative Adversarial Network-based Channel Estimation (CGE) to estimate the wireless channel state information, which mitigates the impact of fading channels in SC. Simulations demonstrate the superior performance of the LAM-MSC framework.

The function of each file

LAM-MSC.py: Overview of the LAM-MSC framework.
channel_nets.py: Definition of the channel encoder, channel decoder, and physical channel.
MMA.py: The implementation of the MMA module, including the modal transformation and recovery.
LKB.py: The implementation of the LKB module.
CGE.py: The implementation of the CGE module, including the network definition and training of CGAN.
SCwithCGE.py: The implementation of the image SC and CGE modules, including the training of the SC model.
CoDi: The implementation of CoDi. For details, see https://github.com/microsoft/i-Code/.
logs: Path to save the logs during training.
checkpoints: Path to save model weights.
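
To make the roles of the physical channel and of channel estimation more concrete, the following is a minimal NumPy sketch of a flat block-fading channel with additive noise, followed by equalization with an estimated channel coefficient. It is not taken from this repository: the function names (`fading_channel`, `ls_estimate`, `transmit`), the QPSK symbols, and the all-ones pilots are hypothetical, and a simple least-squares pilot estimate stands in for the CGAN-based estimator in CGE.py.

```python
import numpy as np


def fading_channel(x, snr_db, rng, h=None):
    """Send complex symbols through a flat block-fading channel with AWGN.

    Assumption: one fading coefficient h per frame. If h is None, a
    Rayleigh coefficient is drawn. The SNR is defined relative to the
    transmitted signal power (before fading).
    """
    if h is None:
        h = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
    signal_power = np.mean(np.abs(x) ** 2)
    noise_power = signal_power / 10 ** (snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (
        rng.standard_normal(x.shape) + 1j * rng.standard_normal(x.shape)
    )
    return h * x + noise, h


def ls_estimate(pilots_tx, pilots_rx):
    """Least-squares estimate of h from known pilots (stand-in for the CGAN)."""
    # np.vdot conjugates its first argument: sum(conj(p) * y) / sum(|p|^2)
    return np.vdot(pilots_tx, pilots_rx) / np.vdot(pilots_tx, pilots_tx)


def transmit(data, pilots, snr_db, rng, h=None):
    """Prepend pilots, pass the frame through the channel, then equalize."""
    frame = np.concatenate([pilots, data])
    received, h_true = fading_channel(frame, snr_db, rng, h)
    h_hat = ls_estimate(pilots, received[: len(pilots)])
    data_hat = received[len(pilots):] / h_hat  # zero-forcing equalization
    return data_hat, h_hat, h_true


# Example usage with random QPSK symbols (illustrative values only).
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(64, 2))
data = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)
pilots = np.ones(16, dtype=complex)
data_hat, h_hat, h_true = transmit(data, pilots, snr_db=20, rng=rng)
print("channel estimation error:", abs(h_hat - h_true))
```

In this sketch the receiver divides by the estimated coefficient, so the quality of the recovered symbols depends directly on the accuracy of the channel estimate. That dependence is the reason a stronger estimator, such as the CGAN in CGE.py, can help under fading.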

Citation

@article{jiang2023large,
  title={Large AI model empowered multimodal semantic communications},
  author={Jiang, Feibo and Peng, Yubo and Dong, Li and Wang, Kezhi and Yang, Kun and Pan, Cunhua and You, Xiaohu},
  journal={arXiv preprint arXiv:2309.01249},
  year={2023}
}
