Multimodal Co-Attention Transformer for Survival Prediction in Gigapixel Whole Slide Images - ICCV 2021
Updated Mar 11, 2022 - Jupyter Notebook
Visual fusion of camera and LiDAR sensor data
This study introduces MultiBanFakeDetect, a novel multimodal dataset for Bangla fake news detection, combining textual and visual information. It features TextFakeNet for text analysis and MultiFusionFake for integrating multimodal data.
Early Fusion, Late Fusion, and Hybrid Fusion
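The three fusion strategies named above differ in where modalities are combined: early fusion concatenates features before a single joint model, late fusion combines per-modality predictions, and hybrid fusion mixes both. A minimal numpy sketch (all names, dimensions, and weights are illustrative, not from any of the listed repositories):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-modality features for one sample (dimensions are arbitrary).
image_feat = rng.normal(size=128)   # e.g. a CNN embedding of a camera frame
lidar_feat = rng.normal(size=64)    # e.g. a point-cloud embedding

# --- Early fusion: concatenate raw features, then one joint head ---
joint = np.concatenate([image_feat, lidar_feat])   # shape (192,)
W_joint = rng.normal(size=(10, 192)) * 0.01        # single classifier head
early_logits = W_joint @ joint                     # shape (10,)

# --- Late fusion: separate heads per modality, average the predictions ---
W_img = rng.normal(size=(10, 128)) * 0.01
W_lid = rng.normal(size=(10, 64)) * 0.01
late_logits = 0.5 * (W_img @ image_feat) + 0.5 * (W_lid @ lidar_feat)

# Hybrid fusion would fuse intermediate features at some layer while also
# keeping modality-specific branches (omitted in this sketch).
```

Early fusion lets the model learn cross-modal interactions but requires aligned inputs; late fusion is robust to a missing modality at the cost of those interactions.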
Lightweight multi-frame integration for YOLO (ECMR 2025 paper)
This project builds on the OneRec generative-recommendation paradigm used by streaming-media recommender systems such as TikTok, Instagram, and Kuaishou. It reframes recommendation as a unified problem of sequence generation and preference alignment, uses Semantic IDs to jointly represent multimodal content and user behavior, and performs end-to-end optimization with a reinforcement-learning-driven reward mechanism. On this basis, it adds generative modeling and context awareness, shifting from traditional discriminative recommendation toward a generative, reasoning-capable recommender paradigm.
Streamlit app demonstrating multimodal (vision + language) modeling in PyTorch.
Multimodal AI from scratch: RGB + LiDAR sensor fusion with a comparison of early, late, and intermediate fusion, CLIP-style contrastive pre-training, and cross-modal projection using PyTorch.
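CLIP-style contrastive pre-training projects both modalities into a shared embedding space and trains matching pairs to score highest against in-batch negatives. A minimal numpy sketch of the symmetric contrastive objective (projection matrices, dimensions, and the temperature value are illustrative assumptions, not code from the repository above):

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Scale rows to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

rng = np.random.default_rng(1)
batch = 4
rgb = rng.normal(size=(batch, 128))    # toy RGB features
lidar = rng.normal(size=(batch, 64))   # toy LiDAR features

# Cross-modal projection: map both modalities into a shared 32-d space.
W_rgb = rng.normal(size=(128, 32)) * 0.1
W_lidar = rng.normal(size=(64, 32)) * 0.1
z_rgb = l2_normalize(rgb @ W_rgb)
z_lid = l2_normalize(lidar @ W_lidar)

# CLIP-style symmetric loss: the i-th RGB and i-th LiDAR sample are the
# positive pair, so the targets are the diagonal of the similarity matrix.
temperature = 0.07
logits = (z_rgb @ z_lid.T) / temperature   # (batch, batch)
labels = np.arange(batch)

def cross_entropy(logits, labels):
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

# Average the RGB→LiDAR and LiDAR→RGB directions.
loss = 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

With random projections the loss starts near log(batch); training the projection matrices drives the diagonal similarities up and the loss down.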
Early-fusion multimodal machine learning for emotion classification from social media videos (visual, audio, text). Portfolio project from SATRIA DATA 2025
Low-resource multimodal hate speech detection leveraging acoustic and textual representations for robust moderation in Telugu.
🌟 Enhance YOLOv7 with multi-frame detection for improved robustness against blur and occlusion, using efficient weak supervision with minimal model changes.
Add a description, image, and links to the early-fusion topic page so that developers can more easily learn about it.
To associate your repository with the early-fusion topic, visit your repo's landing page and select "manage topics."