
liuqidong07/Awesome-Multimodal-Recommender-Systems


Multimodal Recommender Systems: A Survey

A collection of resources and papers on Multimodal Recommender Systems (MRS).

🔥🔥 We will keep this repo updated!

1. Our Survey

In our survey, we summarize the general MRS as a unified process with three stages: Raw Feature Representation, Feature Interaction, and Recommendation Model. To address the challenges in each stage, we classify existing works into four branches of techniques: Modality Encoder, Feature Interaction, Feature Enhancement, and Optimization.

More details can be found in our survey.
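The three-stage process described above can be illustrated with a minimal sketch. This is not code from the survey or from any listed model: the function names, the toy vectors, and the softmax-weighted fusion (standing in for a coarse-grained attention step) are all assumptions chosen for illustration.

```python
# Illustrative sketch only: every name, weight, and vector here is hypothetical.
from math import exp


def encode_modalities(item):
    """Stage 1 (Raw Feature Representation): return one vector per modality.

    Real systems run modality encoders (e.g. a CNN for images, a language
    model for text); this toy version passes through pre-computed vectors.
    """
    return {"visual": item["visual"], "textual": item["textual"]}


def fuse(features, modality_logits):
    """Stage 2 (Feature Interaction): softmax-weighted sum over modalities."""
    names = sorted(features)
    exps = [exp(modality_logits[name]) for name in names]
    total = sum(exps)
    dim = len(features[names[0]])
    return [
        sum(e / total * features[name][i] for name, e in zip(names, exps))
        for i in range(dim)
    ]


def score(user_vec, item_vec):
    """Stage 3 (Recommendation Model): dot-product preference score."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def recommend(user_vec, items, modality_logits, k=1):
    """Rank item ids by score, highest first, and return the top k."""
    scored = {
        item_id: score(user_vec, fuse(encode_modalities(item), modality_logits))
        for item_id, item in items.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:k]


items = {
    "a": {"visual": [1.0, 0.0], "textual": [0.0, 1.0]},
    "b": {"visual": [0.0, 1.0], "textual": [0.0, 1.0]},
}
equal_logits = {"visual": 0.0, "textual": 0.0}  # equal weight per modality
ranking = recommend([1.0, 0.0], items, equal_logits, k=2)
```

In the surveyed models each stage is learned, and Feature Enhancement and Optimization techniques shape how the representations are trained; the sketch only shows how data flows between the stages.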

2. Open-source Repositories

Two open-source repositories implement multimodal recommender system models.

MMRec: A PyTorch benchmark that implements 15 recent MRS models.

Cornac: A PyTorch framework that implements many earlier MRS models.

3. Datasets

| Data | Field | Modality | Scale | Link |
| --- | --- | --- | --- | --- |
| Tiktok | Micro-video | V,T,M,A | 726K+ | link |
| Kwai | Micro-video | V,T,M | 1M+ | link |
| Movielens+IMDB | Movie | V,T | 100K-25M | link |
| Douban | Movie, Book, Music | V,T | 1M+ | link |
| Yelp | POI | V,T,POI | 1M+ | link |
| Amazon | E-commerce | V,T | 100M+ | link |
| Book-Crossings | Book | V,T | 1M+ | link |
| Amazon Books | Book | V,T | 3M | link |
| Amazon Fashion | Fashion | V,T | 1M | link |
| POG | Fashion | V,T | 1M+ | link |
| TMall | Fashion | V,T | 8M+ | link |
| Taobao | Fashion | V,T | 1M+ | link |
| Tianchi News | News | T | 3M+ | link |
| MIND | News | V,T | 15M+ | link |
| Last.FM | Music | V,T,A | 186K+ | link |
| MSD | Music | T,A | 48M+ | link |

Note: 'V', 'T', 'M', and 'A' denote visual, textual, video, and acoustic data, respectively.

4. Paper List

| Name | Paper | Feature Interaction | Feature Enhancement | Optimization | Venue | Code |
| --- | --- | --- | --- | --- | --- | --- |
| MARank | Multi-order attentive ranking model for sequential recommendation | Combined Attention | None | End-to-end | AAAI'19 | link |
| SAERS | Explainable Fashion Recommendation: A Semantic Attribute Region Guided Approach | Fine-grained Attention | None | End-to-end | IJCAI'19 | N/A |
| MKR | Multi-Task Feature Learning for Knowledge Graph Enhanced Recommendation | Knowledge Graph | None | End-to-end | WWW'19 | link |
| UVCAN | User-Video Co-Attention Network for Personalized Micro-video Recommendation | Coarse-grained Attention | None | End-to-end | WWW'19 | N/A |
| VECF | Personalized Fashion Recommendation with Visual Explanations based on Multimodal Attention Network | Fine-grained Attention | None | End-to-end | SIGIR'19 | N/A |
| NRPA | NRPA: Neural Recommendation with Personalized Attention | Combined Attention | None | End-to-end | SIGIR'19 | N/A |
| POG | POG: Personalized Outfit Generation for Fashion Recommendation at Alibaba iFashion | Fine-grained Attention | None | Two-step | KDD'19 | link |
| MMGCN | MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video | User-item Graph | None | End-to-end | MM'19 | link |
|  | Learning Disentangled Representations for Recommendation | Other Fusion | DRL | End-to-end | NIPS'19 | link |
| NOR | Explainable Outfit Recommendation with Joint Outfit Matching and Comment Generation | Fine-grained Attention | None | End-to-end | TKDE'19 | N/A |
| IRIS | Interest-Related Item Similarity Model Based on Multimodal Data for Top-N Recommendation | Other Fusion | None | End-to-end | Access'19 | N/A |
| BGCN | Bundle Recommendation with Graph Convolutional Networks | Item-item Graph | CL | End-to-end | SIGIR'20 | link |
| DICER | Content-Collaborative Disentanglement Representation Learning for Enhanced Recommendation | Other Fusion | DRL | End-to-end | RecSys'20 | N/A |
| GRCN | Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback | Filtration | None | End-to-end | MM'20 | link |
| MKGAT | Multi-modal Knowledge Graphs for Recommender Systems | Knowledge Graph + Filtration | None | End-to-end | CIKM'20 | N/A |
| MGAT | MGAT: Multimodal Graph Attention Network for Recommendation | User-item Graph + Fine-grained Attention | None | End-to-end | IPM'20 | N/A |
| SI-MKR | An Enhanced Multi-Modal Recommendation Based on Alternate Training With Knowledge Graph Representation | Knowledge Graph | None | End-to-end | Access'20 | N/A |
| NOVA | Noninvasive self-attention for side information fusion in sequential recommendation | Combined Attention | None | End-to-end | AAAI'21 | N/A |
| LATTICE | Mining Latent Structures for Multimedia Recommendation | Item-item Graph | None | End-to-end | MM'21 | link |
| PMGT | Pre-training Graph Transformer with Multimodal Side Information for Recommendation | Item-item Graph + Fine-grained Attention | None | Two-step | MM'21 | N/A |
| VICTOR | Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training | Fine-grained Attention | CL | Two-step | MM'21 | N/A |
| CDR | Curriculum Disentangled Recommendation with Noisy Multi-feedback | Other Fusion | DRL | End-to-end | NIPS'21 | link |
| MDR | Multimodal Disentangled Representation for Recommendation | Other Fusion | DRL | End-to-end | ICME'21 | N/A |
| CMBF | CMBF: Cross-Modal-Based Fusion Recommendation Algorithm | Coarse-grained Attention | None | End-to-end | Sensor'21 | N/A |
| DualGNN | DualGNN: Dual Graph Neural Network for Multimedia Recommendation | User-item Graph | None | End-to-end | TMM'21 | link |
| UMPR | Recommendation by Users’ Multimodal Preferences for Smart City Applications | Other Fusion | None | End-to-end | TII'21 | N/A |
|  | Multi-Modal Contrastive Pre-training for Recommendation | Coarse-grained Attention | CL | End-to-end | ICMR'22 | N/A |
| PAMD | Modality Matches Modality: Pretraining Modality-Disentangled Item Representations for Recommendation | Fine-grained Attention | DRL | End-to-end | WWW'22 | link |
| SimGCL | Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation | User-item Graph | CL | End-to-end | SIGIR'22 | link |
| GHMFC | Multimodal Entity Linking with Gated Hierarchical Fusion and Contrastive Training | Knowledge Graph | CL | End-to-end | SIGIR'22 | link |
| MKGformer | Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion | Knowledge Graph + Fine-grained Attention | None | End-to-end | SIGIR'22 | link |
| MMGCL | Multi-modal Graph Contrastive Learning for Micro-video Recommendation | User-item Graph | CL | End-to-end | SIGIR'22 | N/A |
| MM-Rec | MM-Rec: Multimodal News Recommendation | Fine-grained Attention | None | End-to-end | SIGIR'22 | N/A |
| CrossCBR | CrossCBR: Cross-view Contrastive Learning for Bundle Recommendation | Item-item Graph | CL | End-to-end | KDD'22 | link |
| Combo | Combo-Fashion: Fashion Clothes Matching CTR Prediction with Item History | Fine-grained Attention | CL | End-to-end | KDD'22 | link |
| HCGCN | Learning Hybrid Behavior Patterns for Multimedia Recommendation | Item-item Graph | None | End-to-end | MM'22 | N/A |
| CKGC | Cross-modal Knowledge Graph Contrastive Learning for Machine Learning Method Recommendation | Knowledge Graph | CL | End-to-end | MM'22 | N/A |
| MML | Multimodal Meta-Learning for Cold-Start Sequential Recommendation | Coarse-grained Attention | None | Two-step | CIKM'22 | link |
| MARIO | MARIO: Modality-Aware Attention and Modality-Preserving Decoders for Multimedia Recommendation | Fine-grained Attention | None | End-to-end | CIKM'22 | N/A |
| MMKGV | Multi-modal Graph Attention Network for Video Recommendation | Knowledge Graph + Fine-grained Attention | None | End-to-end | CCET'22 | N/A |
| TESM | A two-stage embedding model for recommendation with multimodal auxiliary information | User-item Graph + Fine-grained Attention | None | Two-step | IS'22 | N/A |
| MICRO | Latent Structure Mining With Contrastive Modality Fusion for Multimedia Recommendation | Item-item Graph | CL | End-to-end | TKDE'22 | link |
| DMRL | Disentangled Multimodal Representation Learning for Recommendation | Fine-grained Attention | DRL | End-to-end | TMM'22 | link |
|  | Implicit semantic-based personalized micro-videos recommendation | Fine-grained Attention | None | End-to-end | arXiv'22 | N/A |
| VLSNR | VLSNR: Vision-Linguistics Coordination Time Sequence-aware News Recommendation | Combined Attention | None | End-to-end | arXiv'22 | link |
| BM3 | Bootstrap Latent Representations for Multi-modal Recommendation | User-item Graph + Other Fusion | CL | End-to-end | WWW'23 | link |
| MMMLP | MMMLP: Multi-modal Multilayer Perceptron for Sequential Recommendations | Other Fusion | None | End-to-end | WWW'23 | link |
| MMSSL | Multi-Modal Self-Supervised Learning for Recommendation | User-item Graph + Coarse-grained Attention | CL | End-to-end | WWW'23 | link |
| TMFUN | Attention-guided Multi-step Fusion: A Hierarchical Fusion Network for Multimodal Recommendation | Item-item Graph + Coarse-grained Attention | CL | End-to-end | SIGIR'23 | N/A |
| MCLN | Multimodal Counterfactual Learning Network for Multimedia-based Recommendation | Filtration | None | End-to-end | SIGIR'23 | link |
|  | Enhancing Adversarial Robustness of Multi-modal Recommendation via Modality Balancing | Filtration | None | End-to-end | MM'23 | N/A |
| MGCN | MGCN: Multi-View Graph Convolutional Network for Multimedia Recommendation | User-item Graph + Item-item Graph + Coarse-grained Attention | None | End-to-end | MM'23 | link |
| SGFD | Semantic-Guided Feature Distillation for Multimodal Recommendation | User-item Graph | None | Two-step | MM'23 | link |
| FREEDOM | A Tale of Two Graphs: Freezing and Denoising Graph Structures for Multimodal Recommendation | Filtration | None | End-to-end | MM'23 | link |
| MMSR | Adaptive Multi-Modalities Fusion in Sequential Recommendation Systems | Item-item Graph + Combined Attention | None | End-to-end | CIKM'23 | link |
| M3Srec | Multi-modal Mixture of Experts Representation Learning for Sequential Recommendation | Other Fusion | CL | Two-step | CIKM'23 | link |
| MEGCF | MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation | Filtration | None | End-to-end | TOIS'23 | link |
| SEM | Disentangled Representation Learning for Recommendation | Other Fusion | DRL | End-to-end | TPAMI'23 | N/A |
| PromptMM | PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning | User-item Graph | None | Two-step | WWW'24 | link |
| MG | Mirror Gradient: Towards Robust Multimodal Recommender Systems via Exploring Flat Local Minima | Filtration | None | End-to-end | WWW'24 | link |