Create a pull request or open an issue to add your work to this list.
For more recent datasets, search HuggingFace for datasets that include Vietnamese: https://huggingface.co/datasets?language=language:vi&sort=trending
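If you want to generate the same kind of search link from a script, the sketch below rebuilds the URL above from its query parameters. The helper name `dataset_search_url` is hypothetical, and reusing the `language:<code>` pattern for codes other than `vi` is an assumption based only on the link above.

```python
from urllib.parse import urlencode

BASE_URL = "https://huggingface.co/datasets"


def dataset_search_url(language_code, sort="trending"):
    """Build a dataset search URL for one language tag (hypothetical helper)."""
    # safe=":" leaves the colon in "language:vi" unescaped, matching the link above.
    query = urlencode(
        {"language": f"language:{language_code}", "sort": sort},
        safe=":",
    )
    return f"{BASE_URL}?{query}"


url = dataset_search_url("vi")
print(url)
```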
VLSP 2016 Shared Task: Sentiment Analysis
Train: 5100 sentences (1700 positive, 1700 neutral, 1700 negative).
Test: 1050 sentences (350 positive, 350 neutral, 350 negative).
| Model | F1 | Paper | Code |
|---|---|---|---|
| Perceptron/SVM/Maxent | 80.05 | DSKTLAB: Vietnamese Sentiment Analysis for Product Reviews | |
| SVM/MLNN/LSTM | 71.44 | A Simple Supervised Learning Approach to Sentiment Classification at VLSP 2016 | |
| Ensemble: Random forest, SVM, Naive Bayes | 71.22 | A Lightweight Ensemble Method for Sentiment Classification Task | |
| Ensemble: SVM, LR, LSTM, CNN | 69.71 | An Ensemble of Shallow and Deep Learning Algorithms for Vietnamese Sentiment Analysis | |
| SVM | 67.54 | Sentiment Analysis for Vietnamese using Support Vector Machines with application to Facebook comments | |
| SVM/MLNN | 67.23 | A Multi-layer Neural Network-based System for Vietnamese Sentiment Analysis at the VLSP 2016 Evaluation Campaign | |
| Multi-channel LSTM-CNN | 59.61 | Multi-channel LSTM-CNN model for Vietnamese sentiment analysis | official |
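To compare a new system against the F1 column above, a score over the three sentiment classes can be computed as in the minimal sketch below. It assumes macro-averaged F1 (the unweighted mean of per-class F1); this list does not state which averaging the shared task used, so check the task description before reporting numbers. The function name `macro_f1` and the toy labels are illustrative only.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def macro_f1(gold, pred, labels=LABELS):
    """Return the unweighted mean of per-class F1 scores (assumed metric)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label gets a false positive
            fn[g] += 1  # gold label gets a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class that is never gold and never predicted scores 0.0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy example, not real VLSP data.
gold = ["positive", "neutral", "negative", "negative"]
pred = ["positive", "negative", "negative", "negative"]
score = macro_f1(gold, pred)
print(f"{100 * score:.2f}")  # the table above reports F1 as a percentage
```

Because the train and test sets are balanced across the three classes, macro and class-weighted averages coincide on this data, though micro-averaged F1 can still differ.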
VLSP 2018 Shared Task: Aspect Based Sentiment Analysis
Restaurant Dataset: 2961 reviews (train), 1290 reviews (development), 500 reviews (test).
| Model | Aspect (F1) | Aspect Polarity (F1) | Paper | Code |
|---|---|---|---|---|
| CNN | 0.80 | | Deep Learning for Aspect Detection on Vietnamese Reviews | |
| SVM | 0.77 | 0.61 | NLP@UIT at VLSP 2018: A Supervised Method For Aspect Based Sentiment Analysis | |
| SVM | 0.54 | 0.48 | Using Multilayer Perceptron for Aspect-based Sentiment Analysis at VLSP 2018 SA Task | |
Hotel Dataset: 3000 reviews (train), 2000 reviews (development), 600 reviews (test).
| Model | Aspect (F1) | Aspect Polarity (F1) | Paper | Code |
|---|---|---|---|---|
| SVM | 0.70 | 0.61 | NLP@UIT at VLSP 2018: A Supervised Method For Aspect Based Sentiment Analysis | |
| CNN | 0.69 | | Deep Learning for Aspect Detection on Vietnamese Reviews | |
| SVM | 0.56 | 0.53 | Using Multilayer Perceptron for Aspect-based Sentiment Analysis at VLSP 2018 SA Task | |
Vietnamese Student's Feedback Corpus (UIT-VSFC)
UIT-VSFC consists of over 16,000 sentences for sentiment analysis and topic classification.
| Model | Sentiment (F1) | Topic (F1) | Paper | Code |
|---|---|---|---|---|
| Bi-LSTM/Word2Vec | 0.896 | 0.92 | Deep Learning versus Traditional Classifiers on Vietnamese Student’s Feedback Corpus | |
| Maximum Entropy Classifier | 0.88 | 0.84 | UIT-VSFC: Vietnamese Student’s Feedback Corpus for Sentiment Analysis | |
VLSP 2016 Shared Task: Named Entity Recognition
| Model | F1 | Paper | Code |
|---|---|---|---|
| PhoBERT_large | 94.7 | PhoBERT: Pre-trained language models for Vietnamese | official |
| vELECTRA + BiLSTM + Attention | 94.07 | Improving Sequence Tagging for Vietnamese Text Using Transformer-based Neural Models | |
| PhoBERT_base | 93.6 | PhoBERT: Pre-trained language models for Vietnamese | official |
| XLM-R | 92.0 | PhoBERT: Pre-trained language models for Vietnamese | |
| VnCoreNLP-NER + ETNLP | 91.3 | ETNLP: A visual-aided systematic approach to select pre-trained embeddings for a downstream task | |
| BiLSTM-CNN-CRF + ETNLP | 91.1 | ETNLP: A visual-aided systematic approach to select pre-trained embeddings for a downstream task | |
| VNER: Attentive Neural Network | 89.6 | Attentive Neural Network for Named Entity Recognition in Vietnamese | |
| BiLSTM-CNN-CRF | 88.3 | VnCoreNLP: A Vietnamese Natural Language Processing Toolkit | official |
| LSTM + CRF | 66.07 | An investigation of Vietnamese Nested Entity Recognition Models | |
VLSP 2018 Shared Task: Named Entity Recognition
| Model | F1 | Paper | Code |
|---|---|---|---|
| vELECTRA + BiGRU | 90.31 | Improving Sequence Tagging for Vietnamese Text Using Transformer-based Neural Models | |
| VIETNER: CRF (ngrams + word shapes + cluster + w2v) | 76.63 | A Feature-Based Model for Nested Named-Entity Recognition at VLSP-2018 NER Evaluation Campaign | |
| ZA-NER | 74.70 | ZA-NER: Vietnamese Named Entity Recognition at VLSP 2018 Evaluation Campaign | |