Muyun99's wiki Muyun99's wiki
首页
学术搬砖
学习笔记
生活杂谈
wiki搬运
资源收藏
关于
  • 分类
  • 标签
  • 归档
GitHub (opens new window)

Muyun99

努力成为一个善良的人
首页
学术搬砖
学习笔记
生活杂谈
wiki搬运
资源收藏
关于
  • 分类
  • 标签
  • 归档
GitHub (opens new window)
  • 论文摘抄

  • 论文阅读-图像分类

  • 论文阅读-语义分割

  • 论文阅读-知识蒸馏

  • 论文阅读-Transformer

  • 论文阅读-图卷积网络

  • 论文阅读-弱监督图像分割

  • 论文阅读-半监督图像分割

  • 论文阅读-带噪学习

  • 论文阅读-小样本学习

  • 论文阅读-自监督学习

    • (MoCov1) Momentum Contrast for Unsupervised Visual Representation Learning
    • (SimCLRv1) A Simple Framework for Contrastive Learning of Visual Representations
    • (SimCLRv2) Big Self-Supervised Models are Strong Semi-Supervised Learners
    • (InstDis) Unsupervised Feature Learning via Non-Parametric Instance Discrimination
    • (CPC) Representation Learning with Contrastive Predictive Coding
    • (CMC) Contrastive Multiview Coding, also contains implementations for MoCo and InstDis
    • (HPT) Self-Supervised Pretraining Improves Self-Supervised Pretraining
    • (SimSiam) SimSiam Exploring Simple Siamese Representation Learning
    • 自监督系列代码
      • 01、MoCo: Momentum Contrast for Unsupervised Visual Representation Learning
      • 02、SimCLR - A Simple Framework for Contrastive Learning of Visual Representations
        • Pre-trained models for SimCLRv1
        • Pre-trained models for SimCLRv2
      • 03、SimSiam: Exploring Simple Siamese Representation Learning
        • Models and Logs
      • 04、Understanding Dimensional Collapse in Contrastive Self-supervised Learning
      • 05、Improving Contrastive Learning by Visualizing Feature Transformation
      • 06、Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning
        • 6.1 Pascal VOC object detection
        • 6.2 COCO object detection
      • 07、CVPR2021 | Online Bag-of-Visual-Words Generation for Unsupervised Representation Learning
      • 08、NeurIPS 2020 | Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
      • 09、ECCV 2020 | Learning to Classify Images without Labels
      • 10、ICML 2020 | Self-Supervised Prototypical Transfer Learning for Few-Shot Classification
      • 11、NeurIPS 2020 | Bootstrap Your Own Latent
      • 12、Efficient Self-Supervised Vision Transformers
      • 13、Emerging Properties in Self-Supervised Vision Transformers.
  • 语义分割中的知识蒸馏

  • 学术文章搜集

  • 论文阅读-其他文章

  • 学术搬砖
  • 论文阅读-自监督学习
Muyun99
2021-10-21

自监督系列代码

# 01、MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

# 1.1 MoCo v1 & MoCo v2

https://github.com/facebookresearch/moco

Models

Our pre-trained ResNet-50 models can be downloaded as following:

epochs mlp aug+ cos top-1 acc. model md5
MoCo v1 (opens new window) 200 60.6 download (opens new window) b251726a
MoCo v2 (opens new window) 200 ✓ ✓ ✓ 67.7 download (opens new window) 59fd9945
MoCo v2 (opens new window) 800 ✓ ✓ ✓ 71.1 download (opens new window) a04e12f8

# 1.2 MoCov3

https://github.com/facebookresearch/moco-v3

ResNet-50, linear classification

pretrain epochs pretrain crops linear acc
100 2x224 68.9
300 2x224 72.8
1000 2x224 74.6

ViT, linear classification

model pretrain epochs pretrain crops linear acc
ViT-Small 300 2x224 73.2
ViT-Base 300 2x224 76.7

ViT, end-to-end fine-tuning

model pretrain epochs pretrain crops e2e acc
ViT-Small 300 2x224 81.4
ViT-Base 300 2x224 83.2

# 02、SimCLR - A Simple Framework for Contrastive Learning of Visual Representations

https://github.com/google-research/simclr

# Pre-trained models for SimCLRv1

The pre-trained models (base network with linear classifier layer) can be found below. Note that for these SimCLRv1 checkpoints, the projection head is not available.

Model checkpoint and hub-module ImageNet Top-1
ResNet50 (1x) (opens new window) 69.1
ResNet50 (2x) (opens new window) 74.2
ResNet50 (4x) (opens new window) 76.6

# Pre-trained models for SimCLRv2

Depth Width SK Param (M) F-T (1%) F-T(10%) F-T(100%) Linear eval Supervised
50 1X False 24 57.9 68.4 76.3 71.7 76.6
50 1X True 35 64.5 72.1 78.7 74.6 78.5
50 2X False 94 66.3 73.9 79.1 75.6 77.8
50 2X True 140 70.6 77.0 81.3 77.7 79.3
101 1X False 43 62.1 71.4 78.2 73.6 78.0
101 1X True 65 68.3 75.1 80.6 76.3 79.6
101 2X False 170 69.1 75.8 80.7 77.0 78.9
101 2X True 257 73.2 78.8 82.4 79.0 80.1
152 1X False 58 64.0 73.0 79.3 74.5 78.3
152 1X True 89 70.0 76.5 81.3 77.2 79.9
152 2X False 233 70.2 76.6 81.1 77.4 79.1
152 2X True 354 74.2 79.4 82.9 79.4 80.4
152 3X True 795 74.9 80.1 83.1 79.8 80.5

# 03、SimSiam: Exploring Simple Siamese Representation Learning

https://github.com/facebookresearch/simsiam

# Models and Logs

Our pre-trained ResNet-50 models and logs:

pre-train epochs batch size pre-train ckpt pre-train log linear cls. ckpt linear cls. log top-1 acc.
100 512 link (opens new window) link (opens new window) link (opens new window) link (opens new window) 68.1
100 256 link (opens new window) link (opens new window) link (opens new window) link (opens new window) 68.3

# 04、Understanding Dimensional Collapse in Contrastive Self-supervised Learning

# 05、Improving Contrastive Learning by Visualizing Feature Transformation

https://github.com/DTennant/CL-Visualizing-Feature-Transformation

Models

For your convenience, we provide the following pre-trained models on ImageNet-1K and ImageNet-100.

pre-train method pre-train dataset backbone #epoch ImageNet-1K VOC det AP50 COCO det AP Link
Supervised ImageNet-1K ResNet-50 - 76.1 81.3 38.2 download (opens new window)
MoCo-v1 ImageNet-1K ResNet-50 200 60.6 81.5 38.5 download (opens new window)
MoCo-v1+FT ImageNet-1K ResNet-50 200 61.9 82.0 39.0 download (opens new window)
MoCo-v2 ImageNet-1K ResNet-50 200 67.5 82.4 39.0 download (opens new window)
MoCo-v2+FT ImageNet-1K ResNet-50 200 69.6 83.3 39.5 download (opens new window)
MoCo-v1+FT ImageNet-100 ResNet-50 200 IN-100 result 77.2 - - download (opens new window)

# 06、Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

# 6.1 Pascal VOC object detection

# Faster-RCNN with C4

Method Epochs Arch AP AP50 AP75 Download
Scratch - ResNet-50 33.8 60.2 33.1 -
Supervised 100 ResNet-50 53.5 81.3 58.8 -
MoCo 200 ResNet-50 55.9 81.5 62.6 -
SimCLR 1000 ResNet-50 56.3 81.9 62.5 -
MoCo v2 800 ResNet-50 57.6 82.7 64.4 -
InfoMin 200 ResNet-50 57.6 82.7 64.6 -
InfoMin 800 ResNet-50 57.5 82.5 64.0 -
PixPro (ours) (opens new window) 100 ResNet-50 58.8 83.0 66.5 config (opens new window) | model (opens new window)
PixPro (ours) (opens new window) 400 ResNet-50 60.2 83.8 67.7 config (opens new window) | model (opens new window)

# 6.2 COCO object detection

# Mask-RCNN with FPN

Method Epochs Arch Schedule bbox AP mask AP Download
Scratch - ResNet-50 1x 32.8 29.9 -
Supervised 100 ResNet-50 1x 39.7 35.9 -
MoCo 200 ResNet-50 1x 39.4 35.6 -
SimCLR 1000 ResNet-50 1x 39.8 35.9 -
MoCo v2 800 ResNet-50 1x 40.4 36.4 -
InfoMin 200 ResNet-50 1x 40.6 36.7 -
InfoMin 800 ResNet-50 1x 40.4 36.6 -
PixPro (ours) (opens new window) 100 ResNet-50 1x 40.8 36.8 config (opens new window) | model (opens new window)
PixPro (ours) 100* ResNet-50 1x 41.3 37.1 -
PixPro (ours) 400* ResNet-50 1x 41.4 37.4 -

* Indicates methods with instance branch.

# Mask-RCNN with C4

Method Epochs Arch Schedule bbox AP mask AP Download
Scratch - ResNet-50 1x 26.4 29.3 -
Supervised 100 ResNet-50 1x 38.2 33.3 -
MoCo 200 ResNet-50 1x 38.5 33.6 -
SimCLR 1000 ResNet-50 1x 38.4 33.6 -
MoCo v2 800 ResNet-50 1x 39.5 34.5 -
InfoMin 200 ResNet-50 1x 39.0 34.1 -
InfoMin 800 ResNet-50 1x 38.8 33.8 -
PixPro (ours) (opens new window) 100 ResNet-50 1x 40.0 34.8 config (opens new window) | model (opens new window)
PixPro (ours) (opens new window) 400 ResNet-50 1x 40.5 35.3 config (opens new window) | model (opens new window)

# 07、CVPR2021 | Online Bag-of-Visual-Words Generation for Unsupervised Representation Learning

https://github.com/valeoai/obow

# 7.1 ResNet50 pre-trained model

Method Epochs Batch-size Dataset ImageNet linear acc. Links to pre-trained weights
OBoW 200 256 ImageNet 73.8 entire model (opens new window) / only feature extractor (opens new window)

# 08、NeurIPS 2020 | Unsupervised Learning of Visual Features by Contrasting Cluster Assignments

https://github.com/facebookresearch/swav

method epochs batch-size multi-crop ImageNet top-1 acc. url args
SwAV 800 4096 2x224 + 6x96 75.3 model (opens new window) script (opens new window)
SwAV 400 4096 2x224 + 6x96 74.6 model (opens new window) script (opens new window)
SwAV 200 4096 2x224 + 6x96 73.9 model (opens new window) script (opens new window)
SwAV 100 4096 2x224 + 6x96 72.1 model (opens new window) script (opens new window)
SwAV 200 256 2x224 + 6x96 72.7 model (opens new window) script (opens new window)
SwAV 400 256 2x224 + 6x96 74.3 model (opens new window) script (opens new window)
SwAV 400 4096 2x224 70.1 model (opens new window) script (opens new window)
DeepCluster-v2 800 4096 2x224 + 6x96 75.2 model (opens new window) script (opens new window)
DeepCluster-v2 400 4096 2x160 + 4x96 74.3 model (opens new window) script (opens new window)
DeepCluster-v2 400 4096 2x224 70.2 model (opens new window) script (opens new window)
SeLa-v2 400 4096 2x160 + 4x96 71.8 model (opens new window) -
SeLa-v2 400 4096 2x224 67.2 model (opens new window) -

# 09、ECCV 2020 | Learning to Classify Images without Labels

https://github.com/wvangansbeke/Unsupervised-Classification

We also train SCAN on ImageNet for 1000 clusters. We use 10 clusterheads and finally take the head with the lowest loss. The accuracy (ACC), normalized mutual information (NMI), adjusted mutual information (AMI) and adjusted rand index (ARI) are computed:

Method ACC NMI AMI ARI Download link
SCAN (ResNet50) 39.9 72.0 51.2 27.5 Download (opens new window)

# 10、ICML 2020 | Self-Supervised Prototypical Transfer Learning for Few-Shot Classification

https://github.com/indy-lab/ProtoTransfer

# 11、NeurIPS 2020 | Bootstrap Your Own Latent

https://github.com/deepmind/deepmind-research/tree/master/byol

Using this implementation should achieve a top-1 accuracy on Imagenet between 74.0% and 74.5% after about 8h of training using 512 Cloud TPU v3.

# 12、Efficient Self-Supervised Vision Transformers

https://github.com/microsoft/esvit

# 12.1 Pretrained models

You can download the full checkpoint (trained with both view-level and region-level tasks, batch size=512 and ImageNet-1K.), which contains backbone and projection head weights for both student and teacher networks.

  • EsViT (Swin) with network configurations of increased model capacities, pre-trained with both view-level and region-level tasks. ResNet-50 trained with both tasks is shown as a reference.
arch params linear k-nn download logs
ResNet-50 23M 75.7% 71.3% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)
EsViT (Swin-T, W=7) 28M 78.0% 75.7% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)
EsViT (Swin-S, W=7) 49M 79.5% 77.7% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)
EsViT (Swin-B, W=7) 87M 80.4% 78.9% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)
EsViT (Swin-T, W=14) 28M 78.7% 77.0% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)
EsViT (Swin-S, W=14) 49M 80.8% 79.1% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)
EsViT (Swin-B, W=14) 87M 81.3% 79.3% full ckpt (opens new window) train (opens new window) linear (opens new window) knn (opens new window)

# 13、Emerging Properties in Self-Supervised Vision Transformers.

https://github.com/facebookresearch/dino

# 13.1 Pretrained models

You can choose to download only the weights of the pretrained backbone used for downstream tasks, or the full checkpoint which contains backbone and projection head weights for both student and teacher networks. We also provide the backbone in onnx format, as well as detailed arguments and training/evaluation logs. Note that DeiT-S and ViT-S names refer exactly to the same architecture.

arch params k-nn linear download
ViT-S/16 21M 74.5% 77.0% backbone only (opens new window) full ckpt (opens new window) onnx (opens new window) args (opens new window) logs (opens new window) eval logs (opens new window)
ViT-S/8 21M 78.3% 79.7% backbone only (opens new window) full ckpt (opens new window) onnx (opens new window) args (opens new window) logs (opens new window) eval logs (opens new window)
ViT-B/16 85M 76.1% 78.2% backbone only (opens new window) full ckpt (opens new window) onnx (opens new window) args (opens new window) logs (opens new window) eval logs (opens new window)
ViT-B/8 85M 77.4% 80.1% backbone only (opens new window) full ckpt (opens new window) onnx (opens new window) args (opens new window) logs (opens new window) eval logs (opens new window)
ResNet-50 23M 67.5% 75.3% backbone only (opens new window) full ckpt (opens new window) onnx (opens new window) args (opens new window) logs (opens new window) eval logs (opens new window)

We also release XCiT models ([arXiv (opens new window)] [code (opens new window)]) trained with DINO:

arch params k-nn linear download
xcit_small_12_p16 26M 76.0% 77.8% backbone only (opens new window) full ckpt (opens new window) args (opens new window) logs (opens new window) eval (opens new window)
xcit_small_12_p8 26M 77.1% 79.2% backbone only (opens new window) full ckpt (opens new window) args (opens new window) logs (opens new window) eval (opens new window)
xcit_medium_24_p16 84M 76.4% 78.8% backbone only (opens new window) full ckpt (opens new window) args (opens new window) logs (opens new window) eval (opens new window)
xcit_medium_24_p8 84M 77.9% 80.3% backbone only (opens new window) full ckpt (opens new window) args (opens new window) logs (opens new window) eval (opens new window)
上次更新: 2021/11/03, 23:35:28
(SimSiam) SimSiam Exploring Simple Siamese Representation Learning
Structured Knowledge Distillation for Semantic Segmentation

← (SimSiam) SimSiam Exploring Simple Siamese Representation Learning Structured Knowledge Distillation for Semantic Segmentation→

最近更新
01
Structured Knowledge Distillation for Semantic Segmentation
06-03
02
README 美化
05-20
03
常见 Tricks 代码片段
05-12
更多文章>
Theme by Vdoing | Copyright © 2021-2023 Muyun99 | MIT License
  • 跟随系统
  • 浅色模式
  • 深色模式
  • 阅读模式
×