Transformer Series Code
Interpretability: https://github.com/hila-chefer/Transformer-Explainability
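The repo above (Chefer et al.) propagates LRP-style relevance scores through the attention layers. A simpler baseline that the paper compares against is attention rollout (Abnar & Zuidema): multiply the per-layer attention matrices, each mixed with the identity to account for residual connections. A minimal numpy sketch (the function name and input format — a list of head-averaged attention matrices — are illustrative, not the repo's API):

```python
import numpy as np

def attention_rollout(attn_maps):
    """Attention rollout: accumulate attention across layers.

    attn_maps: list of (n, n) row-stochastic attention matrices,
               one per layer (averaged over heads).
    Returns an (n, n) matrix of accumulated token-to-token attention.
    """
    n = attn_maps[0].shape[0]
    rollout = np.eye(n)
    for A in attn_maps:
        A = 0.5 * A + 0.5 * np.eye(n)        # account for the skip connection
        A /= A.sum(axis=1, keepdims=True)    # re-normalize rows
        rollout = A @ rollout
    return rollout

layers = [np.full((4, 4), 0.25) for _ in range(3)]  # uniform attention
R = attention_rollout(layers)
```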
# CrossFormer: https://github.com/cheerss/CrossFormer
# ADE20K
Backbone | Segmentation Head | Iterations | Params | FLOPs | IoU | MS IoU |
---|---|---|---|---|---|---|
CrossFormer-S | FPN | 80K | 34.3M | 209.8G | 46.4 | - |
CrossFormer-B | FPN | 80K | 55.6M | 320.1G | 48.0 | - |
CrossFormer-L | FPN | 80K | 95.4M | 482.7G | 49.1 | - |
ResNet-101 | UPerNet | 160K | 86.0M | 1029G | 44.9 | - |
CrossFormer-S | UPerNet | 160K | 62.3M | 979.5G | 47.6 | 48.4 |
CrossFormer-B | UPerNet | 160K | 83.6M | 1089.7G | 49.7 | 50.6 |
CrossFormer-L | UPerNet | 160K | 125.5M | 1257.8G | 50.4 | 51.4 |
# Swin-Transformer: https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation
# ADE20K
Backbone | Method | Crop Size | Lr Schd | mIoU | mIoU (ms+flip) | #params | FLOPs |
---|---|---|---|---|---|---|---|
Swin-T | UPerNet | 512x512 | 160K | 44.51 | 45.81 | 60M | 945G |
Swin-S | UPerNet | 512x512 | 160K | 47.64 | 49.47 | 81M | 1038G |
Swin-B | UPerNet | 512x512 | 160K | 48.13 | 49.72 | 121M | 1188G |
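Swin's efficiency comes from restricting self-attention to local windows that shift between consecutive layers. A minimal numpy sketch of the window-partition step (function name and shapes are illustrative, not the repo's API; H and W are assumed divisible by the window size):

```python
import numpy as np

def window_partition(x, window_size):
    """Split an (H, W, C) feature map into non-overlapping windows.

    Returns an array of shape (num_windows, window_size, window_size, C),
    so attention can be computed independently within each window.
    """
    H, W, C = x.shape
    x = x.reshape(H // window_size, window_size, W // window_size, window_size, C)
    # Bring the two window-grid axes together, then flatten them.
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, C)

feat = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
windows = window_partition(feat, 4)
print(windows.shape)  # (4, 4, 4, 3)
```

The shifted variant simply rolls the feature map by half a window (e.g. `np.roll(feat, (-2, -2), axis=(0, 1))`) before partitioning, so information crosses window borders in alternating layers.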
# Residual Attention: A Simple but Effective Method for Multi-Label Recognition: https://github.com/Kevinz-code/CSRA
Dataset | Backbone | Head nums | mAP(%) | Resolution | Download |
---|---|---|---|---|---|
VOC2007 | ResNet-101 | 1 | 94.7 | 448x448 | download |
VOC2007 | ResNet-cut | 1 | 95.2 | 448x448 | download |
COCO | ResNet-101 | 4 | 83.3 | 448x448 | download |
COCO | ResNet-cut | 6 | 85.6 | 448x448 | download |
Wider | VIT_B16_224 | 1 | 89.0 | 224x224 | download |
Wider | VIT_L16_224 | 1 | 90.2 | 224x224 | download |
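The core idea of CSRA is simple: in its limiting single-head form, the class-specific residual attention reduces to adding λ-weighted max pooling to the usual average pooling of per-location class scores. A numpy sketch of that special case (shapes and the λ value are illustrative; the repo's full module uses a temperature-controlled softmax over spatial locations):

```python
import numpy as np

def csra_logits(features, classifier, lam=0.2):
    """Class-specific residual attention in its simplest (max-pooling) form.

    features:   (N, D) spatial feature vectors (N locations, D dims)
    classifier: (D, C) per-class classifier weights
    Returns per-class logits of shape (C,).
    """
    scores = features @ classifier          # (N, C) per-location class scores
    base = scores.mean(axis=0)              # global average pooling
    residual = scores.max(axis=0)           # class-specific max pooling
    return base + lam * residual

rng = np.random.default_rng(0)
logits = csra_logits(rng.normal(size=(49, 64)), rng.normal(size=(64, 5)))
print(logits.shape)  # (5,)
```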
# CMT: Convolutional Neural Networks Meet Vision Transformers
# Pre-Trained Image Processing Transformer (IPT)
https://github.com/huawei-noah/Pretrained-IPT
# HRFormer: High-Resolution Transformer for Dense Prediction, NeurIPS 2021
https://github.com/HRNet/HRFormer
# ADE20K
Methods | Backbone | Window Size | Train Set | Test Set | Iterations | Batch Size | OHEM | mIoU | mIoU (Multi-Scale) | Log | ckpt | script |
---|---|---|---|---|---|---|---|---|---|---|---|---|
OCRNet | HRFormer-S | 7x7 | Train | Val | 150000 | 8 | Yes | 44.0 | 45.1 | log | ckpt | script |
OCRNet | HRFormer-B | 7x7 | Train | Val | 150000 | 8 | Yes | 46.3 | 47.6 | log | ckpt | script |
OCRNet | HRFormer-B | 13x13 | Train | Val | 150000 | 8 | Yes | 48.7 | 50.0 | log | ckpt | script |
OCRNet | HRFormer-B | 15x15 | Train | Val | 150000 | 8 | Yes | - | - | - | - | - |
# DeiT: Data-efficient Image Transformers
https://github.com/facebookresearch/deit
# Model Zoo
We provide baseline DeiT models pretrained on ImageNet 2012.
name | acc@1 | acc@5 | #params | url |
---|---|---|---|---|
DeiT-tiny | 72.2 | 91.1 | 5M | model |
DeiT-small | 79.9 | 95.0 | 22M | model |
DeiT-base | 81.8 | 95.6 | 86M | model |
DeiT-tiny distilled | 74.5 | 91.9 | 6M | model |
DeiT-small distilled | 81.2 | 95.4 | 22M | model |
DeiT-base distilled | 83.4 | 96.5 | 87M | model |
DeiT-base 384 | 82.9 | 96.2 | 87M | model |
DeiT-base distilled 384 (1000 epochs) | 85.2 | 97.2 | 88M | model |
CaiT-S24 distilled 384 | 85.1 | 97.3 | 47M | model |
CaiT-M48 distilled 448 | 86.5 | 97.7 | 356M | model |
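The "distilled" rows use DeiT's hard distillation: a separate distillation token is supervised with the teacher's argmax prediction, while the class token is supervised with the ground-truth label, and the two cross-entropy terms are averaged. A minimal numpy sketch of that loss for a single example (function names and the toy logits are illustrative, not the repo's code):

```python
import numpy as np

def cross_entropy(logits, target):
    # numerically stable log-softmax followed by negative log-likelihood
    z = logits - logits.max()
    logp = z - np.log(np.exp(z).sum())
    return -logp[target]

def hard_distill_loss(cls_logits, dist_logits, teacher_logits, label):
    """DeiT-style hard distillation: the distillation head is trained on
    the teacher's hard decision, the class head on the true label."""
    teacher_label = int(np.argmax(teacher_logits))
    return 0.5 * cross_entropy(cls_logits, label) + \
           0.5 * cross_entropy(dist_logits, teacher_label)

loss = hard_distill_loss(np.array([2.0, 0.1, -1.0]),   # class-token logits
                         np.array([0.5, 1.5, 0.0]),    # distill-token logits
                         np.array([-1.0, 3.0, 0.2]),   # teacher logits
                         label=0)
```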
# Efficient Vision Transformers via Fine-Grained Manifold Distillation
https://arxiv.org/abs/2107.01378
# Augmented Shortcuts for Vision Transformers
https://arxiv.org/abs/2106.15941
On the rank and diversity of attention maps
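The note above refers to the paper's observation: stacked attention layers act like repeated convex averaging of tokens, which drives features toward rank 1 (all tokens collapse to the same vector), and augmented shortcuts add parameterized paths to preserve diversity. A toy numpy illustration of the collapse, not the paper's code (the matrix sizes and diversity proxy are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 16, 8
X0 = rng.normal(size=(n, d))               # initial token features

# A row-stochastic "attention" matrix: every output token is a convex
# combination of all input tokens.
A = rng.random((n, n))
A /= A.sum(axis=1, keepdims=True)

def diversity(X):
    # Residual after removing the mean token: a proxy for how far the
    # features are from a rank-1 (fully collapsed) state.
    return np.linalg.norm(X - X.mean(axis=0, keepdims=True))

X = X0.copy()
for _ in range(20):                        # attention layers, no shortcuts
    X = A @ X
print(diversity(X0), diversity(X))         # diversity shrinks sharply
```

Adding back an identity/shortcut path at each step (e.g. `X = X + A @ X`) is exactly what prevents this geometric collapse.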
# SOFT: Softmax-free Transformer with Linear Complexity
https://github.com/fudan-zvg/SOFT
# Image Classification
# ImageNet-1K
Model | Resolution | Params | FLOPs | Top-1 % | Config |
---|---|---|---|---|---|
SOFT-Tiny | 224 | 13M | 1.9G | 79.3 | SOFT_Tiny.yaml, SOFT_Tiny_cuda.yaml |
SOFT-Small | 224 | 24M | 3.3G | 82.2 | SOFT_Small.yaml, SOFT_Small_cuda.yaml |
SOFT-Medium | 224 | 45M | 7.2G | 82.9 | SOFT_Medium.yaml, SOFT_Medium_cuda.yaml |
SOFT-Large | 224 | 64M | 11.0G | 83.1 | SOFT_Large.yaml, SOFT_Large_cuda.yaml |
SOFT-Huge | 224 | 87M | 16.3G | 83.3 | SOFT_Huge.yaml, SOFT_Huge_cuda.yaml |
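SOFT's linear complexity comes from dropping the softmax so attention factorizes through a kernel; the paper itself uses a Gaussian kernel with a low-rank (Nystrom) approximation. The O(N) reordering trick is easiest to see with a generic kernel feature map, as in linear-attention work — a numpy sketch under that simplification (the `elu(x)+1` feature map is borrowed from Katharopoulos et al., not SOFT's exact kernel):

```python
import numpy as np

def linear_attention(Q, K, V, eps=1e-6):
    """Kernelized attention in O(N) time/memory in sequence length.

    Associativity lets us compute phi(Q) @ (phi(K)^T @ V) instead of
    (phi(Q) @ phi(K)^T) @ V, avoiding the N x N attention matrix.
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x) + 1, positive
    Qp, Kp = phi(Q), phi(K)
    KV = Kp.T @ V                      # (d, d_v): cost O(N * d * d_v)
    Z = Qp @ Kp.sum(axis=0) + eps      # (N,) per-row normalizer
    return (Qp @ KV) / Z[:, None]

rng = np.random.default_rng(0)
N, d = 128, 16
Q, K, V = (rng.normal(size=(N, d)) for _ in range(3))
out = linear_attention(Q, K, V)
print(out.shape)  # (128, 16)
```

The result matches the explicit quadratic form `normalize(phi(Q) @ phi(K).T) @ V`, but the cost grows linearly rather than quadratically with N.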
Last updated: 2023/03/25, 19:58:09