# Transformer Series Code

Muyun99, 2021-10-16

Interpretability: https://github.com/hila-chefer/Transformer-Explainability
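
The linked repository implements a relevance-propagation method for explaining Transformer predictions. As a simpler, self-contained illustration of attention-based attribution (and a common baseline such methods are compared against), here is a minimal attention-rollout sketch; collecting the per-layer attention maps from a ViT via forward hooks is assumed:

```python
import torch

def attention_rollout(attentions):
    """Attention rollout: multiply the per-layer attention maps, averaged
    over heads and mixed with the identity to account for residual paths.

    attentions: list of tensors of shape (batch, heads, tokens, tokens),
                e.g. collected from a ViT with forward hooks.
    Returns a (batch, tokens) relevance of every token as seen from CLS.
    """
    result = None
    for attn in attentions:
        attn = attn.mean(dim=1)                       # average over heads
        eye = torch.eye(attn.size(-1), device=attn.device)
        attn = 0.5 * attn + 0.5 * eye                 # residual connection
        attn = attn / attn.sum(dim=-1, keepdim=True)  # re-normalize rows
        result = attn if result is None else attn @ result
    return result[:, 0]                               # row of the CLS token

# usage: rollout = attention_rollout(list_of_attention_maps)
```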

# CrossFormer: https://github.com/cheerss/CrossFormer

# ADE20K

| Backbone | Segmentation Head | Iterations | Params | FLOPs | IOU | MS IOU |
| --- | --- | --- | --- | --- | --- | --- |
| CrossFormer-S | FPN | 80K | 34.3M | 209.8G | 46.4 | - |
| CrossFormer-B | FPN | 80K | 55.6M | 320.1G | 48.0 | - |
| CrossFormer-L | FPN | 80K | 95.4M | 482.7G | 49.1 | - |
| ResNet-101 | UPerNet | 160K | 86.0M | 1029.G | 44.9 | - |
| CrossFormer-S | UPerNet | 160K | 62.3M | 979.5G | 47.6 | 48.4 |
| CrossFormer-B | UPerNet | 160K | 83.6M | 1089.7G | 49.7 | 50.6 |
| CrossFormer-L | UPerNet | 160K | 125.5M | 1257.8G | 50.4 | 51.4 |
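
CrossFormer's central component is the cross-scale embedding layer (CEL), which builds each token from patches sampled at several kernel sizes with a shared stride. A minimal sketch of such a layer is below; the kernel sizes and the even channel split are illustrative assumptions rather than the repo's exact configuration:

```python
import torch
import torch.nn as nn

class CrossScaleEmbedding(nn.Module):
    """Cross-scale embedding: project the image with several convolution
    kernels of different sizes but the same stride, then concatenate the
    results along the channel dimension so every token mixes multiple scales."""
    def __init__(self, in_chans=3, embed_dim=96, kernel_sizes=(4, 8, 16, 32), stride=4):
        super().__init__()
        dims = [embed_dim // len(kernel_sizes)] * len(kernel_sizes)
        dims[0] += embed_dim - sum(dims)  # make the split add up to embed_dim
        self.projs = nn.ModuleList(
            nn.Conv2d(in_chans, d, kernel_size=k, stride=stride, padding=(k - stride) // 2)
            for k, d in zip(kernel_sizes, dims)
        )

    def forward(self, x):  # x: (B, 3, H, W)
        return torch.cat([proj(x) for proj in self.projs], dim=1)  # (B, embed_dim, H/4, W/4)

tokens = CrossScaleEmbedding()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 96, 56, 56])
```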

# Swin-Transformer: https://github.com/SwinTransformer/Swin-Transformer-Semantic-Segmentation

# ADE20K

| Backbone | Method | Crop Size | Lr Schd | mIoU | mIoU (ms+flip) | #params | FLOPs |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Swin-T | UPerNet | 512x512 | 160K | 44.51 | 45.81 | 60M | 945G |
| Swin-S | UPerNet | 512x512 | 160K | 47.64 | 49.47 | 81M | 1038G |
| Swin-B | UPerNet | 512x512 | 160K | 48.13 | 49.72 | 121M | 1188G |
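
Swin computes self-attention inside non-overlapping local windows and alternates with a cyclically shifted window layout so that neighbouring windows exchange information. A minimal sketch of the window partition/reverse helpers and the cyclic shift (the actual repo additionally masks attention across the wrapped-around borders):

```python
import torch

def window_partition(x, window_size):
    """Split a feature map (B, H, W, C) into (num_windows*B, ws, ws, C)."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)

def window_reverse(windows, window_size, H, W):
    """Inverse of window_partition."""
    B = windows.shape[0] // ((H // window_size) * (W // window_size))
    x = windows.view(B, H // window_size, W // window_size, window_size, window_size, -1)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, -1)

x = torch.randn(2, 56, 56, 96)
shifted = torch.roll(x, shifts=(-3, -3), dims=(1, 2))   # cyclic shift (SW-MSA)
windows = window_partition(shifted, window_size=7)      # (2*64, 7, 7, 96)
# ... window attention would run here on each 7x7 window ...
restored = torch.roll(window_reverse(windows, 7, 56, 56), shifts=(3, 3), dims=(1, 2))
assert torch.allclose(restored, x)
```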

# Residual Attention: A Simple but Effective Method for Multi-Label Recognition: https://github.com/Kevinz-code/CSRA

| Dataset | Backbone | Head nums | mAP(%) | Resolution | Download |
| --- | --- | --- | --- | --- | --- |
| VOC2007 | ResNet-101 | 1 | 94.7 | 448x448 | download |
| VOC2007 | ResNet-cut | 1 | 95.2 | 448x448 | download |
| COCO | ResNet-101 | 4 | 83.3 | 448x448 | download |
| COCO | ResNet-cut | 6 | 85.6 | 448x448 | download |
| Wider | VIT_B16_224 | 1 | 89.0 | 224x224 | download |
| Wider | VIT_L16_224 | 1 | 90.2 | 224x224 | download |
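
The CSRA head is essentially global average pooling plus a class-specific spatial attention term per class. A minimal single-head sketch in the simplest (max-pooling, i.e. temperature T → ∞) form is below; the repo's multi-head variant ("Head nums" above) combines several such heads with different temperatures, and `lam` here is an assumed value:

```python
import torch
import torch.nn as nn

class CSRAHead(nn.Module):
    """Single CSRA head: per-class logit = average-pooled class score map
    plus lam * max-pooled class score map (the T -> infinity special case)."""
    def __init__(self, in_dim, num_classes, lam=0.1):
        super().__init__()
        self.classifier = nn.Conv2d(in_dim, num_classes, kernel_size=1, bias=False)
        self.lam = lam

    def forward(self, feat):                       # feat: (B, in_dim, H, W)
        score = self.classifier(feat)              # class score maps: (B, num_classes, H, W)
        base = score.mean(dim=(2, 3))              # global average pooling
        attn = score.flatten(2).max(dim=2).values  # class-specific max pooling
        return base + self.lam * attn              # multi-label logits

logits = CSRAHead(2048, 20)(torch.randn(4, 2048, 14, 14))
print(logits.shape)  # torch.Size([4, 20])
```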

# CMT: Convolutional Neural Networks Meet Vision Transformers

# Pre-Trained Image Processing Transformer (IPT)

https://github.com/huawei-noah/Pretrained-IPT

# HRFormer: High-Resolution Transformer for Dense Prediction, NeurIPS 2021

https://github.com/HRNet/HRFormer

# ADE20K

| Methods | Backbone | Window Size | Train Set | Test Set | Iterations | Batch Size | OHEM | mIoU | mIoU (Multi-Scale) | Log | ckpt | script |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| OCRNet | HRFormer-S | 7x7 | Train | Val | 150000 | 8 | Yes | 44.0 | 45.1 | log | ckpt | script |
| OCRNet | HRFormer-B | 7x7 | Train | Val | 150000 | 8 | Yes | 46.3 | 47.6 | log | ckpt | script |
| OCRNet | HRFormer-B | 13x13 | Train | Val | 150000 | 8 | Yes | 48.7 | 50.0 | log | ckpt | script |
| OCRNet | HRFormer-B | 15x15 | Train | Val | 150000 | 8 | Yes | - | - | - | - | - |
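
HRFormer keeps HRNet's multi-resolution parallel branches but builds its blocks from local-window self-attention followed by an FFN with a 3x3 depthwise convolution, which lets information flow across window borders. A minimal sketch of that depthwise-conv FFN, with an assumed expansion ratio and without the normalization layers the repo uses:

```python
import torch
import torch.nn as nn

class DWConvFFN(nn.Module):
    """HRFormer-style FFN: 1x1 conv -> 3x3 depthwise conv -> 1x1 conv.
    The depthwise convolution exchanges information between neighbouring
    local-attention windows. Expansion ratio 4 is an assumed default."""
    def __init__(self, dim, expansion=4):
        super().__init__()
        hidden = dim * expansion
        self.net = nn.Sequential(
            nn.Conv2d(dim, hidden, 1), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden), nn.GELU(),
            nn.Conv2d(hidden, dim, 1),
        )

    def forward(self, x):          # x: (B, dim, H, W), e.g. one HRNet branch
        return x + self.net(x)     # residual connection around the FFN

out = DWConvFFN(64)(torch.randn(1, 64, 128, 128))
print(out.shape)  # torch.Size([1, 64, 128, 128])
```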

# DeiT: Data-efficient Image Transformers

https://github.com/facebookresearch/deit

# Model Zoo

We provide baseline DeiT models pretrained on ImageNet 2012.

| name | acc@1 | acc@5 | #params | url |
| --- | --- | --- | --- | --- |
| DeiT-tiny | 72.2 | 91.1 | 5M | model |
| DeiT-small | 79.9 | 95.0 | 22M | model |
| DeiT-base | 81.8 | 95.6 | 86M | model |
| DeiT-tiny distilled | 74.5 | 91.9 | 6M | model |
| DeiT-small distilled | 81.2 | 95.4 | 22M | model |
| DeiT-base distilled | 83.4 | 96.5 | 87M | model |
| DeiT-base 384 | 82.9 | 96.2 | 87M | model |
| DeiT-base distilled 384 (1000 epochs) | 85.2 | 97.2 | 88M | model |
| CaiT-S24 distilled 384 | 85.1 | 97.3 | 47M | model |
| CaiT-M48 distilled 448 | 86.5 | 97.7 | 356M | model |
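
The checkpoints above are exposed through torch.hub (see the DeiT README); a minimal loading sketch, assuming `timm` is installed:

```python
import torch

# Load a pretrained DeiT-base from the facebookresearch/deit repo via torch.hub.
# The hub entry point requires timm and downloads the checkpoint on first use.
model = torch.hub.load('facebookresearch/deit:main', 'deit_base_patch16_224', pretrained=True)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))   # ImageNet-1k logits
print(logits.shape)  # torch.Size([1, 1000])
```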

# Efficient Vision Transformers via Fine-Grained Manifold Distillation

https://arxiv.org/abs/2107.01378
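
The idea is to distill the patch-level manifold: the student is trained so that its patch-to-patch similarity structure matches the teacher's, rather than matching logits alone. A minimal, hedged sketch of one such relation-matching term (the paper decomposes the relation maps more finely, e.g. into intra-image and inter-image parts):

```python
import torch
import torch.nn.functional as F

def manifold_distill_loss(student_patches, teacher_patches):
    """Match patch-to-patch similarity structure between student and teacher.

    student_patches: (B, N, Ds) patch tokens of the student
    teacher_patches: (B, N, Dt) patch tokens of the teacher
    Embedding dims may differ; only the N x N relation maps are compared.
    """
    s = F.normalize(student_patches, dim=-1)
    t = F.normalize(teacher_patches, dim=-1)
    rel_s = s @ s.transpose(1, 2)          # (B, N, N) student relation map
    rel_t = t @ t.transpose(1, 2)          # (B, N, N) teacher relation map
    return F.mse_loss(rel_s, rel_t)

loss = manifold_distill_loss(torch.randn(2, 196, 192), torch.randn(2, 196, 768))
print(loss.item())
```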

# Augmented Shortcuts for Vision Transformers

https://arxiv.org/abs/2106.15941

Key point: the rank and diversity of attention maps.
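
The paper observes that with only the identity shortcut, the rank and diversity of attention/feature maps collapse as ViT depth grows, so it adds extra learnable "augmented" shortcut branches in parallel. A minimal sketch, using plain linear projections where the paper uses more efficient block-circulant ones:

```python
import torch
import torch.nn as nn

class AugmentedShortcutMHSA(nn.Module):
    """Self-attention block whose residual path is augmented with extra
    learnable shortcut branches: out = MHSA(x) + x + sum_i T_i(x).
    Plain Linear layers stand in for the paper's block-circulant projections."""
    def __init__(self, dim, num_heads=8, num_shortcuts=2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.shortcuts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU()) for _ in range(num_shortcuts)
        )

    def forward(self, x):                              # x: (B, N, dim)
        h = self.norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        aug = sum(branch(x) for branch in self.shortcuts)
        return attn_out + x + aug                      # identity + augmented shortcuts

out = AugmentedShortcutMHSA(192)(torch.randn(2, 197, 192))
print(out.shape)  # torch.Size([2, 197, 192])
```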

# SOFT: Softmax-free Transformer with Linear Complexity

https://github.com/fudan-zvg/SOFT

# Image Classification

# ImageNet-1K

| Model | Resolution | Params | FLOPs | Top-1 % | Config |
| --- | --- | --- | --- | --- | --- |
| SOFT-Tiny | 224 | 13M | 1.9G | 79.3 | SOFT_Tiny.yaml, SOFT_Tiny_cuda.yaml |
| SOFT-Small | 224 | 24M | 3.3G | 82.2 | SOFT_Small.yaml, SOFT_Small_cuda.yaml |
| SOFT-Medium | 224 | 45M | 7.2G | 82.9 | SOFT_Meidum.yaml, SOFT_Meidum_cuda.yaml |
| SOFT-Large | 224 | 64M | 11.0G | 83.1 | SOFT_Large.yaml, SOFT_Large_cuda.yaml |
| SOFT-Huge | 224 | 87M | 16.3G | 83.3 | SOFT_Huge.yaml, SOFT_Huge_cuda.yaml |
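
SOFT drops the softmax entirely: keys are tied to queries, the attention matrix is a Gaussian kernel, and linear complexity comes from a low-rank (Nyström-style) decomposition through a small set of bottleneck tokens. The sketch below is a simplified version of that idea: landmarks are average-pooled queries and `torch.linalg.pinv` stands in for the paper's Newton-iteration pseudoinverse, both assumptions made for brevity:

```python
import torch

def gaussian_kernel(a, b):
    """exp(-||a_i - b_j||^2 / 2) for all pairs; a: (B, n, d), b: (B, m, d)."""
    return torch.exp(-0.5 * torch.cdist(a, b).pow(2))

def soft_attention(q, v, num_landmarks=49):
    """Softmax-free attention sketch with linear complexity in sequence length.

    The full n x n Gaussian kernel matrix is approximated by a low-rank
    decomposition built from m landmark tokens:
        K_full ~= K_qm @ pinv(K_mm) @ K_qm^T
    q: (B, n, d), v: (B, n, d_v); n must be divisible by num_landmarks.
    """
    B, n, d = q.shape
    landmarks = q.reshape(B, num_landmarks, n // num_landmarks, d).mean(dim=2)  # (B, m, d)
    k_qm = gaussian_kernel(q, landmarks)              # (B, n, m)
    k_mm = gaussian_kernel(landmarks, landmarks)      # (B, m, m)
    # (K_qm @ pinv(K_mm)) @ (K_qm^T @ V): every product is O(n * m), never n x n.
    return (k_qm @ torch.linalg.pinv(k_mm)) @ (k_qm.transpose(1, 2) @ v)

out = soft_attention(torch.randn(2, 196, 64), torch.randn(2, 196, 64))
print(out.shape)  # torch.Size([2, 196, 64])
```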