GitHub - AMD-AGI/AMD_IFFN: Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module (ICML 2024)

Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module (ICML 2024)

Yixing Xu, Chao Li, Dong Li, Xiao Sheng, Fan Jiang, Lu Tian, Ashish Sirasao, Emad Barsoum | Paper

Advanced Micro Devices, Inc.

Dependencies

torch == 1.13.0
torchvision == 0.14.0
timm == 0.6.12
einops == 0.6.1
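
Before training or evaluating, it can help to confirm that the installed packages match the pins above. The following is a minimal sketch and not part of the repository: `PINS` copies the versions listed above, while `installed_version`, `find_mismatches`, and the `lookup` parameter are hypothetical names introduced for illustration. It uses only the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Versions copied from the dependency list above.
PINS = {
    "torch": "1.13.0",
    "torchvision": "0.14.0",
    "timm": "0.6.12",
    "einops": "0.6.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins=PINS, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu117" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package is present at the listed version.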

Model performance

The image classification results of our models on the ImageNet dataset are shown in the following table.

Model	Parameters (M)	FLOPs (G)	Top-1 Accuracy (%)
DeiT-Ti	5.72	1.26	72.2
+IFFN (ours)	5.00	1.10	72.6
DeiT-S	22.05	4.60	79.9
+IFFN (ours)	18.84	3.93	80.0
DeiT-B	86.57	17.57	81.8
+IFFN (ours)	73.66	14.92	81.8
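
The table gives absolute figures; relative savings are easier to compare across model sizes. The sketch below recomputes them from the numbers in the table. The names `RESULTS`, `relative_reduction`, and `summarize` are illustrative and do not come from the repository.

```python
# (parameters in M, FLOPs in G, top-1 accuracy in %) copied from the table above.
RESULTS = {
    "DeiT-Ti": {"baseline": (5.72, 1.26, 72.2), "iffn": (5.00, 1.10, 72.6)},
    "DeiT-S": {"baseline": (22.05, 4.60, 79.9), "iffn": (18.84, 3.93, 80.0)},
    "DeiT-B": {"baseline": (86.57, 17.57, 81.8), "iffn": (73.66, 14.92, 81.8)},
}


def relative_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


def summarize(results=RESULTS):
    """Compute relative parameter/FLOP savings and the top-1 change per model."""
    rows = {}
    for model, r in results.items():
        (p0, f0, a0), (p1, f1, a1) = r["baseline"], r["iffn"]
        rows[model] = {
            "params_saved_pct": relative_reduction(p0, p1),
            "flops_saved_pct": relative_reduction(f0, f1),
            "top1_delta": a1 - a0,
        }
    return rows


if __name__ == "__main__":
    for model, row in summarize().items():
        print(f"{model}: params -{row['params_saved_pct']:.1f}%, "
              f"FLOPs -{row['flops_saved_pct']:.1f}%, "
              f"top-1 {row['top1_delta']:+.1f}")
```

For all three model sizes, parameters and FLOPs drop by roughly 13 to 15 percent while top-1 accuracy stays the same or rises slightly.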

Model Evaluation

python main.py --model deit_base_patch16_224 --data-path /path/to/imagenet/ --resume /path/to/base_model/ --eval

Model Training

DeiT-Ti+IFFN

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model deit_tiny_patch16_224 --batch-size 256 --epochs 300 --data-path /path/to/imagenet/ --output_dir ./output/iffn_ti/

DeiT-S+IFFN

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model deit_small_patch16_224 --batch-size 256 --epochs 300 --data-path /path/to/imagenet/ --output_dir ./output/iffn_s/

DeiT-B+IFFN

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --model deit_base_patch16_224 --batch-size 256 --epochs 300 --data-path /path/to/imagenet/ --output_dir ./output/iffn_b/
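
The three training commands differ only in the model name and the output directory. As a convenience, the sketch below assembles them from those two values. It is not part of the repository: `VARIANTS` and `training_command` are hypothetical names, and every flag is copied from the commands above without adding new ones.

```python
import shlex

# (model name, output directory) pairs taken from the commands above.
VARIANTS = {
    "DeiT-Ti+IFFN": ("deit_tiny_patch16_224", "./output/iffn_ti/"),
    "DeiT-S+IFFN": ("deit_small_patch16_224", "./output/iffn_s/"),
    "DeiT-B+IFFN": ("deit_base_patch16_224", "./output/iffn_b/"),
}


def training_command(variant, data_path="/path/to/imagenet/", gpus=8):
    """Build the launch command for one variant as a single shell string."""
    model, output_dir = VARIANTS[variant]
    args = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={gpus}", "--use_env", "main.py",
        "--model", model,
        "--batch-size", "256",
        "--epochs", "300",
        "--data-path", data_path,
        "--output_dir", output_dir,
    ]
    # shlex.join quotes any argument that needs it, e.g. a path with spaces.
    return shlex.join(args)


if __name__ == "__main__":
    for name in VARIANTS:
        print(f"# {name}")
        print(training_command(name))
```

Passing a different `data_path` or `gpus` value adapts the command to a local dataset location or a machine with fewer GPUs.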

Citation

@inproceedings{xuenhancing,
  title={Enhancing Vision Transformer: Amplifying Non-Linearity in Feedforward Network Module},
  author={Xu, Yixing and Li, Chao and Li, Dong and Sheng, Xiao and Jiang, Fan and Tian, Lu and Sirasao, Ashish and Barsoum, Emad},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024}
}

