Accepted by [Advanced Science](https://advanced.onlinelibrary.wiley.com/doi/10.1002/advs.202507730) in 2025
TODOs:
- Release the training and validation code.
- Release the Agri170K dataset (val).
- Release the Agri170K dataset (train-val).
- Release the checkpoints.
Vision deep neural networks (VDNNs) simulate only the attention-based significance-selection function of human visual perception, rather than the full spectrum of visual cognition, reflecting the divide between cognitive science (CS) and artificial intelligence (AI). To address this problem, we propose a cognitive modeling framework (CMF) comprising three stages: functional abstraction, operator structuring, and program agent. We then define the prior information of basic image features as the long-term memory content of VDNNs, and introduce a memory modeling method for VDNNs based on the fast Fourier transform (FFT) and statistical methods: the unbiased mapping algorithm (UMA). Finally, we develop visual cognitive neural units (VCNUs) and a baseline model (VCogM) based on the CMF and UMA, and conduct performance tests on datasets spanning natural scene recognition and agricultural image classification. The results show that VCogM and VCNU achieve state-of-the-art (SOTA) performance across various recognition tasks. The model's learning process is independent of data distribution and scale, demonstrating the soundness of the cognitive-inspired modeling principles. These findings provide new insights into the deep integration of CS and AI.
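The UMA itself is specified in the paper; purely as an illustration of the flavor of FFT-plus-statistics computation involved, the sketch below summarizes an image's frequency content as a scalar. The function name, the energy-ratio statistic, and the `radius` parameter are our assumptions, not the paper's algorithm:

```python
# Illustrative sketch only -- NOT the paper's UMA. It shows the kind of
# FFT-based statistic one could compute as a frequency-domain "prior".
import numpy as np

def low_frequency_ratio(image: np.ndarray, radius: int = 8) -> float:
    """Fraction of spectral energy inside a centered low-frequency window.

    `image` is a 2-D grayscale array; `radius` (a hypothetical parameter)
    is the half-width of the window around the DC component.
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))  # move DC to the center
    energy = np.abs(spectrum) ** 2
    h, w = energy.shape
    cy, cx = h // 2, w // 2
    low = energy[cy - radius:cy + radius, cx - radius:cx + radius].sum()
    return float(low / energy.sum())

# A smooth gradient concentrates its energy at low frequencies.
img = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
print(low_frequency_ratio(img))  # close to 1.0
```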
We provide detailed instructions for model training and testing, as well as experimental details.
- Clone this repo:

```bash
git clone https://github.com/CAU-COE-VEICLab/Vision-Cognitive-Neural-Networks.git
cd Vision-Cognitive-Neural-Networks
```

- Create a conda virtual environment and activate it:

```bash
conda create -n vdnn python=3.10 -y
conda activate vdnn
```

- Install CUDA>=10.2 with cudnn>=7 following the official installation instructions.
- Install PyTorch>=1.8.0 and torchvision>=0.9.0 with CUDA>=10.2:

```bash
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
```

- Install timm==0.4.12:

```bash
pip install timm==0.4.12
```

- Install other requirements:

```bash
pip install opencv-python==4.4.0.46 termcolor==1.1.0 yacs==0.1.8 pyyaml scipy
```
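As a quick sanity check (our addition, not part of the repo's instructions), you can verify the installed versions and GPU visibility:

```python
# Quick environment check: verify package versions and GPU visibility.
import torch
import torchvision
import timm

print("torch:", torch.__version__)              # expect 1.10.0+cu111
print("torchvision:", torchvision.__version__)  # expect 0.11.0+cu111
print("timm:", timm.__version__)                # expect 0.4.12
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```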
We use the standard ImageNet dataset; you can download it from http://image-net.org/. For a standard folder dataset, move the validation images into labeled sub-folders. The file structure should look like:

```bash
$ tree data
imagenet
├── train
│   ├── class1
│   │   ├── img1.jpeg
│   │   ├── img2.jpeg
│   │   └── ...
│   ├── class2
│   │   ├── img3.jpeg
│   │   └── ...
│   └── ...
└── val
    ├── class1
    │   ├── img4.jpeg
    │   ├── img5.jpeg
    │   └── ...
    ├── class2
    │   ├── img6.jpeg
    │   └── ...
    └── ...
```
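For reference, a directory laid out this way can be read directly with torchvision's standard ImageFolder dataset; the snippet below is a minimal loading example (the transform is a placeholder, not the repo's training recipe):

```python
# Minimal loading example for the layout above using torchvision's
# standard ImageFolder dataset; the transform is a placeholder only.
import torchvision.transforms as T
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder

transform = T.Compose([T.Resize((224, 224)), T.ToTensor()])
train_set = ImageFolder("imagenet/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=8)

print("classes:", len(train_set.classes))  # one class per sub-folder
```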
To evaluate a pre-trained VCogM on ImageNet val, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> main.py --eval \
--cfg <config-file, e.g., configs/sota_benchmark/vcnn/vcm_tiny_1k.yaml> --pretrained <checkpoint> --data-path <imagenet-path>
```

To evaluate a pre-trained VCogM on Agri170K val, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> main_diffusion_tuning.py --eval \
--cfg <config-file, e.g., configs/vcnu_agri17k/vcnn/pretrain/vcm_tiny_agri17k.yaml> --pretrained <checkpoint> --data-path <agri170k-path>
```

To train the VCogM-48M on ImageNet-1K, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> main.py \
--cfg <config-file, e.g., configs/sota_benchmark/vcnn/vcm_small_1k.yaml> --data-path <imagenet-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]
```

To train the VCogM-25M on Agri170K, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> main.py \
--cfg <config-file, e.g., configs/vcnu_agri17k/vcnn/pretrain/vcm_tiny_agri17k.yaml> --data-path <agri170k-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]
```

To train the VCNU-21M on Agri170K, run:
```bash
python -m torch.distributed.launch --nproc_per_node <num-of-gpus-to-use> main.py \
--cfg <config-file, e.g., configs/vcnu_agri17k/vcnn/pretrain/vcnu_small_agri17k.yaml> --data-path <agri170k-path> [--batch-size <batch-size-per-gpu> --output <output-directory> --tag <job-tag>]
```

You can calculate the SSIM value of each image in your dataset by following these steps (an illustrative sketch follows the list):
- Use 'uma_tools/statistic_uma_strategy1.py' and 'uma_tools/statistic_uma_strategy2.py' to calculate the SSIM value of each image in the dataset. This produces an Excel file (named ssim_origin_excel_file) containing the SSIM value of each image in your dataset.
- Use 'uma_tools/count_frequency.py' to calculate the frequency distribution P over your dataset.
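The uma_tools scripts above define the actual procedure; the following is a minimal sketch only, assuming each image is compared against a Gaussian-blurred copy of itself and that P is a simple histogram over SSIM values — both are our assumptions, not necessarily what the scripts do:

```python
# Hedged sketch of the two steps above: (1) one SSIM value per image and
# (2) a frequency distribution P over those values. The reference image
# (a Gaussian-blurred copy) and the 10-bin histogram are assumptions.
import numpy as np
from skimage.filters import gaussian
from skimage.metrics import structural_similarity as ssim

rng = np.random.default_rng(0)
images = [rng.random((128, 128)) for _ in range(100)]  # stand-in dataset

# Step 1: SSIM of each image against a blurred copy of itself.
values = [ssim(im, gaussian(im, sigma=2.0), data_range=1.0) for im in images]

# Step 2: frequency distribution P over the SSIM values.
counts, _ = np.histogram(values, bins=10, range=(0.0, 1.0))
P = counts / counts.sum()
print(P)
```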
Pretrained models and results:
| name | pretrain | resolution | acc@1 | #params | FLOPs | ImageNet-1K model | Agri170K model |
|---|---|---|---|---|---|---|---|
| VCNU-T | ImageNet-1K | 224x224 | 78.2 | 13M | 2.3G | baidu | - |
| VCNU-S | ImageNet-1K | 224x224 | 80.8 | 21M | 4G | baidu | baidu |
| VCNU-B | ImageNet-1K | 224x224 | 81.8 | 37M | 6.8G | baidu | - |
| VCogM-T | ImageNet-1K | 224x224 | 82.5 | 25M | 4.3G | baidu | baidu |
| VCogM-S | ImageNet-1K | 224x224 | 83.9 | 48M | 8.7G | baidu | - |
| VCogM-B | ImageNet-1K | 224x224 | 84.4 | 92M | 17.1G | baidu | - |
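If you want to inspect a downloaded checkpoint before evaluation, a minimal sketch follows; the filename and the 'model' key are assumptions about how the checkpoints are packaged:

```python
# Hedged sketch: inspect a downloaded checkpoint. The filename and the
# 'model' key are assumptions about the file layout.
import torch

ckpt = torch.load("vcm_tiny_1k.pth", map_location="cpu")
state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape))
```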
We constructed a large-scale agricultural image dataset, Agri170K, comprising 96 categories and 173,691 high-quality annotated images.
These images cover various scenes, including fruits, animals, crops, and agricultural machinery.
You can click this link to download Agri170K (train-val).
Our implementation is partially inspired by Swin Transformer.
Thanks for their great work!



